Brandyn White, Andrew Miller

Source: https://github.com/bwhite/hadoopy/
Issues: https://github.com/bwhite/hadoopy/issues
Docs: http://bwhite.github.com/hadoopy/
IRC: #hadoopy @ freenode.net

Requirements
- Python development headers (python-dev)
- Build tools (build-essential)

Optional
- Cython (>=0.13) (without this it falls back to the pregenerated .c files)

Features
- Oozie support
- Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch)
- typedbytes support (very fast)
- Local execution of unmodified MapReduce jobs with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both are available in the task's stderr)
- Critical path is in Cython
- Works on clusters without any extra installation, Python, or any Python libraries (uses the Pyinstaller that is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy)
- Supports design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local is currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use pseudo-distributed mode instead for now. (https://github.com/bwhite/hadoopy/issues/40)

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
    sudo apt-get install python-dev build-essential
    sudo python setup.py install
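To make the MapReduce style behind these features concrete, here is a minimal word-count sketch in plain Python. It assumes the generator convention in which a mapper takes a key and a value and a reducer takes a key and an iterator of values, each yielding key/value pairs. The `run_local` driver is a hypothetical in-memory stand-in written for this illustration; it is not hadoopy's `launch_local` and does not touch Hadoop or HDFS. For the real entry points, see the project docs.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts emitted for a single word."""
    yield key, sum(values)


def run_local(kvs, mapper, reducer):
    """Hypothetical in-memory driver (an assumption, not part of hadoopy)."""
    grouped = defaultdict(list)
    # Map phase: collect every emitted value under its output key.
    for key, value in kvs:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    # Hadoop sorts keys between map and reduce; mimic that ordering here.
    for out_key in sorted(grouped):
        for result in reducer(out_key, iter(grouped[out_key])):
            yield result


if __name__ == "__main__":
    lines = [(0, "a b a"), (1, "b c")]
    print(list(run_local(lines, mapper, reducer)))
```

Keeping the mapper and reducer as ordinary generator functions means they can be checked with a driver like this, or with a unit test, before being submitted to a cluster.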
Python MapReduce library written in Cython.
Overview
CalHacks 8 Repo: Megha Jain, Gaurav Bhatnagar, Howard Meng, Vibha Tantry
CalHacks8 CalHacks 8 Repo: Megha Jain, Gaurav Bhatnagar, Howard Meng, Vibha Tantry Setup FE Install React Native via Expo, run App.js. Backend Create
Dev-meme - A repository that contains memes just for people like us
A repository that contains memes just for people like us. Coders are constantly
Pampy: The Pattern Matching for Python you always dreamed of.
Pampy: Pattern Matching for Python Pampy is pretty small (150 lines), reasonably fast, and often makes your code more readable and hence easier to rea
Estimating the potential photovoltaic production of buildings (in Berlin)
The following people contributed equally to this repository (in alphabetical order): Daniel Bumke JJX Corstiaen Versteegh This repository is forked on
A simple program to find active jackbox.tv game codes
PeepingJack A simple program to find active jackbox.tv game codes How does this work? It uses a threadpool to loop through all possible codes in a ran
Trackthis - This library can be used to track USPS and UPS shipments.
Trackthis - This library can be used to track USPS and UPS shipments. It has the option of returning the raw API response, or optionally, it can be used to standardize the USPS and UPS responses so t
Pypot ⚙️ A Python library for Dynamixel motor control
Pypot ⚙️ A Python library for Dynamixel motor control Pypot is a cross-platform Python library making it easy and fast to control custom robots based
My programming language named JoLang. (Mainly created for fun)
JoLang status: not ready So this is my programming language which I decided to name 'JoLang' (inspired by Jonathan and GoLang). Features I implemented
Python decorator for `TODO`s
Python decorator for `TODO`s. Don't let your TODOs rot in your python projects anymore !
A male and female dog names python package
A male and female dog names python package
Mute your mic while you're typing. An app for Ubuntu.
Hushboard Mute your microphone while typing, for Ubuntu. Install from kryogenix.org/code/hushboard/. Installation We recommend you install Hushboard t
Back-end API for the reternal framework
RE:TERNAL RE:TERNAL is a centralised purple team simulation platform. Reternal uses agents installed on a simulation network to execute various known
This is a modified variation of abhiTronix's vidgear. In this variation, it is possible to write the output file anywhere regardless of the permissions.
Info In order to download this package: Windows 10: Press Windows+S, Type PowerShell (cmd in older versions) and hit enter, Type pip install vidgear_n
Show Public IP Information In Linux Taskbar
IP Information In Linux Taskbar 📍 How to Use the IP Script? 🤔 Download the ip.py script and save it somewhere on your system. Add command applet in your taskbar a
This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python
PyJava This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python
Project for viewing the cheapest flight deals from Netherlands to other countries.
Flight_Deals_AMS Project for viewing the cheapest flight deals from Netherlands to other countries.
A small C compiler written in Python for learning purposes
A small C compiler written in Python. Generates x64 Intel-format assembly, which is then assembled and linked by nasm and ld.
A Lynx that manages a group that puts the federation first.
Lynx Super Federation Management Group Lynx was created to manage your groups on telegram and focuses on the Lynx Federation. I made this to root out
Unofficial Valorant documentation and tools for third party developers
Valorant Third Party Toolkit This repository contains unofficial Valorant documentation and tools for third party developers. Our goal is to centraliz
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Streamify A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! Description Objective The project will stre