Brandyn WhiteAndrew Miller Source https://github.com/bwhite/hadoopy/ Issues https://github.com/bwhite/hadoopy/issues Docs http://bwhite.github.com/hadoopy/ IRC: #hadoopy @ freenode.net Requirements python development headers (python-dev), build tools (build-essential) Optional cython (>=.13) (without this it falls back to the pregenerated .c files) Features - oozie support - Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch) - typedbytes support (very fast) - Local execution of unmodified MapReduce job with launch_local - Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb) - Works on OS X - Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr) - critical path is in Cython - works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree) - Simple HDFS access (readtb and ls) inside Python, even inside running jobs - Unit test interface - Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy) - Supports design patterns in the Lin/Dyer book ( http://www.umiacs.umd.edu/~jimmylin/book.html) Limitations - Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use psuedo-distributed instead for now. ( https://github.com/bwhite/hadoopy/issues/40) Used in - A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11) - Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10) - Vitrieve: Visual Search engine - Picarus: Hadoop computer vision toolbox Ubuntu Install (others are similar) sudo apt-get install python-dev build-essential sudo python setup.py install
Python MapReduce library written in Cython.
Overview
Learn the basics of Python. These tutorials are for Python beginners. so even if you have no prior knowledge of Python, you won’t face any difficulty understanding these tutorials.
01_Python_Introduction Introduction 👋 Python is a modern, robust, high level programming language. It is very easy to pick up even if you are complet
Create or join a private chatroom without any third-party middlemen in less than 30 seconds, available through an AES encrypted password protected link.
PY-CHAT Create or join a private chatroom without any third-party middlemen in less than 30 seconds, available through an AES encrypted password prote
Tools for teachers and students using nng (Natural Number Game)
nngtools Usage Place your nngsave.json to the directory in which you want to extract the level files. Place nngmap.json on the same directory. Run nng
Data on COVID-19 (coronavirus) cases, deaths, hospitalizations, tests • All countries • Updated daily by Our World in Data
COVID-19 Dataset by Our World in Data Find our data on COVID-19 and its documentation in public/data. Documentation Data: complete COVID-19 dataset Da
Gobigger Explore For Python
Gobigger-Explore 🔮 GoBigger Challenge 2021 Baseline en/中文 🤖 Introduction This is the baseline of GoBigger Multi-Agent Decision Intelligence Challeng
Python / C++ based particle reaction-diffusion simulator
ReaDDy (Reaction Diffusion Dynamics) is an open source particle based reaction-diffusion simulator that can be configured and run via Python. Currentl
Python plugin for Krita that assists with making python plugins for Krita
Krita-PythonPluginDeveloperTools Python plugin for Krita that assists with making python plugins for Krita Introducing Python Plugin developer Tools!
pyshell is a Linux subprocess module
pyshell A Linux subprocess module, An easier way to interact with the Linux shell pyshell should be cross platform but has only been tested with linux
A compilation of useful scripts to automate common tasks
Scripts-To-Automate-This A compilation of useful scripts for common tasks Name What it does Type Add file extensions Adds ".png" to a list of file nam
An easy FASTA object handler, reader, writer and translator for small to medium size projects without dependencies.
miniFASTA An easy FASTA object handler, reader, writer and translator for small to medium size projects without dependencies. Installation Using pip /
Reactjs web app written entirely in python, using transcrypt compiler.
Reactjs web app written entirely in python, using transcrypt compiler.
BinCat is an innovative login system, with which the account you register will be more secure.
BinCat is an innovative login system, with which the account you register will be more secure. This project is inspired by a conventional token system.
Refer'd Resume Scanner
Refer'd Resume Scanner I wanted to share a free resource we built to assist applicants with resume building. Our resume scanner identifies potential s
create cohort visualizations for a subscription business
pycohort The main revenue generator for subscription businesses is recurring payments. There might be additional one-time offerings but the number of
My solutions for the 2021's Advent of Code
Advent of Code 2021 My solutions for Advent of Code 2021. This year I am practicing Python 🐍 and also trying to develop my own language, Chocolate 🍫
API Rate Limit Decorator
ratelimit APIs are a very common way to interact with web services. As the need to consume data grows, so does the number of API calls necessary to re
🦠 A simple and fast (< 200ms) API for tracking the global coronavirus (COVID-19, SARS-CoV-2) outbreak.
🦠 A simple and fast ( 200ms) API for tracking the global coronavirus (COVID-19, SARS-CoV-2) outbreak. It's written in python using the 🔥 FastAPI framework. Supports multiple sources!
SQL centered, docker process running game
REQUIREMENTS Linux Docker Python/bash set up image "docker build -t game ." create db container "run my_whatever/game_docker/pdb create" # creating po
A collection of useful functions for writers to analyze text/stories.
AuthorTools AuthorTools provides a multitude of functions for easily analyzing (your?) writing. AuthorTools is made especially for creative writers wi
Plux - A dynamic code loading framework for building plugable Python distributions
Plux plux is the dynamic code loading framework used in LocalStack. Overview The