Python MapReduce library written in Cython.

Related tags

Miscellaneoushadoopy
Overview
Brandyn White 
  
Andrew Miller 
  
   

Source  
   https://github.com/bwhite/hadoopy/
Issues  
   https://github.com/bwhite/hadoopy/issues
Docs    
   http://bwhite.github.com/hadoopy/

IRC: #hadoopy @ freenode.net

Requirements
python development headers (python-dev), build tools (build-essential)

Optional
cython (>=.13) (without this it falls back to the pregenerated .c files)

Features
- oozie support
- Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch)
- typedbytes support (very fast)
- Local execution of unmodified MapReduce job with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr)
- critical path is in Cython
- works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy)
- Supports design patterns in the Lin/Dyer book (
   http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode.  Use psuedo-distributed instead for now.  (
   https://github.com/bwhite/hadoopy/issues/40)

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
sudo apt-get install python-dev build-essential
sudo python setup.py install
  
 
Owner
Brandyn White
Brandyn White
Utility to play with ADCS, allows to request tickets and collect information about related objects

certi Utility to play with ADCS, allows to request tickets and collect information about related objects. Basically, it's the impacket copy of Certify

Eloy 185 Dec 29, 2022
Python Monopoly Simulator

Monopoly simulator Original creator: Games Computer Play YouTube: https://www.youtube.com/channel/UCTrp88f-QJ1SqKX8o5IDhWQ Config file (optional) conf

Games Computers Play 37 Jan 03, 2023
Python Freecell Solver

freecell Python Freecell Solver Very early version right now. You can pick a board by changing the file path in freecell.py If you want to play a game

Ben Kaufman 1 Nov 26, 2021
Python script that automates the tasks involved in starting a new coding project

Auto Project Builder Automates the repetitive tasks while starting a new project Installation Use the REQUIREMENTS.txt file to install the dependencie

Prathap S S 1 Feb 03, 2022
Set of scripts that schedules employees for shifts throughout the week based on availability, shift times, and shift necessities

Automatic-Scheduler Set of scripts that schedules employees for shifts throughout the week based on availability, shift times, and shift necessities *

Matthew 1 May 01, 2022
Create standalone, installable R Shiny apps using Electron

WARNING This is still very much a work in progress and nothing can be assumed stable in any way Temp notes: Two types of created installer, based on w

Chase Clark 5 Dec 24, 2021
The Official interpreter for the Pix programming language.

The official interpreter for the Pix programming language. Pix Pix is a programming language dedicated to readable syntax and usability Q) Is Pix the

Pix 6 Sep 25, 2022
A conda-smithy repository for boost-histogram.

The official Boost.Histogram Python bindings. Provides fast, efficient histogramming with a variety of different storages combined with dozens of composable axes. Part of the Scikit-HEP family.

conda-forge 0 Dec 17, 2021
A tool to help you to do the monthly reading requirements

Monthly Reading Requirement Auto ⚙️ A tool to help you do the monthly reading requirements Important ⚠️ Some words can't be translated Links: Synonym

Julian Jauk 2 Oct 31, 2021
🍕 A small app with capabilities ordering food and listing them with pub/sub pattern

food-ordering A small app with capabilities ordering food and listing them. Prerequisites Docker Run Tests docker-compose run --rm web ./manage.py tes

Muhammet Mücahit 1 Jan 14, 2022
Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

Junhua Zou 7 Oct 20, 2022
Python implementation of the ASFLIP advection method

This is a python implementation of the ASFLIP advection method . We would like to hear from you if you appreciate this work.

Raymond Yun Fei 133 Nov 13, 2022
A python script for osu!lazer rulesets auto update.

osu-lazer-rulesets-autoupdater A python script for osu!lazer rulesets auto update. How to use: 如何使用: You can refer to the python script. The begining

3 Jul 26, 2022
A class to draw curves expressed as L-System production rules

A class to draw curves expressed as L-System production rules

Juna Salviati 6 Sep 09, 2022
Get you an ultimate lexer generator using Fable; port OCaml sedlex to FSharp, Python and more!

NOTE: currently we support interpreted mode and Python source code generation. It's EASY to compile compiled_unit into source code for C#, F# and othe

Taine Zhao 15 Aug 06, 2022
The first Python 1v1.lol triggerbot working with colors !

1v1.lol TriggerBot Afin d'utiliser mon triggerbot, vous devez activer le plein écran sur 1v1.lol sur votre naviguateur (quelque-soit ce dernier). Vous

Venax 5 Jul 25, 2022
Architecture example simulator

SCADA architecture Example of a SCADA-like console application, used to serve as a minimal example of a standard architecture of an IIoT system. Insta

1 Nov 06, 2021
Nextstrain build targeted to Omicron

About This repository analyzes viral genomes using Nextstrain to understand how SARS-CoV-2, the virus that is responsible for the COVID-19 pandemic, e

Bedford Lab 9 May 25, 2022
My Dotfiles of Arco Linux

Arco-DotFiles My Dotfiles of Arco Linux Apps Used Htop LightDM lightdm-webkit2-greeter Alacritty Qtile Cava Spotify nitrogen neofetch Spicetify Thunar

$BlueDev5 6 Dec 11, 2022