Brandyn White, Andrew Miller

Source: https://github.com/bwhite/hadoopy/
Issues: https://github.com/bwhite/hadoopy/issues
Docs: http://bwhite.github.com/hadoopy/
IRC: #hadoopy @ freenode.net

Requirements
- Python development headers (python-dev)
- Build tools (build-essential)

Optional
- Cython (>= 0.13); without it, the build falls back to the pregenerated .c files

Features
- Oozie support
- Automated job parallelization: 'auto-oozie' is available in the hadoopy_flow project (maintained out of branch)
- TypedBytes support (very fast)
- Local execution of unmodified MapReduce jobs with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both are available in the task's stderr)
- Critical path is in Cython
- Works on clusters without any extra installation, Python, or any Python libraries (uses the PyInstaller that is included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements! No need to be scared of them in Hadoopy)
- Supports the design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local is currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use pseudo-distributed mode instead for now. (https://github.com/bwhite/hadoopy/issues/40)

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
    sudo apt-get install python-dev build-essential
    sudo python setup.py install
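
To make the MapReduce model behind the features above concrete, here is a minimal word-count sketch in plain Python. It does not import or call Hadoopy: the `mapper` and `reducer` generators only illustrate the key/value style such a job is typically written in, and `run_local` is a hypothetical in-memory driver written for this example, not Hadoopy's launch_local.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in one line of text."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts collected for a single word."""
    yield key, sum(values)


def run_local(kvs, mapper, reducer):
    """Hypothetical in-memory driver (an assumption of this sketch).

    Runs the map phase, groups values by key (the shuffle), then runs the
    reduce phase over keys in sorted order.
    """
    groups = defaultdict(list)
    for key, value in kvs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    for key in sorted(groups):
        for out in reducer(key, iter(groups[key])):
            yield out


if __name__ == "__main__":
    # Input pairs are (line number, line text), chosen only for illustration.
    lines = [(0, "a b a"), (1, "b c")]
    for word, count in run_local(lines, mapper, reducer):
        print(word, count)
```

Keeping the mapper and reducer as plain generators means the same two functions can be exercised in a unit test without a cluster, which is the idea behind the local-execution and unit-test features listed above.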
Python MapReduce library written in Cython.
Overview
Simple Assembler with python
Assembler with python converts assembly source code to machine code Requirements Python 3 🐍 Usage python main.py [source] [output] [source] : Path t
A script that warns you, by opening a new browser tab, when there is new content on your favourite websites.
web check A script that warns you, by opening a new browser tab, when there is new content on your favourite websites. What it does The script wi
Python samples for Google Cloud Platform products.
Google Cloud Platform Python Samples Python samples for Google Cloud Platform products. Setup Install pip and virtualenv if you do not already have th
A basic animation modding workflow for FFXIV
AnimAssist Provides a quick and easy way to mod animations in FFXIV. You will need: Before anything, the VC++2012 32-bit Redist from here. Havok will
A free and open-source chess improvement app that combines the power of Lichess and Anki.
A free and open-source chess improvement app that combines the power of Lichess and Anki. Chessli Project Activity & Issue Tracking PyPI Build & Healt
a package that provides a marketstrategy for whitelisting on golem
filterms a package that provides a marketstrategy for whitelisting on golem watching requestor logs distribute 10 tasks asynchronously is fun. but you
A casual IDOR exploiter that provides .csv files of url and status code.
IDOR-for-the-casual Do you like to IDOR? Are you a Windows hax0r? Well have I got a tool for you... A casual IDOR exploiter that provides .csv files o
Course "Artificial Intelligence and Machine Learning"
Artificial Intelligence and Machine Learning About the course This repository contains accompanying study material for the course "Artificial Inte
Python Osmium Examples
Python Osmium Examples This is a set (currently of size 1) of examples showing practical usage of PyOsmium, a thin wrapper around the osmium library.
This repository contains the exercises for the robotics class at Supaero, 2022.
Supaero robotics, 2022 This repository contains the exercises for the robotics class at Supaero, 2022. The exercises are organized by notebook. Each n
Exercise to teach a newcomer to the CLSP grid to set up their environment and run jobs
Exercise to teach a newcomer to the CLSP grid to set up their environment and run jobs
Data types specify the different sizes and values that can be stored in the variable. For example, Python stores numbers, strings, and a list of values using different data types. Learn different types of Python data types along with their respective in-built functions and methods.
02_Python_Datatypes Introduction 👋 Data types specify the different sizes and values that can be stored in the variable. For example, Python stores n
An execution framework for systematic strategies
WAGMI is an execution framework for systematic strategies. It is very much a work in progress, please don't expect it to work! Architecture The Django
A Python package for searching journal publications and researchers
scholarpy A python package for searching journal publications and researchers Free software: MIT license Documentation: https://giswqs.github.io/schol
Sintegra Magnetic File Generator in Python
pysintegra is a simple lib that aims to make it easier to generate the SINTEGRA file in accordance with Convênio ICMS 57/95. With the emergence of SPED, much
Swim between bookmarks in the Windows terminal
Marlin Swim between bookmarks in the terminal! Marlin is an easy to use bookmark manager for the terminal. Choose a folder, bookmark it and swim there
script to analyze EQ decay using python
pyq_decay script to analyze EQ decay using python PyQ Decay ver 1.0 A pythonic script to analyze EQ aftershock decay using method of Omori (1894), Mog
A community-based economy bot in Python. Works only with Python 3.7.8, as web3 requires cytoolz.
A community-based economy bot in Python. Works only with Python 3.7.8, as web3 requires cytoolz, which has some issues building with Python 3.10
This simple script generates a backup of a given Python and R environment
Python Environment Backup It’s always good to maintain your Python and R Anaconda environment packages properly listed and well-kept in case you have
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
RDFLib RDFLib is a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including: parsers and serializers