Brandyn WhiteAndrew Miller Source https://github.com/bwhite/hadoopy/ Issues https://github.com/bwhite/hadoopy/issues Docs http://bwhite.github.com/hadoopy/ IRC: #hadoopy @ freenode.net Requirements python development headers (python-dev), build tools (build-essential) Optional cython (>=.13) (without this it falls back to the pregenerated .c files) Features - oozie support - Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch) - typedbytes support (very fast) - Local execution of unmodified MapReduce job with launch_local - Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb) - Works on OS X - Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr) - critical path is in Cython - works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree) - Simple HDFS access (readtb and ls) inside Python, even inside running jobs - Unit test interface - Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy) - Supports design patterns in the Lin/Dyer book ( http://www.umiacs.umd.edu/~jimmylin/book.html) Limitations - Hadoop Local currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use psuedo-distributed instead for now. ( https://github.com/bwhite/hadoopy/issues/40) Used in - A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11) - Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10) - Vitrieve: Visual Search engine - Picarus: Hadoop computer vision toolbox Ubuntu Install (others are similar) sudo apt-get install python-dev build-essential sudo python setup.py install
Python MapReduce library written in Cython.
Overview
Simple dotfile pre-processor with a per-file configuration
ix (eeks) Simple dotfile pre-processor with a per-file configuration Summary (TL;DR) ix.py is all you need config is an ini file. files to be processe
Opensource Desktop application for kenobi.
Kenobi-Server WIP Opensource desktop application for Kenobi. Download the apple watch app to get started. What is this repo? It's repo for the opensou
Our product DrLeaf which not only makes the work easier but also reduces the effort and expenditure of the farmer to identify the disease and its treatment methods.
Our product DrLeaf which not only makes the work easier but also reduces the effort and expenditure of the farmer to identify the disease and its treatment methods. We have to upload the image of an
Blender Add-on to Add Metal Materials to Your Scene
Blender QMM (Quick Metal Materials) Blender Addon to Add Metal Materials to Your Scene Installation Download the latest ZIP from Releases. Usage This
A example project's description is a high-level overview of why you’re doing a project.
A example project's description is a high-level overview of why you’re doing a project.
A repository containing useful resources needed to complete the SUSE Scholarship Challenge #UdacitySUSEScholars #poweredbySUSE
SUSE-udacity-cloud-native-scholarship A repository containing useful resources needed to complete the SUSE Scholarship Challenge #UdacitySUSEScholars
A test repository to build a python package and publish the package to Artifact Registry using GCB
A test repository to build a python package and publish the package to Artifact Registry using GCB. Then have the package be a dependency in a GCF function.
A place where one-off ideas/partial projects can live comfortably
A place to post ideas, partial projects, or anything else that doesn't necessarily warrant its own repo, from my mind to the web.
Official repository for the BPF Performance Tools book
BPF Performance Tools This is the official repository of BPF (eBPF) tools from the book BPF Performance Tools: Linux and Application Observability. Th
SimBiber - A tool for simplifying bibtex with official info
SimBiber: A tool for simplifying bibtex with official info. We often need to sim
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Streamify A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! Description Objective The project will stre
BlackIP-Rep is a tool designed to gather the reputation and information of Bulk IP's.
BlackIP-Rep is a tool designed to gather the reputation and information of Bulk IP's. Focused on increasing the workflow of Security Operations(SOC) team during investigation.
Online learning platform
🛠 Status: In Development Teached is currently in development. So we encourage you to use it and give us your feedback, but there are things that have
EFB Docker image with efb-telegram-master and efb-wechat-slave
efb-wechat-docker EFB Docker image with efb-telegram-master and efb-wechat-slave Features Container run by non-root user. Support add environment vari
Semester long, web application project for CSCI 4370/6370 (Database Management)
Database_Project Prototype ideas for website: Computer Science library (Sells books, products, etc.) Code editor Graph visualizer / creator (can save
Whole-day timezone comparison
Timezone Converter Compare a full day of your local timezone with foreign ones $ timezone-converter tijuana --zone $ timezone-converter tijuana new_yo
A simple string parser based on CLR to check whether a string is acceptable or not for a given grammar.
A simple string parser based on CLR to check whether a string is acceptable or not for a given grammar.
Built as part of an assignment for S5 OOSE Subject CSE
Installation Steps: Download and install Python from here based on your operating system. I have used Python v3.8.10 for this. Clone the repository gi
The tool helps to find hidden parameters that can be vulnerable or can reveal interesting functionality that other hunters miss.
The tool helps to find hidden parameters that can be vulnerable or can reveal interesting functionality that other hunters miss. Greater accuracy is achieved thanks to the line-by-line comparison of
An electron application to check battery of bluetooth devices connected to linux devices.
bluetooth-battery-electron An electron application to check battery of bluetooth devices connected to linux devices. This project provides an electron