81 Repositories
Latest Python Libraries
Very useful and necessary functions that simplify working with data
Additional-function-for-pandas Very useful and necessary functions that simplify working with data random_fill_nan(module_name, nan) - Replaces all sp
demir.ai Dataset Operations
demir.ai Dataset Operations With this application, you can have the empty values (nan/null) deleted or filled before giving your dataset to machine le
🍰 ConnectMP - An easy and efficient way to share data between Processes in Python.
ConnectMP - Taking Multi-Process Data Sharing to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : 🍤 ConnectMP is the easiest and
Python3 library that can retrieve Chrome-based browser's saved login info.
Passax EDUCATIONAL PURPOSES ONLY Python3 library that can retrieve Chrome-based browser's saved login info. Requirements secretstorage~=3.3.1 pywin32=
A Data Annotation Tool for Semantic Segmentation, Object Detection and Lane Line Detection.(In Development Stage)
Data-Annotation-Tool How to Run this Tool? To run this software, follow the steps: git clone https://github.com/Autonomous-Car-Project/Data-Annotation
A synthetic data generator for text recognition
TextRecognitionDataGenerator A synthetic data generator for text recognition What is it for? Generating text image samples to train an OCR software. N
Download candlestick data fast & easy for analysis
crypto-candlesticks 📈 The goal behind this project is to facilitate downloading cryptocurrency candlestick data fast & simple. Currently only the Bit
Python Package for DataHerb: create, search, and load datasets.
The Python Package for DataHerb A DataHerb Core Service to Create and Load Datasets.
Python library for creating data pipelines with chain functional programming
PyFunctional Features PyFunctional makes creating data pipelines easy by using chained functional operators. Here are a few examples of what it can do
Ferramenta de monitoramento do risco de colapso no sistema de saúde em municípios brasileiros com a Covid-19.
FarolCovid 🚦 Ferramenta de monitoramento do risco de colapso no sistema de saúde em municípios brasileiros com a Covid-19. Monitoring tool & simulati
A desktop application developed in Python with PyQt5 to predict demand and help monitor and schedule brewing processes for Barnaby's Brewhouse.
brewhouse-management A desktop application developed in Python with PyQt5 to predict demand and help monitor and schedule brewing processes for Barnab
FakeDataGen is a Full Valid Fake Data Generator.
FakeDataGen is a Full Valid Fake Data Generator. This tool helps you to create fake accounts (in Spanish format) with fully valid data. Within this in
Finds Jobs on LinkedIn using web-scraping
Find Jobs on LinkedIn 📔 This program finds jobs by scraping on LinkedIn 👨💻 Relies on User Input. Accepts: Country, City, State 📑 Data about jobs
Repositório do Projeto de Jogo da Resília Educação.
Jogo da Segurança das Indústrias Acme Descrição Este jogo faz parte do projeto de entrega do primeiro módulo da Resilia Educação, referente ao curso d
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.
Synthetic data for the people.
zpy: Synthetic data in Blender. Website • Install • Docs • Examples • CLI • Contribute • Licence Abstract Collecting, labeling, and cleaning data for
Python SDK for IEX Cloud
iexfinance Python SDK for IEX Cloud. Architecture mirrors that of the IEX Cloud API (and its documentation). An easy-to-use toolkit to obtain data for
A Python Tool to encrypt all types of files using AES and XOR Algorithm.
DataShield This project intends to protect user’s data, it stores files in encrypted format in device provided the passcode and path of the file. AES
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Knowledge Repo The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using
AKShare is an elegant and simple financial data interface library for Python, built for human beings
AKShare is an elegant and simple financial data interface library for Python, built for human beings
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
simple way to build the declarative and destributed data pipelines with python
unipipeline simple way to build the declarative and distributed data pipelines. Why you should use it Declarative strict config Scaffolding Fully type
Policy and data administration, distribution, and real-time updates on top of Open Policy Agent
⚡ OPAL ⚡ Open Policy Administration Layer OPAL is an administration layer for Open Policy Agent (OPA), detecting changes to both policy and policy dat
Upload comma-delimited files to biglocalnews.org in your GitHub Action
Upload comma-delimited files to biglocalnews.org in your GitHub Action Inputs api-key: Your biglocalnews.org API token. project-id: The identifier of
Pandas and Spark DataFrame comparison for humans
DataComPy DataComPy is a package to compare two Pandas DataFrames. Originally started to be something of a replacement for SAS's PROC COMPARE for Pand
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
CKAN: The Open Source Data Portal Software CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work
Proyecto - Desgaste y rendimiento de empleados de IBM HR Analytics
Acceder al código desde Google Colab para poder ver de manera adecuada todas las visualizaciones y poder interactuar con ellas. Links de acceso: Noteb
Data Inspector is an open-source python library that brings 15++ types of different functions to make EDA, data cleaning easier.
Data Inspector Data Inspector is an open-source python library that brings 15 types of different functions to make EDA, data cleaning easier. Author:
SpyQL - SQL with Python in the middle
SpyQL SQL with Python in the middle Concept SpyQL is a query language that combines: the simplicity and structure of SQL with the power and readabilit
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
Geospatial Data Visualization using PyGMT
Example script to visualize topographic data, earthquake data, and tomographic data on a map
create custom test databases that are populated with fake data
About Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql
create custom test databases that are populated with fake data
About Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql
Clean APIs for data cleaning. Python implementation of R package Janitor
pyjanitor pyjanitor is a Python implementation of the R package janitor, and provides a clean API for cleaning data. Why janitor? Originally a port of
🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.
🌎 JSONClasses JSONClasses is a declarative data flow pipeline and data graph framework. Official Website: https://www.jsonclasses.com Official Docume
Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale. It has been battle-tested at Lyft, Spotify, Freenome, and others and is truly open-source.
Flyte Flyte is a workflow automation platform for complex, mission-critical data, and ML processes at scale Home Page · Quick Start · Documentation ·
Datasets, tools, and benchmarks for representation learning of code.
The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided
Manage large and heterogeneous data spaces on the file system.
signac - simple data management The signac framework helps users manage and scale file-based workflows, facilitating data reuse, sharing, and reproduc
pandas-gbq is a package providing an interface to the Google BigQuery API from pandas
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
Graphsignal Logger
Graphsignal Logger Overview Graphsignal is an observability platform for monitoring and troubleshooting production machine learning applications. It h
Graphsignal is a machine learning model monitoring platform.
Graphsignal is a machine learning model monitoring platform. It helps ML engineers, MLOps teams and data scientists to quickly address issues with data and models as well as proactively analyze model
Script to generate a massive volume of data in sql, csv, json or xml format
DataGenerator Made with Python Open for pull requests 1. Dependencies To install required dependencies run pip install -r requirements.txt 2. Executi
AkShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库
Overview AkShare requires Python(64 bit) 3.7 or greater, aims to make fetch financial data as convenient as possible. Write less, get more! Documentat
Howell County, Missouri, COVID-19 data and (unofficial) estimates
COVID-19 in Howell County, Missouri This repository contains the daily data files used to generate my COVID-19 dashboard for Howell County, Missouri,
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s
Draw datasets from within Jupyter.
drawdata This small python app allows you to draw a dataset in a jupyter notebook. This should be very useful when teaching machine learning algorithm
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Projeto job insights - Projeto avaliativo da Trybe do Bloco 32: Introdução à Python
Termos e acordos Ao iniciar este projeto, você concorda com as diretrizes do Código de Ética e Conduta e do Manual da Pessoa Estudante da Trybe. Boas
The purpose of this project is to share knowledge on how awesome Streamlit is and can be
Awesome Streamlit The fastest way to build Awesome Tools and Apps! Powered by Python! The purpose of this project is to share knowledge on how Awesome
The mitosheet package, trymito.io, and other public Mito code.
Mito Monorepo Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and gener
My Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks using Tensorflow
Easy Data Augmentation Implementation This repository contains my Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Per
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
easyNeuron is a simple way to create powerful machine learning models, analyze data and research cutting-edge AI.
easyNeuron is a simple way to create powerful machine learning models, analyze data and research cutting-edge AI.
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
Prodmodel is a build system for data science pipelines. Users, testers, contributors are welcome! Motivation · Concepts · Installation · Usage · Contr
Extract data from a wide range of Internet sources into a pandas DataFrame.
pandas-datareader Up to date remote data access for pandas, works for multiple versions of pandas. Installation Install using pip pip install pandas-d
Pandas Google BigQuery
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
Pandas Google BigQuery
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.