DALLE-tools provided useful dataset utilities to improve you workflow with WebDatasets.

Overview

DALLE tools

DALLE-tools is a github repository with useful tools to categorize, annotate or check the sanity of your datasets.

Installation

Just clone this repository to your folder and use one of the following commands in the section underneath.

WebDataset Annotator

python annotator.py

Press to switch to the next page, to change the annotation category or click on the image to add it to the current cateogry and save it in annotations.json. Please upload your annotations.json by creating a push request into community_annotations folder into the folder of the dataset you used (e.g. YFCC100m, or LAION400m etc.), so everyone can use the data for better dataset annotations! If you want to continue to annotate a dataset where someone else already started, just copy the annotations.json from the community_annotations folder and the used dataset into the root directory and run the annotator!

Screenshot

WebDataset aligner

python aligner.py

This tool helps to align the shuffled keys, so the WebDataset module can read your datasets correctly. You just need to specify the keys you want to look for and keep in your new dataset.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT

A GUI for Pandas DataFrames

About Demo Installation Usage Features More Info About PandasGUI is a GUI for viewing, plotting and analyzing Pandas DataFrames. Demo Installation Ins

Adam Rose 2.8k Dec 24, 2022
An easy to use burndown chart generator for GitHub Project Boards.

Burndown Chart for GitHub Projects An easy to use burndown chart generator for GitHub Project Boards. Table of Contents Features Installation Assumpti

Joseph Hale 15 Dec 28, 2022
Typical: Fast, simple, & correct data-validation using Python 3 typing.

typical: Python's Typing Toolkit Introduction Typical is a library devoted to runtime analysis, inference, validation, and enforcement of Python types

Sean 171 Jan 02, 2023
Wikipedia WordCloud App generate Wikipedia word cloud art created using python's streamlit, matplotlib, wikipedia and wordcloud packages

Wikipedia WordCloud App Wikipedia WordCloud App generate Wikipedia word cloud art created using python's streamlit, matplotlib, wikipedia and wordclou

Siva Prakash 5 Jan 02, 2022
Create HTML profiling reports from pandas DataFrame objects

Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great

10k Jan 01, 2023
Python Data. Leaflet.js Maps.

folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

6k Jan 02, 2023
A script written in Python that generate output custom color (HEX or RGB input to x1b hexadecimal)

ColorShell ─ 1.5 Planned for v2: setup.sh for setup alias This script converts HEX and RGB code to x1b x1b is code for colorize outputs, works on ou

Riley 4 Oct 31, 2021
Debugging, monitoring and visualization for Python Machine Learning and Data Science

Welcome to TensorWatch TensorWatch is a debugging and visualization tool designed for data science, deep learning and reinforcement learning from Micr

Microsoft 3.3k Dec 27, 2022
Fast 1D and 2D histogram functions in Python

About Sometimes you just want to compute simple 1D or 2D histograms with regular bins. Fast. No nonsense. Numpy's histogram functions are versatile, a

Thomas Robitaille 237 Dec 18, 2022
Python scripts for plotting audiograms and related data from Interacoustics Equinox audiometer and Otoaccess software.

audiometry Python scripts for plotting audiograms and related data from Interacoustics Equinox 2.0 audiometer and Otoaccess software. Maybe similar sc

Hamilton Lab at UT Austin 2 Jun 15, 2022
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data

FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick

Synthesized 69 Dec 15, 2022
Easily configurable, chart dashboards from any arbitrary API endpoint. JSON config only

Flask JSONDash Easily configurable, chart dashboards from any arbitrary API endpoint. JSON config only. Ready to go. This project is a flask blueprint

Chris Tabor 3.3k Dec 31, 2022
Make sankey, alluvial and sankey bump plots in ggplot

The goal of ggsankey is to make beautiful sankey, alluvial and sankey bump plots in ggplot2

David Sjoberg 156 Jan 03, 2023
GitHub English Top Charts

Help you discover excellent English projects and get rid of the interference of other spoken language.

kon9chunkit 529 Jan 02, 2023
Missing data visualization module for Python.

missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha

Aleksey Bilogur 3.4k Dec 29, 2022
kyle's vision of how datadog's python client should look

kyle's datadog python vision/proposal not for production use See examples/comprehensive.py for a mostly working example of the proposed API. 📈 🐶 ❤️

Kyle Verhoog 2 Nov 21, 2021
Friday Night Funkin - converts a chart from 4/4 time to 6/8 time, or from regular to swing tempo.

Chart to swing converter As seen in https://twitter.com/i_winxd/status/1462220493558366214 A program written in python that converts a chart from 4/4

5 Dec 23, 2022
This is a web application to visualize various famous technical indicators and stocks tickers from user

Visualizing Technical Indicators Using Python and Plotly. Currently facing issues hosting the application on heroku. As soon as I am able to I'll like

4 Aug 04, 2022
Cryptocurrency Centralized Exchange Visualization

This is a simple one that uses Grafina to visualize cryptocurrency from the Bitkub exchange. This service will make a request to the Bitkub API from your wallet and save the response to Postgresql. G

Popboon Mahachanawong 1 Nov 24, 2021
Example Code Notebooks for Data Visualization in Python

This repository contains sample code scripts for creating awesome data visualizations from scratch using different python libraries (such as matplotli

Javed Ali 27 Jan 04, 2023