DALLE-tools provided useful dataset utilities to improve you workflow with WebDatasets.

Overview

DALLE tools

DALLE-tools is a github repository with useful tools to categorize, annotate or check the sanity of your datasets.

Installation

Just clone this repository to your folder and use one of the following commands in the section underneath.

WebDataset Annotator

python annotator.py

Press to switch to the next page, to change the annotation category or click on the image to add it to the current cateogry and save it in annotations.json. Please upload your annotations.json by creating a push request into community_annotations folder into the folder of the dataset you used (e.g. YFCC100m, or LAION400m etc.), so everyone can use the data for better dataset annotations! If you want to continue to annotate a dataset where someone else already started, just copy the annotations.json from the community_annotations folder and the used dataset into the root directory and run the annotator!

Screenshot

WebDataset aligner

python aligner.py

This tool helps to align the shuffled keys, so the WebDataset module can read your datasets correctly. You just need to specify the keys you want to look for and keep in your new dataset.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT

Multi-class confusion matrix library in Python

Table of contents Overview Installation Usage Document Try PyCM in Your Browser Issues & Bug Reports Todo Outputs Dependencies Contribution References

Sepand Haghighi 1.3k Dec 31, 2022
NumPy and Pandas interface to Big Data

Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar inte

Blaze 3.1k Jan 01, 2023
Learn Basic to advanced level Data visualisation techniques from this Repository

Data visualisation Hey, You can learn Basic to advanced level Data visualisation techniques from this Repository. Data visualization is the graphic re

Shashank dwivedi 16 Jan 03, 2023
Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.

Jupyter DataTables Jupyter Notebook extension to leverage pandas DataFrames by integrating DataTables JS. About Data scientists and in fact many devel

Marek Čermák 142 Dec 28, 2022
Visualize your pandas data with one-line code

PandasEcharts 简介 基于pandas和pyecharts的可视化工具 安装 pip 安装 $ pip install pandasecharts 源码安装 $ git clone https://github.com/gamersover/pandasecharts $ cd pand

陈华杰 2 Apr 13, 2022
Blender addon that creates a temporary window of any type from the 3D View.

CreateTempWindow2.8 Blender addon that creates a temporary window of any type from the 3D View. Features Can the following window types: 3D View Graph

3 Nov 27, 2022
Geospatial Data Visualization using PyGMT

Example script to visualize topographic data, earthquake data, and tomographic data on a map

Utpal Kumar 2 Jul 30, 2022
An animation engine for explanatory math videos

Powered By: An animation engine for explanatory math videos Hi there, I'm Zheer 👋 I'm a Software Engineer and student!! 🌱 I’m currently learning eve

Zaheer ud Din Faiz 2 Nov 04, 2021
JSNAPY example: Validate NAT policies

JSNAPY example: Validate NAT policies Overview This example will show how to use JSNAPy to make sure the expected NAT policy matches are taking place.

Calvin Remsburg 1 Jan 07, 2022
股票行情实时数据接口-A股,完全免费的沪深证券股票数据-中国股市,python最简封装的API接口

股票行情实时数据接口-A股,完全免费的沪深证券股票数据-中国股市,python最简封装的API接口,包含日线,历史K线,分时线,分钟线,全部实时采集,系统包括新浪腾讯双数据核心采集获取,自动故障切换,STOCK数据格式成DataFrame格式,可用来查询研究量化分析,股票程序自动化交易系统.为量化研究者在数据获取方面极大地减轻工作量,更加专注于策略和模型的研究与实现。

dev 572 Jan 08, 2023
Python package to Create, Read, Write, Edit, and Visualize GSFLOW models

pygsflow pyGSFLOW is a python package to Create, Read, Write, Edit, and Visualize GSFLOW models API Documentation pyGSFLOW API documentation can be fo

pyGSFLOW 21 Dec 14, 2022
A streamlit component for bi-directional communication with bokeh plots.

Streamlit Bokeh Events A streamlit component for bi-directional communication with bokeh plots. Its just a workaround till streamlit team releases sup

Ashish Shukla 123 Dec 25, 2022
Small project to recursively calculate and plot each successive order of the Hilbert Curve

hilbert-curve Small project to recursively calculate and plot each successive order of the Hilbert Curve. After watching 3Blue1Brown's video on Hilber

Stefan Mejlgaard 2 Nov 15, 2021
`charts.css.py` brings `charts.css` to Python. Online documentation and samples is available at the link below.

charts.css.py charts.css.py provides a python API to convert your 2-dimension data lists into html snippet, which will be rendered into charts by CSS,

Ray Luo 3 Sep 23, 2021
python partial dependence plot toolbox

PDPbox python partial dependence plot toolbox Motivation This repository is inspired by ICEbox. The goal is to visualize the impact of certain feature

Li Jiangchun 723 Jan 07, 2023
Data Visualization Guide for Presentations, Reports, and Dashboards

This is a highly practical and example-based guide on visually representing data in reports and dashboards.

Anton Zhiyanov 395 Dec 29, 2022
A python script editor for napari based on PyQode.

napari-script-editor A python script editor for napari based on PyQode. This napari plugin was generated with Cookiecutter using with @napari's cookie

Robert Haase 9 Sep 20, 2022
Fast scatter density plots for Matplotlib

About Plotting millions of points can be slow. Real slow... 😴 So why not use density maps? ⚡ The mpl-scatter-density mini-package provides functional

Thomas Robitaille 473 Dec 12, 2022
A guide for using Bootstrap 5 classes in Dash Bootstrap Components V1

dash-bootstrap-cheatsheet This handy interactive cheatsheet makes it easy to use the Bootstrap 5 classes with your Dash app made with the latest versi

10 Dec 22, 2022
nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation.

nptsne nptsne is a numpy compatible python binary package that offers a number of APIs for fast tSNE calculation and HSNE modelling. For more detail s

Biomedical Visual Analytics Unit LUMC - TU Delft 29 Jul 05, 2022