Focus on Algorithm Design, Not on Data Wrangling

Overview

The visual data management platform from Zensors.


Join for free at app.datatap.dev.

The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.


Documentation

Full documentation is available at docs.datatap.dev.

Features

  • Begin training instantly
  • 🔥 Works with all major ML frameworks (Pytorch, TensorFlow, etc.)
  • 🛰️ Real-time streaming to avoid large dataset downloads
  • 🌐 Universal data format for simple data exchange
  • 🎨 Combine data from multiples sources into a single dataset easily
  • 🧮 Rich ML utilities to compute PR-curves, confusion matrices, and accuracy metrics.
  • 💽 Free access to a variety of open datasets.

Getting Started (Platform)

To begin, select a dataset from the dataTap repository.

Then copy the starter code based on your library preference.

Paste the starter code and start training.

Getting Started (API)

Install the client library.

pip install datatap

Register at app.datatap.dev. Then, go to Settings > Api Keys to find your personal API key.

export DATATAP_API_KEY="XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXX"

Start using open datasets instantly.

from datatap import Api

api = Api()
coco = api.get_default_database().get_repository("_/coco")
dataset = coco.get_dataset("latest")
print("COCO: ", dataset)

Data Streaming Example

import itertools
from datatap import Api

api = Api()
dataset = (api
    .get_default_database()
    .get_repository("_/wider-person")
    .get_dataset("latest")
)

training_stream = dataset_version.stream_split("training")
for annotation in itertools.islice(training_stream, 5):
    print("Received annotation:", annotation)

More Examples

Support and FAQ

Q. How do I resolve a missing API Key?

If you see the error Exception: No API key available. Either provide it or use the [DATATAP_API_KEY] environment variable, then the dataTap library was not able to find your API key. You can find your API key on app.datatap.dev under settings. You can either set it as an environment variable or as the first argument to the Api constructor.

Q. Can dataTap be used offline?

Some functionality can be used offline, such as the droplet utilities and metrics. However, repository access and dataset streaming require internet access, even for local databases.

Q. Is dataTap accepting contributions?

dataTap currently uses a separate code review system for managing contributions. The team is looking into switching that system to GitHub to allow public contributions. Until then, we will actively monitor the GitHub issue tracker to help accomodate the community's needs.

Q. How can I get help using dataTap?

You can post a question in the issue tracker. The dataTap team actively monitors the repository, and will try to get back to you as soon as possible.

A streamlit component for bi-directional communication with bokeh plots.

Streamlit Bokeh Events A streamlit component for bi-directional communication with bokeh plots. Its just a workaround till streamlit team releases sup

Ashish Shukla 123 Dec 25, 2022
A curated list of awesome Dash (plotly) resources

Awesome Dash A curated list of awesome Dash (plotly) resources Dash is a productive Python framework for building web applications. Written on top of

Luke Singham 1.7k Dec 26, 2022
Log visualizer for whirl-framework

Lumberjack Log visualizer for whirl-framework Установка pip install -r requirements.txt Как пользоваться python3 lumberjack.py -l путь до лога -o

Vladimir Malinovskii 2 Dec 19, 2022
Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

Spotify 3.2k Jan 04, 2023
Area-weighted venn-diagrams for Python/matplotlib

Venn diagram plotting routines for Python/Matplotlib Routines for plotting area-weighted two- and three-circle venn diagrams. Installation The simples

Konstantin Tretyakov 400 Dec 31, 2022
erdantic is a simple tool for drawing entity relationship diagrams (ERDs) for Python data model classes

erdantic is a simple tool for drawing entity relationship diagrams (ERDs) for Python data model classes. Diagrams are rendered using the venerable Graphviz library.

DrivenData 129 Jan 04, 2023
ecoglib: visualization and statistics for high density microecog signals

ecoglib: visualization and statistics for high density microecog signals This library contains high-level analysis tools for "topos" and "chronos" asp

1 Nov 17, 2021
A D3.js plugin that produces flame graphs from hierarchical data.

d3-flame-graph A D3.js plugin that produces flame graphs from hierarchical data. If you don't know what flame graphs are, check Brendan Gregg's post.

Martin Spier 740 Dec 29, 2022
Parse Robinhood 1099 Tax Document from PDF into CSV

Robinhood 1099 Parser This project converts Robinhood Securities 1099 tax document from PDF to CSV file. This tool will be helpful for those who need

Keun Tae (Kevin) Park 52 Jun 10, 2022
GitHub Stats Visualizations : Transparent

GitHub Stats Visualizations : Transparent Generate visualizations of GitHub user and repository statistics using GitHub Actions. ⚠️ Disclaimer The pro

YuanYap 7 Apr 05, 2022
Script to create an animated data visualisation for categorical timeseries data - GIF choropleth map with annotations.

choropleth_ldn Simple script to create a chloropleth map of London with categorical timeseries data. The script in main.py creates a gif of the most f

1 Oct 07, 2021
Fast 1D and 2D histogram functions in Python

About Sometimes you just want to compute simple 1D or 2D histograms with regular bins. Fast. No nonsense. Numpy's histogram functions are versatile, a

Thomas Robitaille 237 Dec 18, 2022
Project coded in Python using Pandas to look at changes in chase% for batters facing a pitcher first time through the order vs. thrid time

Project coded in Python using Pandas to look at changes in chase% for batters facing a pitcher first time through the order vs. thrid time

Jason Kraynak 1 Jan 07, 2022
Because trello only have payed options to generate a RunUp chart, this solves that!

Trello Runup Chart Generator The basic concept of the project is that Corello is pay-to-use and want to use Trello To-Do/Doing/Done automation with gi

Rômulo Schiavon 1 Dec 21, 2021
Custom ROI in Computer Vision Applications

EasyROI Helper library for drawing ROI in Computer Vision Applications Table of Contents EasyROI Table of Contents About The Project Tech Stack File S

43 Dec 09, 2022
Piglet-shaders - PoC of custom shaders for Piglet

Piglet custom shader PoC This is a PoC for compiling Piglet fragment shaders usi

6 Mar 10, 2022
Insert SVGs into matplotlib

Insert SVGs into matplotlib

Andrew White 35 Dec 29, 2022
a robust room presence solution for home automation with nearly no false negatives

Argos Room Presence This project builds a room presence solution on top of Argos. Using just a cheap raspberry pi zero w (plus an attached pi camera,

Angad Singh 46 Sep 18, 2022
China and India Population and GDP Visualization

China and India Population and GDP Visualization Historical Population Comparison between India and China This graph shows the population data of Indi

Nicolas De Mello 10 Oct 27, 2021
Create SVG drawings from vector geodata files (SHP, geojson, etc).

SVGIS Create SVG drawings from vector geodata files (SHP, geojson, etc). SVGIS is great for: creating small multiples, combining lots of datasets in a

Neil Freeman 78 Dec 09, 2022