Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Related tags

Deep LearningFoID
Overview

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon!

💡 Collated best practices from most popular ML research repositories - now official guidelines at NeurIPS 2021!

Based on analysis of more than 200 Machine Learning repositories, these recommendations facilitate reproducibility and correlate with GitHub stars - for more details, see our our blog post.

For NeurIPS 2021 code submissions it is recommended (but not mandatory) to use the README.md template and check as many items on the ML Code Completeness Checklist (described below) as possible.

📋 README.md template

We provide a README.md template that you can use for releasing ML research repositories. The sections in the template were derived by looking at existing repositories, seeing which had the best reception in the community, and then looking at common components that correlate with popularity.

✓ ML Code Completeness Checklist

We compiled this checklist by looking at what's common to the most popular ML research repositories. In addition, we prioritized items that facilitate reproducibility and make it easier for others build upon research code.

The ML Code Completeness Checklist consists of five items:

  1. Specification of dependencies
  2. Training code
  3. Evaluation code
  4. Pre-trained models
  5. README file including table of results accompanied by precise commands to run/produce those results

We verified that repositories that check more items on the checklist also tend to have a higher number of GitHub stars. This was verified by analysing official NeurIPS 2019 repositories - more details in the blog post. We also provide the data and notebook to reproduce this analysis from the post.

NeurIPS 2019 repositories that had all five of these components had the highest number of GitHub stars (median of 196 and mean of 2,664 stars).

We explain each item on the checklist in detail blow.

1. Specification of dependencies

If you are using Python, this means providing a requirements.txt file (if using pip and virtualenv), providing environment.yml file (if using anaconda), or a setup.py if your code is a library.

It is good practice to provide a section in your README.md that explains how to install these dependencies. Assume minimal background knowledge and be clear and comprehensive - if users cannot set up your dependencies they are likely to give up on the rest of your code as well.

If you wish to provide whole reproducible environments, you might want to consider using Docker and upload a Docker image of your environment into Dockerhub.

2. Training code

Your code should have a training script that can be used to obtain the principal results stated in the paper. This means you should include hyperparameters and any tricks that were used in the process of getting your results. To maximize usefulness, ideally this code should be written with extensibility in mind: what if your user wants to use the same training script on their own dataset?

You can provide a documented command line wrapper such as train.py to serve as a useful entry point for your users.

3. Evaluation code

Model evaluation and experiments often depend on subtle details that are not always possible to explain in the paper. This is why including the exact code you used to evaluate or run experiments is helpful to give a complete description of the procedure. In turn, this helps the user to trust, understand and build on your research.

You can provide a documented command line wrapper such as eval.py to serve as a useful entry point for your users.

4. Pre-trained models

Training a model from scratch can be time-consuming and expensive. One way to increase trust in your results is to provide a pre-trained model that the community can evaluate to obtain the end results. This means users can see the results are credible without having to train afresh.

Another common use case is fine-tuning for downstream task, where it's useful to release a pretrained model so others can build on it for application to their own datasets.

Lastly, some users might want to try out your model to see if it works on some example data. Providing pre-trained models allows your users to play around with your work and aids understanding of the paper's achievements.

5. README file includes table of results accompanied by precise command to run to produce those results

Adding a table of results into README.md lets your users quickly understand what to expect from the repository (see the README.md template for an example). Instructions on how to reproduce those results (with links to any relevant scripts, pretrained models etc) can provide another entry point for the user and directly facilitate reproducibility. In some cases, the main result of a paper is a Figure, but that might be more difficult for users to understand without reading the paper.

You can further help the user understand and contextualize your results by linking back to the full leaderboard that has up-to-date results from other papers. There are multiple leaderboard services where this information is stored.

🎉 Additional awesome resources for releasing research code

Hosting pretrained models files

  1. Zenodo - versioning, 50GB, free bandwidth, DOI, provides long-term preservation
  2. GitHub Releases - versioning, 2GB file limit, free bandwidth
  3. OneDrive - versioning, 2GB (free)/ 1TB (with Office 365), free bandwidth
  4. Google Drive - versioning, 15GB, free bandwidth
  5. Dropbox - versioning, 2GB (paid unlimited), free bandwidth
  6. AWS S3 - versioning, paid only, paid bandwidth
  7. huggingface_hub - versioning, no size limitations, free bandwidth
  8. DAGsHub - versioning, no size limitations, free bandwith
  9. CodaLab Worksheets - 10GB, free bandwith

Managing model files

  1. RClone - provides unified access to many different cloud storage providers

Standardized model interfaces

  1. PyTorch Hub
  2. Tensorflow Hub
  3. Hugging Face NLP models
  4. ONNX

Results leaderboards

  1. Papers with Code leaderboards - with 4000+ leaderboards
  2. CodaLab Competitions - with 450+ leaderboards
  3. EvalAI - with 100+ leaderboards
  4. NLP Progress - with 90+ leaderboards
  5. Collective Knowledge - with 40+ leaderboards
  6. Weights & Biases - Benchmarks - with 9+ leaderboards

Making project pages

  1. GitHub pages
  2. Fastpages

Making demos, tutorials, executable papers

  1. Google Colab
  2. Binder
  3. Streamlit
  4. CodaLab Worksheets

Contributing

If you'd like to contribute, or have any suggestions for these guidelines, you can contact us at [email protected] or open an issue on this GitHub repository.

All contributions welcome! All content in this repository is licensed under the MIT license.

Pytorch0.4.1 codes for InsightFace

InsightFace_Pytorch Pytorch0.4.1 codes for InsightFace 1. Intro This repo is a reimplementation of Arcface(paper), or Insightface(github) For models,

1.5k Jan 01, 2023
An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022
Fair Recommendation in Two-Sided Platforms

Fair Recommendation in Two-Sided Platforms

gourabgggg 1 Nov 10, 2021
Spectralformer: Rethinking hyperspectral image classification with transformers

Spectralformer: Rethinking hyperspectral image classification with transformers Danfeng Hong, Zhu Han, Jing Yao, Lianru Gao, Bing Zhang, Antonio Plaza

Danfeng Hong 102 Dec 29, 2022
Freecodecamp Scientific Computing with Python Certification; Solution for Challenge 2: Time Calculator

Assignment Write a function named add_time that takes in two required parameters and one optional parameter: a start time in the 12-hour clock format

Hellen Namulinda 0 Feb 26, 2022
This repo contains the code for paper Inverse Weighted Survival Games

Inverse-Weighted-Survival-Games This repo contains the code for paper Inverse Weighted Survival Games instructions general loss function (--lfn) can b

3 Jan 12, 2022
ML From Scratch

ML from Scratch MACHINE LEARNING TOPICS COVERED - FROM SCRATCH Linear Regression Logistic Regression K Means Clustering K Nearest Neighbours Decision

Tanishq Gautam 66 Nov 02, 2022
Human Detection - Pedestrian Detection using OpenCV Python

Pedestrian Detection using OpenCV Python Follow us on Instagram for Machine Lear

Hrishikesh Dutta 1 Jan 23, 2022
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

43 Nov 21, 2022
Automatically creates genre collections for your Plex media

Plex Auto Genres Plex Auto Genres is a simple script that will add genre collection tags to your media making it much easier to search for genre speci

Shane Israel 63 Dec 31, 2022
An official PyTorch implementation of the TKDE paper "Self-Supervised Graph Representation Learning via Topology Transformations".

Self-Supervised Graph Representation Learning via Topology Transformations This repository is the official PyTorch implementation of the following pap

Hsiang Gao 2 Oct 31, 2022
MTCNN face detection implementation for TensorFlow, as a PIP package.

MTCNN Implementation of the MTCNN face detector for Keras in Python3.4+. It is written from scratch, using as a reference the implementation of MTCNN

Iván de Paz Centeno 1.9k Dec 30, 2022
This program creates a formatted excel file which highlights the undervalued stock according to Graham's number.

Over-and-Undervalued-Stocks Of Nepse Using Graham's Number Scrap the latest data using different websites and creates a formatted excel file that high

6 May 03, 2022
Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D) Code & Data Appendix for Conjugated Discrete Distributions for Distr

1 Jan 11, 2022
Contenido del curso Bases de datos del DCC PUC versión 2021-2

IIC2413 - Bases de Datos Tabla de contenidos Equipo Profesores Ayudantes Contenidos Calendario Evaluaciones Resumen de notas Foro Política de integrid

54 Nov 23, 2022
Magisk module to enable hidden features on Android 12 Developer Preview 1.

Android 12 Extensions This is a Magisk module that enables hidden features on Android 12 Developer Preview 1. Features Scrolling screenshots Wallpaper

Danny Lin 384 Jan 06, 2023
Learning a mapping from images to psychological similarity spaces with neural networks.

LearningPsychologicalSpaces v0.1: v1.1: v1.2: v1.3: v1.4: v1.5: The code in this repository explores learning a mapping from images to psychological s

Lucas Bechberger 8 Dec 12, 2022
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks Overview Unsupervised phone and word segmentation on speech data is pe

Herman Kamper 13 Dec 11, 2022
This is the official Pytorch implementation of the paper "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model"

Diverse Motion Stylization (Official) This is the official Pytorch implementation of this paper. Diverse Motion Stylization for Multiple Style Domains

Soomin Park 28 Dec 16, 2022
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

COCO-LM This repository contains the scripts for fine-tuning COCO-LM pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: COCO-LM: Correcting an

Microsoft 106 Dec 12, 2022