Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

Overview

SW-CV-ModelZoo

Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset


Framework: TF/Keras 2.7

Training SQLite DB built using fire-egg's tools: https://github.com/fire-eggs/Danbooru2019

Currently training on Danbooru2021, 512px SFW subset (sans the rating:q images that had been included in the 2022-01-21 release of the dataset)

Reference:

Anonymous, The Danbooru Community, & Gwern Branwen; “Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, 2022-01-21. Web. Accessed 2022-01-28 https://www.gwern.net/Danbooru2021


Journal

06/02/2022: great news crew! TRC allowed me to use a bunch of TPUs!

To make better use of this amount of compute I had to overhaul a number of components, so a bunch of things are likely to have fallen to bitrot in the process. I can only guarantee NFNet can work pretty much as before with the right arguments.
NFResNet changes should have left it retrocompatible with the previous version.
ResNet has been streamlined to be mostly in line with the Bag-of-Tricks paper (arXiv:1812.01187) with the exception of the stem. It is not compatible with the previous version of the code.

The training labels have been included in the 2021_0000_0899 folder for convenience.
The list of files used for training is going to be uploaded as a GitHub Release.

Now for some numbers:
compared to my previous best run, the one that resulted in NFNetL1V1-100-0.57141:

  • I'm using 1.86x the amount of images: 2.8M vs 1.5M
  • I'm training bigger models: 61M vs 45M params
  • ... in less time: 232 vs 700 hours of processor time
  • don't get me started on actual wall clock time
  • with a few amenities thrown in: ECA for channel attention, SiLU activation

And it's all thanks to the folks at TRC, so shout out to them!

I currently have a few runs in progress across a couple of dimensions:

  • effect of model size with NFNet L0/L1/L2, with SiLU and ECA for all three of them
  • effect of activation function with NFNet L0, with SiLU/HSwish/ReLU, no ECA

Once the experiments are over, the plan is to select the network definitions that lay on the Pareto curve between throughput and F1 score and release the trained weights.

One last thing.
I'd like to call your attention to the tools/cleanlab_stuff.py script.
It reads two files: one with the binarized labels from the database, the other with the predicted probabilities.
It then uses the cleanlab package to estimate whether if an image in a set could be missing a given label. At the end it stores its conclusions in a json file.
This file could, potentially, be used in some tool to assist human intervention to add the missing tags.

You might also like...
Human head pose estimation using Keras over TensorFlow.
Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

QKeras: a quantization deep learning library for Tensorflow Keras

QKeras github.com/google/qkeras QKeras 0.8 highlights: Automatic quantization using QKeras; Stochastic behavior (including stochastic rouding) is disa

Hyperparameter Optimization for TensorFlow, Keras and PyTorch
Hyperparameter Optimization for TensorFlow, Keras and PyTorch

Hyperparameter Optimization for Keras Talos • Key Features • Examples • Install • Support • Docs • Issues • License • Download Talos radically changes

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Deep GPs built on top of TensorFlow/Keras and GPflow

GPflux Documentation | Tutorials | API reference | Slack What does GPflux do? GPflux is a toolbox dedicated to Deep Gaussian processes (DGP), the hier

tf2onnx - Convert TensorFlow, Keras and Tflite models to ONNX.

tf2onnx converts TensorFlow (tf-1.x or tf-2.x), tf.keras and tflite models to ONNX via command line or python api.

Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Releases(models_db2021_5500_2022_10_21)
  • models_db2021_5500_2022_10_21(Oct 21, 2022)

    ConvNext B, ViT B16
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    top 5500 tags (2021_0000_0899_5500/selected_tags.csv)
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_09_25_2022_05h13m55s | B | 93.2M | 448 | 0.3673 | 0.6941 | | ViTB16_09_25_2022_04h53m38s | B16 | 90.5M | 448 | 0.3663 | 0.6918 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_09_25_2022_05h13m55s.7z(322.58 MB)
    ViTB16_09_25_2022_04h53m38s.7z(312.96 MB)
  • convnexts_db2021_2022_03_22(Mar 22, 2022)

    ConvNext, T/S/B
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_03_10_2022_21h41m23s | B | 90.01M | 448 | 0.3372 | 0.6892 | | ConvNextSV1_03_11_2022_17h49m56s | S | 51.28M | 384 | 0.3301 | 0.6798 | | ConvNextTV1_03_05_2022_15h56m42s | T | 29.65M | 320 | 0.3259 | 0.6595 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_03_10_2022_21h41m23s.7z(311.29 MB)
    ConvNextSV1_03_11_2022_17h49m56s.7z(177.36 MB)
    ConvNextTV1_03_05_2022_15h56m42s.7z(102.96 MB)
  • nfnets_db2021_2022_03_04(Mar 4, 2022)

    NFNet, L0/L1/L2 (based on timm Lx model definitions) Trained on Danbooru2021 512px SFW subset, modulos 0000-0899 alpha to white padding to make the image square is white channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | NFNetL2V1_02_20_2022_10h27m08s | L2 | 60.96M | 448 | 0.3231 | 0.6785 | | NFNetL1V1_02_17_2022_20h18m38s | L1 | 45.65M | 384 | 0.3259 | 0.6691 | | NFNetL0V1_02_10_2022_17h50m14s | L0 | 27.32M | 320 | 0.3190 | 0.6509 |

    Source code(tar.gz)
    Source code(zip)
    NFNetL0V1_02_10_2022_17h50m14s.7z(94.98 MB)
    NFNetL1V1_02_17_2022_20h18m38s.7z(157.97 MB)
    NFNetL2V1_02_20_2022_10h27m08s.7z(210.49 MB)
  • nfnet_tpu_training(Feb 6, 2022)

  • NFNetL1V1-100-0.57141(Dec 31, 2021)

    • NFNet, L1 (based on timm Lx model definitions), 100 epochs, F1 @ 0.4 at the end of the 100th epoch was 0.57141
    • Trained on Danbooru2020 512px SFW subset, modulos 0000-0599
    • 320px per side
    • alpha to white
    • padding to make the image square is white
    • channel order is BGR, scaled to 0-1
    • mixup alpha = 0.2 during epochs 76-100
    • analyze_metrics on Danbooru2020 original set, modulos 0984-0999: {'thres': 0.3485, 'F1': 0.6133, 'F2': 0.6133, 'MCC': 0.6094, 'A': 0.9923, 'R': 0.6133, 'P': 0.6133}
    • analyze_metrics on image IDs 4970000-5000000: {'thres': 0.3148, 'F1': 0.5942, 'F2': 0.5941, 'MCC': 0.5892, 'A': 0.9901, 'R': 0.5940, 'P': 0.5943}
    Source code(tar.gz)
    Source code(zip)
    NFNetL1V1-100-0.57141.7z(158.09 MB)
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
Transfer Learning library for Deep Neural Networks.

Transfer and meta-learning in Python Each folder in this repository corresponds to a method or tool for transfer/meta-learning. xfer-ml is a standalon

Amazon 245 Dec 08, 2022
This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"

Word-Level Coreference Resolution This is a repository with the code to reproduce the experiments described in the paper of the same name, which was a

79 Dec 27, 2022
Capsule endoscopy detection DACON challenge

capsule_endoscopy_detection (DACON Challenge) Overview Yolov5, Yolor, mmdetection기반의 모델을 사용 (총 11개 모델 앙상블) 모든 모델은 학습 시 Pretrained Weight을 yolov5, yolo

MAILAB 11 Nov 25, 2022
QI-Q RoboMaster2022 CV Algorithm

QI-Q RoboMaster2022 CV Algorithm

2 Jan 10, 2022
Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

Introduction 1. Usage (For MSS) 1.1 Prepare running environment 1.2 Use pretrained model 1.3 Train new MSS models from scratch 1.3.1 How to train 1.3.

Leo 100 Dec 25, 2022
Official Pytorch Implementation of GraphiT

GraphiT: Encoding Graph Structure in Transformers This repository implements GraphiT, described in the following paper: Grégoire Mialon*, Dexiong Chen

Inria Thoth 80 Nov 27, 2022
Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" Introduction This repo is official Py

Gyeongsik Moon 677 Dec 25, 2022
BOOKSUM: A Collection of Datasets for Long-form Narrative Summarization

BOOKSUM: A Collection of Datasets for Long-form Narrative Summarization Authors: Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong,

Salesforce 125 Dec 31, 2022
optimization routines for hyperparameter tuning

Hyperopt: Distributed Hyperparameter Optimization Hyperopt is a Python library for serial and parallel optimization over awkward search spaces, which

Marc Claesen 398 Nov 09, 2022
This repository contains the code used for Predicting Patient Outcomes with Graph Representation Learning (https://arxiv.org/abs/2101.03940).

Predicting Patient Outcomes with Graph Representation Learning This repository contains the code used for Predicting Patient Outcomes with Graph Repre

Emma Rocheteau 76 Dec 22, 2022
PyTorch implementation for "Sharpness-aware Quantization for Deep Neural Networks".

Sharpness-aware Quantization for Deep Neural Networks This is the official repository for our paper: Sharpness-aware Quantization for Deep Neural Netw

Zhuang AI Group 30 Dec 19, 2022
Deep and online learning with spiking neural networks in Python

Introduction The brain is the perfect place to look for inspiration to develop more efficient neural networks. One of the main differences with modern

Jason Eshraghian 447 Jan 03, 2023
Cross-platform-profile-pic-changer - Script to change profile pictures across multiple platforms

cross-platform-profile-pic-changer script to change profile pictures across mult

4 Jan 17, 2022
Shape-Adaptive Selection and Measurement for Oriented Object Detection

Source Code of AAAI22-2171 Introduction The source code includes training and inference procedures for the proposed method of the paper submitted to t

houliping 24 Nov 29, 2022
Seg-Torch for Image Segmentation with Torch

Seg-Torch for Image Segmentation with Torch This work was sparked by my personal research on simple segmentation methods based on deep learning. It is

Eren Gölge 37 Dec 12, 2022
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 01, 2023
CR-Fill: Generative Image Inpainting with Auxiliary Contextual Reconstruction. ICCV 2021

crfill Usage | Web App | | Paper | Supplementary Material | More results | code for paper ``CR-Fill: Generative Image Inpainting with Auxiliary Contex

182 Dec 20, 2022
A simple consistency training framework for semi-supervised image semantic segmentation

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation PseudoSeg is a simple consistency training framework for semi-supervised image semantic s

Google Interns 143 Dec 13, 2022