Scenic: A Jax Library for Computer Vision and Beyond

Overview

Scenic

Scenic is a codebase with a focus on research around attention-based models for computer vision. Scenic has been successfully used to develop classification, segmentation, and detection models for multiple modalities including images, video, audio, and multimodal combinations of them.

More precisely, Scenic is a (i) set of shared light-weight libraries solving tasks commonly encountered tasks when training large-scale (i.e. multi-device, multi-host) vision models; and (ii) a number of projects containing fully fleshed out problem-specific training and evaluation loops using these libraries.

Scenic is developed in JAX and uses Flax.

What we offer

Among others Scenic provides

  • Boilerplate code for launching experiments, summary writing, logging, profiling, etc;
  • Optimized training and evaluation loops, losses, metrics, bi-partite matchers, etc;
  • Input-pipelines for popular vision datasets;
  • Baseline models, including strong non-attentional baselines.

Papers using Scenic

Scenic can be used to reproduce the results from the following papers, which were either developed using Scenic, or have been reimplemented in Scenic:

Philosophy

Scenic aims to facilitate rapid prototyping of large-scale vision models. To keep the code simple to understand and extend we prefer forking and copy-pasting over adding complexity or increasing abstraction. Only when functionality proves to be widely useful across many models and tasks it may be upstreamed to Scenic's shared libraries.

Code structure

Shared libraries provided by Scenic are split into:

  • dataset_lib: Implements IO pipelines for loading and pre-processing data for common Computer Vision tasks and benchmarks. All pipelines are designed to be scalable and support multi-host and multi-device setups, taking care of dividing data among multiple hosts, incomplete batches, caching, pre-fetching, etc.
  • model_lib: Provides (i) several abstract model interfaces (e.g. ClassificationModel or SegmentationModel in model_lib.base_models) with task-specific losses and metrics; (ii) neural network layers in model_lib.layers, focusing on efficient implementation of attention and transfomer layers; and (iii) accelerator-friedly implementations of bipartite matching algorithms in model_lib.matchers.
  • train_lib: Provides tools for constructing training loops and implements several example trainers (classification trainer and segmentation trainer).
  • common_lib: Utilities that do not belong anywhere else.

Projects

Models built on top of Scenic exist as separate projects. Model-specific code such as configs, layers, losses, network architectures, or training and evaluation loops exist as separate projects.

Common baselines such as a ResNet or a Visual Transformer (ViT) are implemented in the projects/baselines project. Forking this directory is a good starting point for new projects.

There is no one-fits-all recipe for how much code should be re-used by projects. Projects can fall anywhere on the wide spectrum of code re-use: from defining new configs for an existing model to redefining models, training loop, logging, etc.

Getting started

  • See projects/baselines/README.md for a walk-through baseline models and instructions on how to run the code.
  • If you would like to to contribute to Scenic, please check out the Philisophy, Code structure and Contributing sections. Should your contribution be a part of the shared libraries, please send us a pull request!

Quick start

Download the code from GitHub

git clone https://github.com/google-research/scenic.git
cd scenic
pip install .

and run training for ViT on ImageNet:

python main.py -- \
  --config=projects/baselines/configs/imagenet/imagenet_vit_config.py \
  --workdir=./

Disclaimer: This is not an official Google product.

Owner
Google Research
Google Research
Implementation of TimeSformer, a pure attention-based solution for video classification

TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.

Phil Wang 602 Jan 03, 2023
Contrastive Loss Gradient Attack (CLGA)

Contrastive Loss Gradient Attack (CLGA) Official implementation of Unsupervised Graph Poisoning Attack via Contrastive Loss Back-propagation, WWW22 Bu

12 Dec 23, 2022
A deep learning based semantic search platform that computes similarity scores between provided query and documents

semanticsearch This is a deep learning based semantic search platform that computes similarity scores between provided query and documents. Documents

1 Nov 30, 2021
Pytorch implementation of "Neural Wireframe Renderer: Learning Wireframe to Image Translations"

Neural Wireframe Renderer: Learning Wireframe to Image Translations Pytorch implementation of ideas from the paper Neural Wireframe Renderer: Learning

Yuan Xue 7 Nov 14, 2022
Codebase for the Summary Loop paper at ACL2020

Summary Loop This repository contains the code for ACL2020 paper: The Summary Loop: Learning to Write Abstractive Summaries Without Examples. Training

Canny Lab @ The University of California, Berkeley 44 Nov 04, 2022
[NeurIPS '21] Adversarial Attacks on Graph Classification via Bayesian Optimisation (GRABNEL)

Adversarial Attacks on Graph Classification via Bayesian Optimisation @ NeurIPS 2021 This repository contains the official implementation of GRABNEL,

Xingchen Wan 12 Dec 23, 2022
SelfRemaster: SSL Speech Restoration

SelfRemaster: Self-Supervised Speech Restoration Official implementation of SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesi

Takaaki Saeki 46 Jan 07, 2023
Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

UCL Natural Language Processing 71 Dec 29, 2022
SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements (CVPR 2021)

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements (CVPR 2021) This repository contains the official PyTorch implementa

Qianli Ma 133 Jan 05, 2023
Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

MSAD Multi-Scale Aligned Distillation for Low-Resolution Detection Lu Qi*, Jason Kuen*, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya J

Jia Research Lab 115 Dec 23, 2022
Experiments with differentiable stacks and queues in PyTorch

Please use stacknn-core instead! StackNN This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in s

Will Merrill 141 Oct 06, 2022
Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds."

DeltaConv [Paper] [Project page] Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds" by Ru

98 Nov 26, 2022
thundernet ncnn

MMDetection_Lite 基于mmdetection 实现一些轻量级检测模型,安装方式和mmdeteciton相同 voc0712 voc 0712训练 voc2007测试 coco预训练 thundernet_voc_shufflenetv2_1.5 input shape mAP 320

DayBreak 39 Dec 05, 2022
Implementation of the paper "Shapley Explanation Networks"

Shapley Explanation Networks Implementation of the paper "Shapley Explanation Networks" at ICLR 2021. Note that this repo heavily uses the experimenta

68 Dec 27, 2022
Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

Context-Aware-Healthcare Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs Download

LuChang 9 Dec 26, 2022
Phylogeny Partners

Phylogeny-Partners Two states models Instalation You may need to install the cython, networkx, numpy, scipy package: pip install cython, networkx, num

1 Sep 19, 2022
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Facebook Research 338 Dec 29, 2022
This repository contains the reference implementation for our proposed Convolutional CRFs.

ConvCRF This repository contains the reference implementation for our proposed Convolutional CRFs in PyTorch (Tensorflow planned). The two main entry-

Marvin Teichmann 553 Dec 07, 2022
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥

face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa

Zhao Jian 3.1k Jan 04, 2023
Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Faster R-CNN pretrained on VisualGenome This repository modifies maskrcnn-benchmark for object detection and attribute prediction on VisualGenome data

Shizhe Chen 7 Apr 20, 2021