Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.

Overview

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

This project hosts the code for implementing the DenseCL algorithm for self-supervised representation learning.

Dense Contrastive Learning for Self-Supervised Visual Pre-Training,
Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li
In: Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2021
arXiv preprint (arXiv 2011.09157)

highlights2

Highlights

  • Boosting dense predictions: DenseCL pre-trained models largely benefit dense prediction tasks including object detection and semantic segmentation (up to +2% AP and +3% mIoU).
  • Simple implementation: The core part of DenseCL can be implemented in 10 lines of code, thus being easy to use and modify.
  • Flexible usage: DenseCL is decoupled from the data pre-processing, thus enabling fast and flexible training while being agnostic about what kind of augmentation is used and how the images are sampled.
  • Efficient training: Our method introduces negligible computation overhead (only <1% slower) compared to the baseline method.

highlights

Updates

  • Code and pre-trained models of DenseCL are released. (02/03/2021)

Installation

Please refer to INSTALL.md for installation and dataset preparation.

Models

For your convenience, we provide the following pre-trained models on COCO or ImageNet.

pre-train method pre-train dataset backbone #epoch training time VOC det VOC seg Link
Supervised ImageNet ResNet-50 - - 54.2 67.7 download
MoCo-v2 COCO ResNet-50 800 1.0d 54.7 64.5 download
DenseCL COCO ResNet-50 800 1.0d 56.7 67.5 download
DenseCL COCO ResNet-50 1600 2.0d 57.2 68.0 download
MoCo-v2 ImageNet ResNet-50 200 2.3d 57.0 67.5 download
DenseCL ImageNet ResNet-50 200 2.3d 58.7 69.4 download
DenseCL ImageNet ResNet-101 200 4.3d 61.3 74.1 download

Note:

  • The metrics for VOC det and seg are AP (COCO-style) and mIoU. The results are averaged over 5 trials.
  • The training time is measured on 8 V100 GPUs.
  • See our paper for more results on different benchmarks.

Usage

Training

./tools/dist_train.sh configs/selfsup/densecl/densecl_coco_800ep.py 8

Extracting Backbone Weights

WORK_DIR=work_dirs/selfsup/densecl/densecl_coco_800ep/
CHECKPOINT=${WORK_DIR}/epoch_800.pth
WEIGHT_FILE=${WORK_DIR}/extracted_densecl_coco_800ep.pth

python tools/extract_backbone_weights.py ${CHECKPOINT} ${WEIGHT_FILE}

Transferring to Object Detection and Segmentation

Please refer to README.md for transferring to object detection and semantic segmentation.

Tips

  • After extracting the backbone weights, the model can be used to replace the original ImageNet pre-trained model as initialization for many dense prediction tasks.
  • If your machine has a slow data loading issue, especially for ImageNet, your are suggested to convert ImageNet to lmdb format through folder2lmdb_imagenet.py, and use this config for training.

Acknowledgement

We would like to thank the OpenSelfSup for its open-source project and PyContrast for its detection evaluation configs.

Citations

Please consider citing our paper in your publications if the project helps your research. BibTeX reference is as follow.

@inproceedings{wang2020DenseCL,
  title={Dense Contrastive Learning for Self-Supervised Visual Pre-Training},
  author={Wang, Xinlong and Zhang, Rufeng and Shen, Chunhua and Kong, Tao and Li, Lei},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
Owner
Xinlong Wang
Xinlong Wang
Consensus score for tripadvisor

ContripScore ContripScore is essentially a score that combines an Internet platform rating and a consensus rating from sentiment analysis (For instanc

Pepe 1 Jan 13, 2022
Anomaly Localization in Model Gradients Under Backdoor Attacks Against Federated Learning

Federated_Learning This repo provides a federated learning framework that allows to carry out backdoor attacks under varying conditions. This is a ker

Arçelik ARGE Açık Kaynak Yazılım Organizasyonu 0 Nov 30, 2021
Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [Paper] [Colab is coming soon] Approach Example Usage To r

170 Jan 03, 2023
An open-source Deep Learning Engine for Healthcare that aims to treat & prevent major diseases

AlphaCare Background AlphaCare is a work-in-progress, open-source Deep Learning Engine for Healthcare that aims to treat and prevent major diseases. T

Siraj Raval 44 Nov 05, 2022
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

Minh-Khoi Pham 5 Nov 05, 2022
Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling Code for the paper: Greg Ver Steeg and Aram Galstyan. "Hamiltonian Dynamics with N

Greg Ver Steeg 25 Mar 14, 2022
Official Repository of NeurIPS2021 paper: PTR

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning Figure 1. Dataset Overview. Introduction A critical aspect of human vis

Yining Hong 32 Jun 02, 2022
A PyTorch Implementation of Neural IMage Assessment

NIMA: Neural IMage Assessment This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Proc

yunxiaos 418 Dec 29, 2022
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences

Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point

Fangzhou Hong 112 Dec 23, 2022
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Rowel Atienza 198 Dec 27, 2022
Arquitetura e Desenho de Software.

S203 Este é um repositório dedicado às aulas de Arquitetura e Desenho de Software, cuja sigla é "S203". E agora, José? Como não tenho muito a falar aq

Fabio 7 Oct 23, 2021
Practical tutorials and labs for TensorFlow used by Nvidia, FFN, CNN, RNN, Kaggle, AE

TensorFlow Tutorial - used by Nvidia Learn TensorFlow from scratch by examples and visualizations with interactive jupyter notebooks. Learn to compete

Alexander R Johansen 1.9k Dec 19, 2022
The official implementation of the research paper "DAG Amendment for Inverse Control of Parametric Shapes"

DAG Amendment for Inverse Control of Parametric Shapes This repository is the official Blender implementation of the paper "DAG Amendment for Inverse

Elie Michel 157 Dec 26, 2022
No-reference Image Quality Assessment(NIQA) Algorithms (BRISQUE, NIQE, PIQE, RankIQA, MetaIQA)

No-Reference Image Quality Assessment Algorithms No-reference Image Quality Assessment(NIQA) is a task of evaluating an image without a reference imag

Dae-Young Song 26 Jan 04, 2023
Tensor-Based Quantum Machine Learning

TensorLy_Quantum TensorLy-Quantum is a Python library for Tensor-Based Quantum Machine Learning that builds on top of TensorLy and PyTorch. Website: h

TensorLy 85 Dec 03, 2022
TensorFlow (Python) implementation of DeepTCN model for multivariate time series forecasting.

DeepTCN TensorFlow TensorFlow (Python) implementation of multivariate time series forecasting model introduced in Chen, Y., Kang, Y., Chen, Y., & Wang

Flavia Giammarino 21 Dec 19, 2022
Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation

Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation Introduction This is a PyTorch

XMed-Lab 30 Sep 23, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022
Official Pytorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space"

MotionCLIP Official Pytorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space". Please visit our webpage for mor

Guy Tevet 173 Dec 26, 2022
La source de mon module 'pyfade' disponible sur Pypi.

Version: 1.2 Introduction Pyfade est un module permettant de créer des dégradés colorés. Il vous permettra de changer chaque ligne de votre texte par

Billy 20 Sep 12, 2021