Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning, CVPR 2021

Overview

By Zhenda Xie*, Yutong Lin*, Zheng Zhang, Yue Cao, Stephen Lin and Han Hu.

This repo is the official PyTorch implementation of "Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning".

Introduction

PixPro (pixel-to-propagation) is an unsupervised visual feature learning approach that leverages pixel-level pretext tasks. The learned features transfer well to downstream dense prediction tasks such as object detection and semantic segmentation. With a ResNet-50 backbone, PixPro achieves the best transfer performance among the methods compared below on Pascal VOC object detection (60.2 AP using C4) and COCO object detection (41.4 / 40.5 mAP using FPN / C4).

An illustration of the proposed PixPro method.

Architecture of the PixContrast and PixPro methods.

Citation

@inproceedings{xie2020propagate,
  title={Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning},
  author={Xie, Zhenda and Lin, Yutong and Zhang, Zheng and Cao, Yue and Lin, Stephen and Hu, Han},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Main Results

PixPro pre-trained models

Epochs  Arch       Instance Branch  Download
100     ResNet-50  -                script | model
400     ResNet-50  -                script | model
100     ResNet-50  ✔️               -
400     ResNet-50  ✔️               -

Pascal VOC object detection

Faster-RCNN with C4

Method         Epochs  Arch       AP    AP50  AP75  Download
Scratch        -       ResNet-50  33.8  60.2  33.1  -
Supervised     100     ResNet-50  53.5  81.3  58.8  -
MoCo           200     ResNet-50  55.9  81.5  62.6  -
SimCLR         1000    ResNet-50  56.3  81.9  62.5  -
MoCo v2        800     ResNet-50  57.6  82.7  64.4  -
InfoMin        200     ResNet-50  57.6  82.7  64.6  -
InfoMin        800     ResNet-50  57.5  82.5  64.0  -
PixPro (ours)  100     ResNet-50  58.8  83.0  66.5  config | model
PixPro (ours)  400     ResNet-50  60.2  83.8  67.7  config | model

COCO object detection

Mask-RCNN with FPN

Method         Epochs  Arch       Schedule  bbox AP  mask AP  Download
Scratch        -       ResNet-50  1x        32.8     29.9     -
Supervised     100     ResNet-50  1x        39.7     35.9     -
MoCo           200     ResNet-50  1x        39.4     35.6     -
SimCLR         1000    ResNet-50  1x        39.8     35.9     -
MoCo v2        800     ResNet-50  1x        40.4     36.4     -
InfoMin        200     ResNet-50  1x        40.6     36.7     -
InfoMin        800     ResNet-50  1x        40.4     36.6     -
PixPro (ours)  100     ResNet-50  1x        40.8     36.8     config | model
PixPro (ours)  100*    ResNet-50  1x        41.3     37.1     -
PixPro (ours)  400*    ResNet-50  1x        41.4     37.4     -

* Indicates models trained with the instance branch.

Mask-RCNN with C4

Method         Epochs  Arch       Schedule  bbox AP  mask AP  Download
Scratch        -       ResNet-50  1x        26.4     29.3     -
Supervised     100     ResNet-50  1x        38.2     33.3     -
MoCo           200     ResNet-50  1x        38.5     33.6     -
SimCLR         1000    ResNet-50  1x        38.4     33.6     -
MoCo v2        800     ResNet-50  1x        39.5     34.5     -
InfoMin        200     ResNet-50  1x        39.0     34.1     -
InfoMin        800     ResNet-50  1x        38.8     33.8     -
PixPro (ours)  100     ResNet-50  1x        40.0     34.8     config | model
PixPro (ours)  400     ResNet-50  1x        40.5     35.3     config | model

Getting started

Requirements

At present, we have not checked the compatibility of the code with other versions of the packages, so we only recommend the following configuration.

  • Python 3.7
  • PyTorch == 1.4.0
  • Torchvision == 0.5.0
  • CUDA == 10.1
  • Other dependencies
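
Because only this configuration has been checked, it can help to compare an environment against it before training. The snippet below is a minimal sketch, not part of the repo: the helper names `find_mismatches` and `installed_versions` are made up for illustration, and the matching rule (an installed version may extend the recommended one with a patch number or a build tag such as `+cu101`) is an assumption.

```python
# Recommended versions, copied from the requirements list above.
RECOMMENDED = {
    "python": "3.7",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "cuda": "10.1",
}


def find_mismatches(installed, recommended=RECOMMENDED):
    """Return {name: (installed, recommended)} for every version that differs.

    An installed version matches when it equals the recommended string or
    extends it with another component or a build tag, so "3.7.9" matches
    "3.7" and "1.4.0+cu101" matches "1.4.0".
    """
    mismatches = {}
    for name, want in recommended.items():
        have = installed.get(name)
        ok = have is not None and (
            have == want
            or have.startswith(want + ".")
            or have.startswith(want + "+")
        )
        if not ok:
            mismatches[name] = (have, want)
    return mismatches


def installed_versions():
    """Collect versions from the current environment.

    Imports are local so that find_mismatches stays usable without PyTorch.
    """
    import platform

    import torch
    import torchvision

    return {
        "python": platform.python_version(),
        "torch": torch.__version__,
        "torchvision": torchvision.__version__,
        "cuda": torch.version.cuda,  # None for CPU-only builds
    }
```

Inside the PixPro environment, `find_mismatches(installed_versions())` returns an empty dict when everything lines up with the list above.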

Installation

We recommend using a conda environment to set up the experimental environment.

# Create environment
conda create -n PixPro python=3.7 -y
conda activate PixPro

# Install PyTorch & Torchvision
conda install pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.1 -c pytorch -y

# Install apex
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

# Clone repo
git clone https://github.com/zdaxie/PixPro ./PixPro
cd ./PixPro

# Create soft link for data
mkdir data
ln -s ${ImageNet-Path} ./data/imagenet

# Install other requirements
pip install -r requirements.txt

Pretrain with PixPro

# Train with PixPro base for 100 epochs.
./tools/pixpro_base_r50_100ep.sh

Transfer to Pascal VOC or COCO object detection

# Convert a pre-trained PixPro model to detectron2's format
cd transfer/detection
python convert_pretrain_to_d2.py ${Input-Checkpoint(.pth)} ./output.pkl  
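
For readers curious what such a conversion involves: detectron2 names ResNet parameters differently from torchvision-style checkpoints (`stem`, `res2` to `res5`, `convN.norm`, `shortcut`). The sketch below shows the kind of key renaming a converter typically performs. It is an illustration only and is not taken from `convert_pretrain_to_d2.py`; in particular the `module.encoder.` prefix and both function names are hypothetical, so check the script in the repo for the real behavior.

```python
def to_detectron2_name(key, prefix="module.encoder."):
    """Map one torchvision-style ResNet parameter name to detectron2 style.

    `prefix` is a hypothetical placeholder for whatever the pre-training
    checkpoint puts in front of backbone weights. Keys that do not start
    with it are treated as non-backbone weights and mapped to None.
    """
    if not key.startswith(prefix):
        return None
    name = key[len(prefix):]
    # Everything before the first residual stage belongs to the stem.
    if "layer" not in name:
        name = "stem." + name
    # layer1..layer4 in torchvision are res2..res5 in detectron2.
    for stage in (1, 2, 3, 4):
        name = name.replace("layer{}".format(stage), "res{}".format(stage + 1))
    # BatchNorm layers become the ".norm" child of the preceding conv.
    for idx in (1, 2, 3):
        name = name.replace("bn{}".format(idx), "conv{}.norm".format(idx))
    # The projection shortcut: conv first, then its norm.
    name = name.replace("downsample.0", "shortcut")
    name = name.replace("downsample.1", "shortcut.norm")
    return name


def rename_state_dict(state_dict, prefix="module.encoder."):
    """Apply to_detectron2_name to every key, dropping non-backbone entries."""
    renamed = {}
    for key, value in state_dict.items():
        new_key = to_detectron2_name(key, prefix)
        if new_key is not None:
            renamed[new_key] = value
    return renamed
```

The functions only touch key names, so they work on any mapping from names to tensors or arrays.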

# Install Detectron2
python -m pip install detectron2==0.2.1 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.4/index.html

# Create soft link for data
mkdir datasets
ln -s ${Pascal-VOC-Path}/VOC2007 ./datasets/VOC2007
ln -s ${Pascal-VOC-Path}/VOC2012 ./datasets/VOC2012
ln -s ${COCO-Path} ./datasets/coco

# Train detector with pre-trained PixPro model
# 1. Train Faster-RCNN with Pascal-VOC
python train_net.py --config-file configs/Pascal_VOC_R_50_C4_24k_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
# 2. Train Mask-RCNN-FPN with COCO
python train_net.py --config-file configs/COCO_R_50_FPN_1x_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
# 3. Train Mask-RCNN-C4 with COCO
python train_net.py --config-file configs/COCO_R_50_C4_1x_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
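
The commands above assume 8 GPUs. When fewer are available, a common practice (the linear scaling rule, not something this README prescribes) is to keep the per-GPU batch size, scale the learning rate with the total batch size, and lengthen the schedule accordingly. The sketch below computes such overrides for detectron2's `SOLVER.*` keys. The function names are hypothetical, and the numbers in the usage note are placeholders: read the real values from the YAML config before using it.

```python
import shlex


def rescale_solver(ims_per_batch, base_lr, max_iter, steps, ref_gpus, num_gpus):
    """Rescale solver settings from `ref_gpus` GPUs to `num_gpus` GPUs.

    Keeps the per-GPU batch size fixed, scales the learning rate linearly
    with the total batch size, and stretches the schedule so that the total
    number of images seen stays the same.
    """
    if ims_per_batch % ref_gpus != 0:
        raise ValueError("ims_per_batch must be divisible by ref_gpus")
    per_gpu = ims_per_batch // ref_gpus
    new_batch = per_gpu * num_gpus
    ratio = new_batch / ims_per_batch
    return {
        "SOLVER.IMS_PER_BATCH": new_batch,
        "SOLVER.BASE_LR": base_lr * ratio,
        "SOLVER.MAX_ITER": int(round(max_iter / ratio)),
        "SOLVER.STEPS": tuple(int(round(s / ratio)) for s in steps),
    }


def to_command_suffix(overrides):
    """Format overrides as shell-quoted KEY VALUE pairs.

    The result can be appended after MODEL.WEIGHTS in the commands above.
    """
    parts = []
    for key, value in overrides.items():
        parts.append(shlex.quote(key))
        parts.append(shlex.quote(str(value)))
    return " ".join(parts)
```

For example, with placeholder values of 16 images per batch, a base learning rate of 0.02, and a 24k-iteration schedule with steps at 18k and 22k, `rescale_solver(16, 0.02, 24000, (18000, 22000), ref_gpus=8, num_gpus=4)` halves the batch size and learning rate and doubles the iteration counts.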

# Test detector with provided fine-tuned model
python train_net.py --config-file configs/Pascal_VOC_R_50_C4_24k_PixPro.yaml --num-gpus 8 --eval-only \
  MODEL.WEIGHTS ./pixpro_base_r50_100ep_voc_md5_ec2dfa63.pth
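
The fine-tuned checkpoint name above ends in `md5_ec2dfa63`. Assuming that this suffix holds the leading hex digits of the file's MD5 digest (an assumption; the README does not state it), a download can be sanity-checked before evaluation with a sketch like the following. The helper names are hypothetical.

```python
import hashlib
import os
import re


def md5_prefix_from_name(filename):
    """Return the hex digits that follow 'md5_' in a file name, or None."""
    match = re.search(r"md5_([0-9a-f]+)", filename)
    return match.group(1) if match else None


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_name(path):
    """Check that the file's MD5 digest starts with the prefix in its name."""
    expected = md5_prefix_from_name(os.path.basename(path))
    if expected is None:
        raise ValueError("no md5_<hex> marker in file name: " + path)
    return file_md5(path).startswith(expected)
```

If `matches_name("./pixpro_base_r50_100ep_voc_md5_ec2dfa63.pth")` returns False, the download is probably incomplete, or the naming assumption does not hold for this file.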

More models and logs will be released!

Acknowledgement

Our testbed builds upon several existing, publicly available codebases. Specifically, we have modified and integrated the following code into this project:

Contributing to the project

Pull requests and issues are welcome.
