[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

Overview

MixFormer

The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention

PWC

PWC

[Models and Raw results] (Google Driver) [Models and Raw results] (Baidu Driver: hmuv)

MixFormer_Framework

News

[Mar 21, 2022]

  • MixFormer is accepted to CVPR2022.
  • We release Code, models and raw results.

[Mar 29, 2022]

  • Our paper is selected for an oral presentation.

Highlights

New transformer tracking framework

MixFormer is composed of a target-search mixed attention (MAM) based backbone and a simple corner head, yielding a compact tracking pipeline without an explicit integration module.

End-to-end, Positional-embedding-free, multi-feature-aggregation-free

Mixformer is an end-to-end tracking framework without post-processing. Compared with other transformer trackers, MixFormer doesn's use positional embedding, attentional mask and multi-layer feature aggregation strategy.

Strong performance

Tracker VOT2020 (EAO) LaSOT (NP) GOT-10K (AO) TrackingNet (NP)
MixFormer 0.555 79.9 70.7 88.9
ToMP101* (CVPR2022) - 79.2 - 86.4
SBT-large* (CVPR2022) 0.529 - 70.4 -
SwinTrack* (Arxiv2021) - 78.6 69.4 88.2
Sim-L/14* (Arxiv2022) - 79.7 69.8 87.4
STARK (ICCV2021) 0.505 77.0 68.8 86.9
KeepTrack (ICCV2021) - 77.2 - -
TransT (CVPR2021) 0.495 73.8 67.1 86.7
TrDiMP (CVPR2021) - - 67.1 83.3
Siam R-CNN (CVPR2020) - 72.2 64.9 85.4
TREG (Arxiv2021) - 74.1 66.8 83.8

Install the environment

Use the Anaconda

conda create -n mixformer python=3.6
conda activate mixformer
bash install_pytorch17.sh

Data Preparation

Put the tracking datasets in ./data. It should look like:

${MixFormer_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- train2017
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST

Set project paths

Run the following command to set paths for this project

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also modify paths by editing these two files

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing

Train MixFormer

Training with multiple GPUs using DDP. More details of other training settings can be found at tracking/train_mixformer.sh

# MixFormer
bash tracking/train_mixformer.sh

Test and evaluate MixFormer on benchmarks

  • LaSOT/GOT10k-test/TrackingNet/OTB100/UAV123. More details of test settings can be found at tracking/test_mixformer.sh
bash tracking/test_mixformer.sh
  • VOT2020
    Before evaluating "MixFormer+AR" on VOT2020, please install some extra packages following external/AR/README.md. Also, the VOT toolkit is required to evaluate our tracker. To download and instal VOT toolkit, you can follow this tutorial. For convenience, you can use our example workspaces of VOT toolkit under external/vot20/ by setting trackers.ini.
cd external/vot20/<workspace_dir>
vot evaluate --workspace . MixFormerPython
# generating analysis results
vot analysis --workspace . --nocache

Run MixFormer on your own video

bash tracking/run_video_demo.sh

Compute FLOPs/Params and test speed

bash tracking/profile_mixformer.sh

Visualize attention maps

bash tracking/vis_mixformer_attn.sh

vis_attn

Model Zoo and raw results

The trained models and the raw tracking results are provided in the [Models and Raw results] (Google Driver) or [Models and Raw results] (Baidu Driver: hmuv).

Contact

Yutao Cui: [email protected]

Cheng Jiang: [email protected]

Acknowledgments

  • Thanks for PyTracking Library and STARK Library, which helps us to quickly implement our ideas.
  • We use the implementation of the CvT from the official repo CvT.
Owner
Multimedia Computing Group, Nanjing University
Multimedia Computing Group, Nanjing University
TransCD: Scene Change Detection via Transformer-based Architecture

TransCD: Scene Change Detection via Transformer-based Architecture

wangzhixue 29 Dec 11, 2022
Self-Supervised depth kalilia

Self-Supervised depth kalilia

24 Oct 15, 2022
Accuracy Aligned. Concise Implementation of Swin Transformer

Accuracy Aligned. Concise Implementation of Swin Transformer This repository contains the implementation of Swin Transformer, and the training codes o

FengWang 77 Dec 16, 2022
Serverless proxy for Spark cluster

Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f

hydrosphere.io 317 Dec 01, 2022
Zsseg.baseline - Zero-Shot Semantic Segmentation

This repo is for our paper A Simple Baseline for Zero-shot Semantic Segmentation

98 Dec 20, 2022
Bot developed in Python that automates races in pegaxy.

español | português About it: This is a fork from pega-racing-bot. This bot, developed in Python, is to automate races in pegaxy. The game developers

4 Apr 08, 2022
This script runs neural style transfer against the provided content image.

Neural Style Transfer Content Style Output Description: This script runs neural style transfer against the provided content image. The content image m

Martynas Subonis 0 Nov 25, 2021
A solution to the 2D Ising model of ferromagnetism, implemented using the Metropolis algorithm

Solving the Ising model on a 2D lattice using the Metropolis Algorithm Introduction The Ising model is a simplified model of ferromagnetism, the pheno

Rohit Prabhu 5 Nov 13, 2022
Official PyTorch implementation of "Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning" (ICCV2021 Oral)

MeTAL - Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning (ICCV2021 Oral) Sungyong Baik, Janghoon Choi, Heewon Kim, Dohee Cho, Jaes

Sungyong Baik 44 Dec 29, 2022
This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector of the financial market.

GPlearn_finiance_stock_futures_extension This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector

Chengwei <a href=[email protected]"> 189 Dec 25, 2022
Official pytorch implementation of the IrwGAN for unaligned image-to-image translation

IrwGAN (ICCV2021) Unaligned Image-to-Image Translation by Learning to Reweight [Update] 12/15/2021 All dataset are released, trained models and genera

37 Nov 09, 2022
Session-based Recommendation, CoHHN, price preferences, interest preferences, Heterogeneous Hypergraph, Co-guided Learning, SIGIR2022

This is our implementation for the paper: Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation Xiaokun Zhang, Bo

Xiaokun Zhang 27 Dec 02, 2022
Attention-guided gan for synthesizing IR images

SI-AGAN Attention-guided gan for synthesizing IR images This repository contains the Tensorflow code for "Pedestrian Gender Recognition by Style Trans

1 Oct 25, 2021
A repo with study material, exercises, examples, etc for Devnet SPAUTO

MPLS in the SDN Era -- DevNet SPAUTO Get right to the study material: Checkout the Wiki! A lab topology based on MPLS in the SDN era book used for 30

Hugo Tinoco 67 Nov 16, 2022
A PyTorch implementation of "Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning", IJCAI-21

MERIT A PyTorch implementation of our IJCAI-21 paper Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning. Depen

Graph Analysis & Deep Learning Laboratory, GRAND 32 Jan 02, 2023
VOneNet: CNNs with a Primary Visual Cortex Front-End

VOneNet: CNNs with a Primary Visual Cortex Front-End A family of biologically-inspired Convolutional Neural Networks (CNNs). VOneNets have the followi

The DiCarlo Lab at MIT 99 Dec 22, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022
This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding)

HCSC: Hierarchical Contrastive Selective Coding This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive

YUANFAN GUO 111 Dec 20, 2022
PyTorch implementations of deep reinforcement learning algorithms and environments

Deep Reinforcement Learning Algorithms with PyTorch This repository contains PyTorch implementations of deep reinforcement learning algorithms and env

Petros Christodoulou 4.7k Jan 04, 2023
Explainability for Vision Transformers (in PyTorch)

Explainability for Vision Transformers (in PyTorch) This repository implements methods for explainability in Vision Transformers

Jacob Gildenblat 442 Jan 04, 2023