
Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li

Accepted by CVPR 2021 (Oral). [Paper Link]

This repository includes the Python (PyTorch) implementation of the TrDiMP and TrSiam trackers presented at CVPR 2021.

Abstract

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them via a transformer architecture for robust object tracking. Different from classic usage of the transformer in natural language processing tasks, we separate its encoder and decoder into two parallel branches and carefully design them within the Siamese-like tracking pipelines. The transformer encoder promotes the target templates via attention-based feature reinforcement, which benefits the high-quality tracking model generation. The transformer decoder propagates the tracking cues from previous templates to the current frame, which facilitates the object searching process. Our transformer-assisted tracking framework is neat and trained in an end-to-end manner. With the proposed transformer, a simple Siamese matching approach is able to outperform the current top-performing trackers. By combining our transformer with the recent discriminative tracking pipeline, our method sets several new state-of-the-art records on prevalent tracking benchmarks.
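
To make the two-branch design concrete, the following is a minimal PyTorch sketch of the encoder/decoder split, written from the paper's description (lightweight single-head attention, no fully-connected feed-forward layers). All names and shapes are illustrative assumptions, not the authors' code; the actual implementation is in the ltr folder of this repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SingleHeadAttention(nn.Module):
    """Slimmed attention block: one head, no feed-forward layers (illustrative)."""
    def __init__(self, dim, key_dim):
        super().__init__()
        # One shared linear map produces the reduced-dimension query/key.
        self.qk_proj = nn.Linear(dim, key_dim, bias=False)

    def forward(self, query, key, value):
        # query: (N_q, B, C); key/value: (N_k, B, C)
        q = F.normalize(self.qk_proj(query), dim=-1)
        k = F.normalize(self.qk_proj(key), dim=-1)
        attn = torch.softmax(torch.einsum('qbc,kbc->bqk', q, k), dim=-1)
        out = torch.einsum('bqk,kbc->qbc', attn, value)
        return query + out  # residual connection

class TrackingTransformer(nn.Module):
    """Encoder reinforces the template features; decoder propagates
    tracking cues from the templates to the search-region features."""
    def __init__(self, dim=512, key_dim=64):
        super().__init__()
        self.encoder = SingleHeadAttention(dim, key_dim)
        self.decoder = SingleHeadAttention(dim, key_dim)

    def forward(self, template_feat, search_feat):
        enc = self.encoder(template_feat, template_feat, template_feat)
        dec = self.decoder(search_feat, enc, enc)
        return enc, dec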

Tracking Results and Pretrained Model

Tracking results: the raw results of TrDiMP/TrSiam on 7 benchmarks (OTB, UAV, NFS, VOT2018, GOT-10k, TrackingNet, and LaSOT) can be found here.

Pretrained model: please download the TrDiMP model and put it in the pytracking/networks folder.
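
After downloading, the expected layout is roughly as follows (the checkpoint filename shown here is illustrative; keep the name of the file you downloaded):

TransformerTrack/
  pytracking/
    networks/
      trdimp_net.pth.tar   (illustrative filename)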

TrDiMP and TrSiam share the same model. The main difference between TrDiMP and TrSiam lies in the tracking model generation. TrSiam does not utilize the background information and simply crops the target/foreground area to generate the tracking model, which can be regarded as the initialization step of TrDiMP.
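
As a rough illustration of this difference, consider the following hypothetical sketch; the function and the optimization loop are stand-ins written for this README, not the repository's API:

import torch

def generate_tracking_model(template_feat, target_mask, mode='trsiam',
                            steps=5, lr=0.1):
    # template_feat: (B, C, H, W) transformer-enhanced template features
    # target_mask:   (B, 1, H, W) foreground/target mask
    model = template_feat * target_mask  # crop the target/foreground area
    if mode == 'trsiam':
        # TrSiam stops here; this also serves as TrDiMP's initialization.
        return model
    # TrDiMP: refine the initial model with a few discriminative steps
    # that also see the background region (a toy stand-in for the DiMP
    # steepest-descent model predictor).
    model = model.detach().requires_grad_(True)
    for _ in range(steps):
        score = (model * template_feat).sum(dim=1, keepdim=True)
        loss = ((score - target_mask) ** 2).mean()
        grad, = torch.autograd.grad(loss, model)
        model = (model - lr * grad).detach().requires_grad_(True)
    return model.detach()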

Environment Setup

Clone the git repository.

git clone https://github.com/594422814/TransformerTrack.git

Clone the submodules.

In the repository directory, run the command:

git submodule update --init  

Install dependencies

Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here pytracking).

bash install.sh conda_install_path pytracking

This script will also download the default networks and set up the environment.

Note: The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the detailed installation instructions.
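
As a quick, generic sanity check (this step is not part of install.sh), activate the created environment and confirm that PyTorch sees the GPU:

conda activate pytracking
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"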

Our code is based on the PyTracking framework. For more details, please refer to PyTracking.

Training the TrDiMP/TrSiam Model

Please refer to the README in the ltr folder.
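
For reference, training is launched from the ltr folder with a command of the following form (as also seen in the issues below; consult the ltr README for the exact settings):

python run_training.py dimp transformer_dimp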

Testing the TrDiMP/TrSiam Tracker

Please refer to the README in the pytracking folder. As shown in pytracking/README.md, you can use either the PyTracking toolkit or the GOT-10k toolkit to reproduce the tracking results.
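
For example, with the PyTracking toolkit a single sequence can be evaluated from the pytracking folder:

python run_tracker.py trdimp trdimp --sequence Biker

For the GOT-10k toolkit, a minimal wrapper sketch in the spirit of the repository's GOT10k_GOT.py/GOT10k_VOT.py scripts looks roughly like this (the class name is hypothetical, and constructing the underlying tracker is omitted):

from got10k.trackers import Tracker
from got10k.experiments import ExperimentGOT10k

class GOTTrDiMP(Tracker):
    def __init__(self, pytracking_tracker):
        super().__init__(name='TrDiMP')
        # A constructed TrDiMP/TrSiam tracker from this repository.
        self.tracker = pytracking_tracker

    def init(self, image, box):
        # PyTracking trackers read the initial box from info['init_bbox'].
        self.tracker.initialize(image, {'init_bbox': box})

    def update(self, image):
        out = self.tracker.track(image)
        return out['target_bbox']

# experiment = ExperimentGOT10k(root_dir='path/to/GOT-10k', subset='test')
# experiment.run(GOTTrDiMP(my_tracker), visualize=False)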

Citation

If you find this work useful for your research, please consider citing our work:

@inproceedings{Wang_2021_Transformer,
    title={Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking},
    author={Wang, Ning and Zhou, Wengang and Wang, Jie and Li, Houqiang},
    booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Acknowledgment

Our transformer-assisted tracker is based on PyTracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing this great framework.

Contact

If you have any questions, please feel free to contact [email protected]

Comments
  • some questions on testing VOT18

    When I use GOT10K_VOT.py to test VOT18, errors occur in ./pytracking/tracker/trdimp.py; it seems some parameters do not match the VOT18 test. When I use GOT10K_GOT.py to test GOT-10k, these errors do not happen, because ./pytracking/tracker/trdimp_for_GOT.py works. Do you have a specialized trdimp_for_VOT.py for testing VOT18? Your prompt attention to my question would be highly appreciated!

    opened by yuyudiandian 4
  • Difference between TrSiam and TrDiMP

    Hi,

    Thanks for your good work!

    When I was reading your paper and code, I found that during the training phase there is no difference between TrSiam and TrDiMP. I want to know if the difference between these two algorithms only occurs during the tracking phase. For TrSiam, you also used the DCF during the training phase; is this true?

    Thanks

    opened by YanWanquan 3
  • Training

    In your code I found that you import Lasot, Got10k, TrackingNet and MSCOCOSeq. I want to know if you used all these datasets for training.

    In your paper, you said the batch size is 36 image pairs and there are 1500 iterations per epoch. However, 1500 * 36 = 54,000, which is much smaller than the number of images in the COCO dataset. I thought batch_size * iterations = dataset_size. I am new to the tracking field and just a little confused about this. Thanks.

    opened by YanWanquan 3
  • what is your hardware for training?

    What is your hardware for training? Super_dimp uses one TITAN X for training, and your training setting is slightly different; I wonder what the reason is.

    opened by Lightning980729 3
  • No matching checkpoint file found

    I run with: python run_tracker.py trdimp trdimp --sequence Biker

    The reported error is: "No matching checkpoint file found"

    How should I fix this, or how should I set the checkpoint file?

    Thanks a lot.

    opened by gyc9709 3
  • some questions on testing

    I made some changes to the structure of the code, but when testing GOT or OTB, a "CUDA memory is not enough" error occurs. The same problem does not happen when testing VOT18. Do you have any suggestions on this, or a way to reduce CUDA memory usage when testing a dataset? Thank you.

    opened by yuyudiandian 2
  • no matching checkpoint file found

    Hi, when I want to run python run_training dimp transformer_dimp, it always shows "no matching checkpoint file found". Does this model need a checkpoint file before training? If yes, where can I download this file?

    When execution reaches line 52 of ltr_trainer.py (for i, data in enumerate(loader, 1)), it shows an out-of-memory error. My computer has 16 GB of RAM and my GPU has 32 GB of memory, so I think that is large enough to run the model. Can you give any suggestions about this problem? Thanks.

    opened by YanWanquan 2
  • Question about FFN

    Thanks for your work! I noticed that in the paper you claimed that "To achieve a good balance of speed and performance, we slim the classic transformer by omitting the fully-connected feed-forward layers and maintaining the lightweight single-head attention", and it seems that you also do not use the FFN layer in your code. I'm wondering what the performance would be, regardless of speed, if you used the classic transformer including the FFN?

    opened by 3bobo 2
  • Error when using the GOT-10k testing toolkit

    segmentation_dir: /trdimp/trdimp_vot
    Files already downloaded.
    Running tracker GOT_Tracker on VOT...
    Running supervised experiment...
    --Sequence 1/60: ants1
     Repetition: 1
    Traceback (most recent call last):
      File "GOT10k_VOT.py", line 45, in <module>
        experiment.run(tracker, visualize=False)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 71, in run
        self.run_supervised(tracker, visualize)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 126, in run_supervised
        tracker.init(frame, anno_rects[0])
      File "GOT10k_VOT.py", line 32, in init
        self.tracker.initialize(image, box)
      File "../pytracking/tracker/trdimp/trdimp.py", line 55, in initialize
        state = info['init_bbox']
    IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

    It looks like a version-related error? Have you encountered this before, or do you have any ideas on how to fix it?

    opened by 1145284121 1
  • Training and testing run without errors, but the program stops

    Hello, while using your code I used my own data for training and testing. During training or testing, the following output appears; no error is reported, but the program stops running.

    /home/miniconda3/envs/pytracking/bin/python /home/TransformerTrack-main/pytracking/run_tracker.py trdimp trsiam --dataset eotb --sequence val
    Evaluating    1 trackers on    31 sequences
    Tracker: trdimp trsiam None ,  Sequence: test_seq_474
    /home/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/functional.py:3060: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      warnings.warn("Default upsampling behavior when mode={} is changed "
    Using /home/.cache/torch_extensions as PyTorch extensions root...
    

    Due to the mechanism of the DiMP code, if I press Ctrl+C during training, the code restarts and then runs normally, but this does not work during testing. What could be the reason?

    opened by Jee-King 1
  • Some questions about the training setting

    Some questions about your baseline: it seems that you use the same parameters as SuperDiMP rather than DiMP. Built on SuperDiMP, the method does achieve the performance described in your paper, but the performance based on DiMP is doubtful. However, your paper is elaborated from the perspective of temporal features, which is novel and great.

    opened by yxxxqqq 1
  • Question about source code of Transformer part

    As illustrated in the paper, the encoder is used only to enhance the template feature (train_feat in DiMP), and the decoder is used to produce the decoded search feature. But in the transformer's forward function, the decoder is also applied to train_feat, which is not described in the paper. Could you please explain why?

    opened by ChuzzZz 0
  • linear transformation for key and query

    I noticed that the linear transformations used to reduce the dimensions of the key and query share the same weights. Is this for computational reasons, or does using different weights degrade performance?

    opened by ChuzzZz 0
  • Problems in the training process

    Hi, when I run python run_training dimp transformer_dimp, it shows "No matching checkpoint file found" and "ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])". The batch_size is set to 6; I don't know why it changes to 1 during training. Do you have any solutions to this problem? Thanks very much!

    The error message reported during the training process is shown below.

    Training: dimp transformer_dimp
    No matching checkpoint file found
    Using /tmp/torch_extensions as PyTorch extensions root...
    Using /tmp/torch_extensions as PyTorch extensions root...
    Detected CUDA files, patching ldflags
    Emitting ninja build file /tmp/torch_extensions/_prroi_pooling/build.ninja...
    Building extension module _prroi_pooling...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    Using /tmp/torch_extensions as PyTorch extensions root...
    No modifications detected for re-loaded extension module _prroi_pooling, skipping build step...
    Loading extension module _prroi_pooling...
    ninja: no work to do.
    Loading extension module _prroi_pooling...
    Loading extension module _prroi_pooling...
    Training crashed at epoch 1
    Traceback for the error!
    Traceback (most recent call last):
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/base_trainer.py", line 70, in train
        self.train_epoch()
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
        self.cycle_dataset(loader)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset
        loss, stats = self.actor(data)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/actors/tracking.py", line 97, in __call__
        test_proposals=data['test_proposals'])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
        output.reraise()
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in replica 0 on device 0.
    Original Traceback (most recent call last):
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
        output = module(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/tracking/dimpnet.py", line 75, in forward
        iou_pred = self.bb_regressor(train_feat_iou, test_feat_iou, train_bb, test_proposals)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 86, in forward
        modulation = self.get_modulation(feat1, bb1)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 162, in get_modulation
        fc3_r = self.fc3_1r(roi3r)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
        input = module(input)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward
        exponential_average_factor, self.eps)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1919, in batch_norm
        _verify_batch_size(input.size())
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1902, in _verify_batch_size
        raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
    ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

    opened by Chenlulu1993 2