git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li

Accepted by CVPR 2021 (Oral). [Paper Link]

This repository includes Python (PyTorch) implementation of the TrDiMP and TrSiam trackers, to appear in CVPR 2021.

Abstract

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them via a transformer architecture for robust object tracking. Different from classic usage of the transformer in natural language processing tasks, we separate its encoder and decoder into two parallel branches and carefully design them within the Siamese-like tracking pipelines. The transformer encoder promotes the target templates via attention-based feature reinforcement, which benefits the high-quality tracking model generation. The transformer decoder propagates the tracking cues from previous templates to the current frame, which facilitates the object searching process. Our transformer-assisted tracking framework is neat and trained in an end-to-end manner. With the proposed transformer, a simple Siamese matching approach is able to outperform the current top-performing trackers. By combining our transformer with the recent discriminative tracking pipeline, our method sets several new state-of-the-art records on prevalent tracking benchmarks.

Tracking Results and Pretrained Model

Tracking results: the raw results of TrDiMP/TrSiam on 7 benchmarks including OTB, UAV, NFS, VOT2018, GOT-10k, TrackingNet, and LaSOT can be found here.

Pretrained model: please download the TrDiMP model and put it in the pytracking/networks folder.

TrDiMP and TrSiam share the same model. The main difference between TrDiMP and TrSiam lies in the tracking model generation. TrSiam does not utilize the background information and simply crops the target/foreground area to generate the tracking model, which can be regarded as the initialization step of TrDiMP.

Environment Setup

Clone the GIT repository.

git clone https://github.com/594422814/TransformerTrack.git

Clone the submodules.

In the repository directory, run the commands:

git submodule update --init  

Install dependencies

Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here pytracking).

bash install.sh conda_install_path pytracking

This script will also download the default networks and set-up the environment.

Note: The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the detailed installation instructions.

Our code is based on the PyTracking framework. For more details, please refer to PyTracking.

Training the TrDiMP/TrSiam Model

Please refer to the README in the ltr folder.

Testing the TrDiMP/TrSiam Tracker

Please refer to the README in the pytracking folder. As shown in pytracking/README.md, you can either use this PyTracking toolkit or GOT-10k toolkit to reproduce the tracking results.

Citation

If you find this work useful for your research, please consider citing our work:

@inproceedings{Wang_2021_Transformer,
    title={Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking},
    author={Wang, Ning and Zhou, Wengang and Wang, Jie and Li, Houqiang},
    booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Acknowledgment

Our transformer-assisted tracker is based on PyTracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing this great framework.

Contact

If you have any questions, please feel free to contact [email protected]

Comments
  • some qusetions on testing VOT18

    some qusetions on testing VOT18

    when i use GOT10K_VOT.py to test VOT18, some errors will happen in the ./pytracking/tracker/trdimp.py, it seems some parameters do not match the test of vot18. But i use GOT10K_GOT.py to test GOT_10k, this errors didn't happen, because ./pytracking/tracker/trdimp_for_GOT.py works. Do you have specialied trdimp_for_VOT.py that uses on testing VOT18? Your prompt attention to my question would be highly appreciated!

    opened by yuyudiandian 4
  • Difference between TrSiam and TrDiMP

    Difference between TrSiam and TrDiMP

    Hi,

    Thanks for your good work!

    When I was reading your paper and your code I found that during the training phase, there is no difference between TrSiam and TrDimp. I want to to know if the difference between thiese two algorithms only occurs during the tracking phase? For TrSiam, you also used the DCF during the training phase. Is this true?

    Thanks

    opened by YanWanquan 3
  • Training

    Training

    In your code I found that you import Lasot, Got10k, TrackingNet and MSCOCOSeq. I want to know if you used all these datasets for training.

    In your paper, you said the batch size you used is 36 image pairs and there is total 1500 iterations per epoch. However, 1500*36=1,8000 which is much smaller thant the number of images in CoCo dataset. I remember batch_size * iterations = dataset_sizse. I am a newer in the tracking field, and just a little confused about this. Thanks.

    opened by YanWanquan 3
  • what is your hardware for trainning?

    what is your hardware for trainning?

    what is your hardware for trainning? Super_dimp use one TATIAN X for training. And your traning setting is slight different, I wonder what's the reason。

    opened by Lightning980729 3
  • No matching checkpoint file found

    No matching checkpoint file found

    I run with: python run_tracker.py trdimp trdimp --sequence Biker

    The reported error is : "No matching checkpoint file found"

    How should I repair this? or How should I set the checkpoint file?

    Thanks a lot.

    opened by gyc9709 3
  • some questions on testing

    some questions on testing

    I did some changes on the structure of code,but when testing GOT or OTB ,it will happen “cuda memory is not enough ”。Same question not happened on testing VOT18,so do you have some suggestions on this question?Or how to reduce the the cuda memory on testing dataset?Thank you.

    opened by yuyudiandian 2
  • no matching checkpoint file found

    no matching checkpoint file found

    Hi, when I want to run python run_training dimp transformer_dimp, it always shows no matching checkpoint file found. Does this model need a checkpoint file before training? If yes, where can I download this file?

    When I run to the 52 line of ltr_trainer.py file: for i, data in enumerate(loader,1), it shows an error: out of memory. My computer has 16G memory, and my GPU has 32G memory, so I think it is large enough to run the model. Can you give any suggestions about this problem? Thanks.

    opened by YanWanquan 2
  • Question about FFN

    Question about FFN

    Thanks for your work! I noticed that in the paper, you claimed that "To achieve a good balance of speed and performance, we slim the classic transformer by omitting the fully-connected feed-forward layers and maintaining the lightweight single-head attention", and it seemed that you also do not use the FFN layer in your code. I'm wondering how about the performance regardless of speed if you use the classic transformer including FFN ?

    opened by 3bobo 2
  • 使用got10k的测试工具报错

    使用got10k的测试工具报错

    image segmentation_dir: /trdimp/trdimp_vot Files already downloaded. Running tracker GOT_Tracker on VOT... Running supervised experiment... --Sequence 1/60: ants1 Repetition: 1 Traceback (most recent call last): File "GOT10k_VOT.py", line 45, in experiment.run(tracker, visualize=False) File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 71, in run self.run_supervised(tracker, visualize) File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 126, in run_supervised tracker.init(frame, anno_rects[0]) File "GOT10k_VOT.py", line 32, in init self.tracker.initialize(image, box) File "../pytracking/tracker/trdimp/trdimp.py", line 55, in initialize state = info['init_bbox'] IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

    看上去是版本导致的错误? 请问作者有遇到过或者有修改的思路吗

    opened by 1145284121 1
  • 训练和测试运行不报错,程序停止了

    训练和测试运行不报错,程序停止了

    您好,在使用您的代码过程中,我使用了自己的数据作为训练和测试。但是在训练或者测试过程中出现如下提示,但是不报错,程序也不在运行了。

    /home/miniconda3/envs/pytracking/bin/python /home/TransformerTrack-main/pytracking/run_tracker.py trdimp trsiam --dataset eotb --sequence val
    Evaluating    1 trackers on    31 sequences
    Tracker: trdimp trsiam None ,  Sequence: test_seq_474
    /home/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/functional.py:3060: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      warnings.warn("Default upsampling behavior when mode={} is changed "
    Using /home/.cache/torch_extensions as PyTorch extensions root...
    

    因为DIMP代码机制的问题,在训练的时候,如果ctrl+c代码则会重新运行,且运行正常。但是测试的时候不会。请问这是什么原因呢?

    opened by Jee-King 1
  • Some questions about train setting?

    Some questions about train setting?

    Some questions about your baseline? It seems that you use the same parameter of SuperDiMP, but not DiMP. Based on the SuperDiMP does achieves the performance described in your paper, but the performance of DiMP is doubtful. However, your paper is elaborated from the perspective of temporal feature, which is novel and great.

    opened by yxxxqqq 1
  • Question about source code of Transformer part

    Question about source code of Transformer part

    As illustrated in paper, Encoder is used to ehanced template feature(train_feat in dimp) only, and Decoder is used to produce decoded search feature. But in the transformer's forward function, decoder is also used on train_feat, which is not described in the paper. Could you please explain why?

    opened by ChuzzZz 0
  • linear transformation for key and query

    linear transformation for key and query

    I noticed that linear transformations used to reduce the dimension of key and query share same weight. Is it out of computaition consideration or used different weight degrades performance?

    opened by ChuzzZz 0
  • Problems in Training process

    Problems in Training process

    Hi, when I run python run_training dimp transformer_dimp, it shows ‘No matching checkpoint file found’ and ‘ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])’. The batch_size is set to 6. I don’t know why it change to 1 in the training process. Do you have any solutions to this problem? Thanks very much!

    The error message reported during the training process is shown below.

    _Training: dimp transformer_dimp No matching checkpoint file found Using /tmp/torch_extensions as PyTorch extensions root...Using /tmp/torch_extensions as PyTorch extensions root...

    Detected CUDA files, patching ldflags Emitting ninja build file /tmp/torch_extensions/_prroi_pooling/build.ninja... Building extension module _prroi_pooling... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) Using /tmp/torch_extensions as PyTorch extensions root... No modifications detected for re-loaded extension module _prroi_pooling, skipping build step... Loading extension module _prroi_pooling... ninja: no work to do. Loading extension module _prroi_pooling... Loading extension module _prroi_pooling... Training crashed at epoch 1 Traceback for the error! Traceback (most recent call last): File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/base_trainer.py", line 70, in train self.train_epoch() File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 80, in train_epoch self.cycle_dataset(loader) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset loss, stats = self.actor(data) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/actors/tracking.py", line 97, in call test_proposals=data['test_proposals']) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward outputs = self.parallel_apply(replicas, inputs, kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)]) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply output.reraise() File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise raise self.exc_type(msg) ValueError: Caught ValueError in replica 0 on device 0. Original Traceback (most recent call last): File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker output = module(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/tracking/dimpnet.py", line 75, in forward iou_pred = self.bb_regressor(train_feat_iou, test_feat_iou, train_bb, test_proposals) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 86, in forward modulation = self.get_modulation(feat1, bb1) File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 162, in get_modulation fc3_r = self.fc3_1r(roi3r) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward input = module(input) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in call result = self.forward(*input, **kwargs) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward exponential_average_factor, self.eps) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1919, in batch_norm _verify_batch_size(input.size()) File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1902, in verify_batch_size raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size)) ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

    opened by Chenlulu1993 2
Owner
NingWang
PhD student in University of Science and Technology of China (USTC)
NingWang
Code release for "BoxeR: Box-Attention for 2D and 3D Transformers"

BoxeR By Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees Snoek. This repository is an official implementation of the paper BoxeR: Box-A

Nguyen Duy Kien 111 Dec 07, 2022
Highly comparative time-series analysis

〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu

Ben Fulcher 569 Dec 21, 2022
Find-Lane-Line - Use openCV library and Python to detect the road-lane-line

Find-Lane-Line This project is to use openCV library and Python to detect the road-lane-line. Data Pipeline Step one : Color Selection Step two : Cann

Kenny Cheng 3 Aug 17, 2022
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight) Abstract Due to the limited and even imbalanced dat

Hanzhe Hu 99 Dec 12, 2022
Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes (CVPR2021)

RSCD (BS-RSCD & JCD) Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes (CVPR2021) by Zhihang Zhong, Yinqiang Zheng, Imari Sato We co

81 Dec 15, 2022
[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

CTR-GCN This repo is the official implementation for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition. The pap

Yuxin Chen 148 Dec 16, 2022
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep

268 Jan 09, 2023
A PyTorch library and evaluation platform for end-to-end compression research

CompressAI CompressAI (compress-ay) is a PyTorch library and evaluation platform for end-to-end compression research. CompressAI currently provides: c

InterDigital 680 Jan 06, 2023
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023
audioLIME: Listenable Explanations Using Source Separation

audioLIME This repository contains the Python package audioLIME, a tool for creating listenable explanations for machine learning models in music info

Institute of Computational Perception 27 Dec 01, 2022
An offline deep reinforcement learning library

d3rlpy: An offline deep reinforcement learning library d3rlpy is an offline deep reinforcement learning library for practitioners and researchers. imp

Takuma Seno 817 Jan 02, 2023
一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.

AgentOCR 简介 AgentOCR 是一个基于 PaddleOCR 和 ONNXRuntime 项目开发的一个使用简单、调用方便的 OCR 项目 本项目目前包含 Python Package 【AgentOCR】 和 OCR 标注软件 【AgentOCRLabeling】 使用指南 Pytho

AgentMaker 98 Nov 10, 2022
Tensorflow implementation of DeepLabv2

TF-deeplab This is a Tensorflow implementation of DeepLab, compatible with Tensorflow 1.2.1. Currently it supports both training and testing the ResNe

Chenxi Liu 21 Sep 27, 2022
Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

arXiv technical report soon available. we are updating the readme to be as comprehensive as possible Please ask any questions in Issues, thanks. Intro

Zhihan 31 Dec 30, 2022
The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

The Ludii General Game System Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). This repository h

Digital Ludeme Project 50 Jan 04, 2023
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.

Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me

Yufei Wang 56 Dec 28, 2022
PyTorch implementation of Pointnet2/Pointnet++

Pointnet2/Pointnet++ PyTorch Project Status: Unmaintained. Due to finite time, I have no plans to update this code and I will not be responding to iss

Erik Wijmans 1.2k Dec 29, 2022
What can linearized neural networks actually say about generalization?

What can linearized neural networks actually say about generalization? This is the source code to reproduce the experiments of the NeurIPS 2021 paper

gortizji 11 Dec 09, 2022
ReSSL: Relational Self-Supervised Learning with Weak Augmentation

ReSSL: Relational Self-Supervised Learning with Weak Augmentation This repository contains PyTorch evaluation code, training code and pretrained model

mingkai 45 Oct 25, 2022
Acute ischemic stroke dataset

AISD Acute ischemic stroke dataset contains 397 Non-Contrast-enhanced CT (NCCT) scans of acute ischemic stroke with the interval from symptom onset to

Kongming Liang 21 Sep 06, 2022