Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling

Related tags

Deep LearningTGraM
Overview

TGraM

Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling,
Qibin He, Xian Sun, Zhiyuan Yan, Beibei Li, Kun Fu

Abstract

Recently, satellite video has become an emerging means of earth observation, providing the possibility of tracking moving objects. However, the existing multi-object trackers are commonly designed for natural scenes without considering the characteristics of remotely sensed data. In addition, most trackers are composed of two independent stages of detection and re-identification (ReID), which means that they cannot be mutually promoted. To this end, we propose an end-to-end online framework, which is called TGraM, for multi-object tracking in satellite videos. It models multi-object tracking as a graph information reasoning procedure from the multi-task learning perspective. Specifically, a graph-based spatiotemporal reasoning module is presented to mine the potential high-order correlations between video frames. Furthermore, considering the inconsistency of optimization objectives between detection and ReID, a multi-task gradient adversarial learning strategy is designed to regularize each task-specific network. Additionally, aiming at the data scarcity in this field, a large-scale and high-resolution Jilin1 satellite video dataset for multi-object tracking (AIR-MOT) is built for the experiments. Compared with state-of-the-art multi-object trackers, TGraM achieves efficient collaborative learning between detection and ReID, improving the tracking accuracy by 1.2 MOTA.

Paper

Please cite our paper if you find the code or dataset useful for your research.

@ARTICLE{He-TGRS-TGraM-2022,
  author={Q. {He} and X. {Sun} and Z. {Yan} and B. {Li} and K. {Fu}},
  journal={IEEE Transactions on Geoscience and Remote Sensing}, 
  title={Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling}, 
  year={2022},
  volume={},
  number={},
  pages={1-14},
  doi={}}

Installation

  • Clone this repo, and we'll call the directory that you cloned as ${TGRAM_ROOT}
  • Install dependencies. We use python 3.7 and pytorch >= 1.2.0
conda create -n TGraM
conda activate TGraM
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
cd ${TGRAM_ROOT}
pip install -r requirements.txt
  • We use DCNv2 in our backbone network and more details can be found in their repo.
git clone https://github.com/CharlesShang/DCNv2
cd DCNv2
./make.sh
  • In order to run the code for demos, you also need to install ffmpeg.

Data preparation

AIR-MOT
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train(empty)

Then, you can change the seq_root and label_root in src/gen_labels_airmot.py and run:

cd src
python gen_labels_airmot.py

to generate the labels of AIR-MOT.

Training

  • Download the training data
  • Change the dataset root directory 'root' in src/lib/cfg/data.json and 'data_dir' in src/lib/opts.py
  • Train on AIR-MOT:
sh experiments/airmot.sh

Tracking

  • The default settings run tracking on the testing dataset from AIR-MOT. Using the trained model, you can run:
cd src
CUDA_VISIBLE_DEVICES=0 python track_half_air.py mot --load_model ../exp/airmot/210529_airmot_tgrammbseg/model_last.pth --conf_thres 0.4 --val_mot17 True --gpus 5 --data_dir '/workspace/tgram/src/data/' --arch tgrammbseg  --num_frames 3 --num_workers 2 --output_dir '/workspace/tgram/result/' --save_images --down_ratio 4 --exp_name 210526_tgrammbseg_cam

to obtain the tracking results. You can also set save_images=True in src/track.py to save the visualization results of each frame.

Train on custom dataset

You can train TGraM on custom dataset by following several steps bellow:

  1. Generate one txt label file for one image. Each line of the txt label file represents one object. The format of the line is: "class id x_center/img_width y_center/img_height w/img_width h/img_height". You can modify src/gen_labels_16.py to generate label files for your custom dataset.
  2. Generate files containing image paths. The example files are in src/data/. Some similar code can be found in src/gen_labels_crowd.py
  3. Create a json file for your custom dataset in src/lib/cfg/. You need to specify the "root" and "train" keys in the json file. You can find some examples in src/lib/cfg/.
  4. Add --data_cfg '../src/lib/cfg/your_dataset.json' when training.

Acknowledgement

A large part of the code is borrowed from Zhongdao/Towards-Realtime-MOT and xingyizhou/CenterNet. Thanks for their wonderful works.

Owner
Qibin He
Qibin He
CSAC - Collaborative Semantic Aggregation and Calibration for Separated Domain Generalization

CSAC Introduction This repository contains the implementation code for paper: Co

ScottYuan 5 Jul 22, 2022
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result

Omar Laham 1 Jan 14, 2022
The Official PyTorch Implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 spotlight paper)

Official PyTorch implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 Spotlight Paper) Zhisheng

NVIDIA Research Projects 45 Dec 26, 2022
Official Implementation of LARGE: Latent-Based Regression through GAN Semantics

LARGE: Latent-Based Regression through GAN Semantics [Project Website] [Google Colab] [Paper] LARGE: Latent-Based Regression through GAN Semantics Yot

83 Dec 06, 2022
Code for "Hierarchical Skills for Efficient Exploration" HSD-3 Algorithm and Baselines

Hierarchical Skills for Efficient Exploration This is the source code release for the paper Hierarchical Skills for Efficient Exploration. It contains

Facebook Research 38 Dec 06, 2022
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dea

MIC-DKFZ 1.2k Jan 04, 2023
A high-performance anchor-free YOLO. Exceeding yolov3~v5 with ONNX, TensorRT, NCNN, and Openvino supported.

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our rep

7.7k Jan 06, 2023
PyTorch implemention of ICCV'21 paper SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation

SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation This is the PyTorch implemention of ICCV'21 paper SGPA: Structure

Chen Kai 24 Dec 05, 2022
Learning from Synthetic Humans, CVPR 2017

Learning from Synthetic Humans (SURREAL) Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev and Cordelia Schmid,

Gul Varol 538 Dec 18, 2022
GBIM(Gesture-Based Interaction map)

手势交互地图 GBIM(Gesture-Based Interaction map),基于视觉深度神经网络的交互地图,通过电脑摄像头观察使用者的手势变化,进而控制地图进行简单的交互。网络使用PaddleX提供的轻量级模型PPYOLO Tiny以及MobileNet V3 small,使得整个模型大小约10MB左右,即使在CPU下也能快速定位和识别手势。

8 Feb 10, 2022
Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021.

NL-CSNet-Pytorch Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021. Note: this repo only shows the strategy of

WenxueCui 7 Nov 07, 2022
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral

Temporally Efficient Vision Transformer for Video Instance Segmentation Temporally Efficient Vision Transformer for Video Instance Segmentation (CVPR

Hust Visual Learning Team 203 Dec 31, 2022
Implementation of ICLR 2020 paper "Revisiting Self-Training for Neural Sequence Generation"

Self-Training for Neural Sequence Generation This repo includes instructions for running noisy self-training algorithms from the following paper: Revi

Junxian He 45 Dec 31, 2022
Library extending Jupyter notebooks to integrate with Apache TinkerPop and RDF SPARQL.

Graph Notebook: easily query and visualize graphs The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Us

Amazon Web Services 501 Dec 28, 2022
Equivariant Imaging: Learning Beyond the Range Space

[Project] Equivariant Imaging: Learning Beyond the Range Space Project about the

Georges Le Bellier 3 Feb 06, 2022
NeurIPS 2021 paper 'Representation Learning on Spatial Networks' code

Representation Learning on Spatial Networks This repository is the official implementation of Representation Learning on Spatial Networks. Training Ex

13 Dec 29, 2022
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation [Project] [Paper] [arXiv] [Home] Official implementation of FastFCN:

Wu Huikai 815 Dec 29, 2022
An index of recommendation algorithms that are based on Graph Neural Networks.

An index of recommendation algorithms that are based on Graph Neural Networks.

FIB LAB, Tsinghua University 564 Jan 07, 2023
Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling Code for the paper: Greg Ver Steeg and Aram Galstyan. "Hamiltonian Dynamics with N

Greg Ver Steeg 25 Mar 14, 2022