Official code for 'Robust Siamese Object Tracking for Unmanned Aerial Manipulator' and official introduction to the UAMT100 benchmark

Last update: Dec 18, 2022

Overview

SiamSA: Robust Siamese Object Tracking for Unmanned Aerial Manipulator

Demo video

📹 Our video on YouTube and bilibili demonstrates the evaluation of SiamSA and four other state-of-the-art trackers on the UAV123@10fps and UAMT100 benchmarks.

📹 Real-world tests of SiamSA on a flying UAM platform, shown from first-person and third-person perspectives, are also included.

UAMT100 benchmark

The UAMT100 benchmark consists of 100 image sequences captured from UAM perspectives. It covers a range of situations in which a UAM tracks an object in an indoor environment, as required by downstream tasks such as grasping.

Sixteen kinds of objects are involved, and 11 attributes are annotated for each sequence. The figure shows four UAM tracking scenarios in UAMT100, and its histogram summarizes the attribute statistics of the benchmark.
For more details, please refer to the benchmark website, which will be released soon.
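As a rough illustration of how such an attribute histogram can be computed, the sketch below counts how many sequences carry each attribute. The sequence names and attribute labels are hypothetical placeholders, not the actual UAMT100 annotations, and the dictionary layout is an assumption made for this example.

```python
from collections import Counter


def attribute_histogram(annotations):
    """Count how many sequences carry each attribute.

    `annotations` maps a sequence name to the list of attribute labels
    annotated for that sequence (an assumed layout, for illustration).
    """
    counts = Counter()
    for attributes in annotations.values():
        # Use a set so a label repeated within one sequence counts once.
        counts.update(set(attributes))
    return counts


# Hypothetical sequence names and attribute labels, for illustration only.
example = {
    "seq_001": ["attribute_a", "attribute_b"],
    "seq_002": ["attribute_b"],
    "seq_003": ["attribute_a", "attribute_b", "attribute_c"],
}

if __name__ == "__main__":
    for label, count in attribute_histogram(example).most_common():
        print(f"{label}: {count}")
```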

Environment setup

This code has been tested on Ubuntu 18.04, Python 3.8.3, PyTorch 0.7.0/1.6.0, and CUDA 10.2. Please install the required libraries before running the code:

pip install -r requirements.txt
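After installation, it can help to confirm which versions are actually present. The following is a minimal sketch, not part of this repository: the function name `report_environment` and the choice of packages to check are assumptions made for illustration. It reads installed package metadata with the standard library, so it does not import PyTorch itself.

```python
import platform
from importlib import metadata


def report_environment(packages=("torch", "torchvision")):
    """Return a dict of component name -> version string (None if missing)."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            # Reads the installed distribution's metadata without importing it.
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for component, version in report_environment().items():
        print(f"{component}: {version or 'not installed'}")
```

Compare the printed versions with the tested configuration listed above.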

Test

Download the model from Google Drive or BaiduYun (code: v4r0) and put it into the tools/snapshot directory.

Download the testing datasets and put them into the test_dataset directory. To test the tracker on a new dataset, please refer to pysot-toolkit to set up test_dataset.

# --trackername: tracker name
# --dataset:     dataset name
# --snapshot:    model path
python test.py \
    --trackername SiamSA \
    --dataset UAV123_10fps \
    --snapshot snapshot/model.pth

The testing result will be saved in the results/dataset_name/tracker_name directory.
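To inspect saved results programmatically, a sketch along these lines may be useful. It builds the results/dataset_name/tracker_name path described above and parses box lines. The per-line format (comma-separated x, y, width, height, as commonly written by pysot-style test scripts) is an assumption, and both helper names are hypothetical.

```python
from pathlib import Path


def result_dir(dataset_name, tracker_name, root="results"):
    """Build the directory path results/dataset_name/tracker_name."""
    return Path(root) / dataset_name / tracker_name


def parse_boxes(text):
    """Parse one bounding box per line.

    Assumes comma-separated `x,y,w,h` values (an assumption about the
    result file format, not confirmed by this README).
    """
    boxes = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            boxes.append(tuple(float(value) for value in line.split(",")))
    return boxes


if __name__ == "__main__":
    directory = result_dir("UAV123_10fps", "SiamSA")
    # List per-sequence result files, if the directory exists.
    for result_file in sorted(directory.glob("*.txt")):
        boxes = parse_boxes(result_file.read_text())
        print(f"{result_file.name}: {len(boxes)} frames")
```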

We provide our test results on Google Drive and BaiduYun (code: v4r1).

Train

Prepare training datasets

Download the datasets:

VID
YOUTUBEBB (code: t7j8)
COCO
GOT-10K

Note: train_dataset/dataset_name/readme.md lists the detailed steps for generating the training datasets.

Train a model

To train the SiamSA model, run train.py with the desired configs:

cd tools
python train.py

Evaluation

To evaluate the trackers mentioned above, please put their test results into the results directory.

# --tracker_path:   result path
# --dataset:        dataset name
# --tracker_prefix: tracker name
python eval.py \
    --tracker_path ./results \
    --dataset UAV123_10fps \
    --tracker_prefix 'model'
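For readers unfamiliar with overlap-based tracking evaluation, the sketch below shows one common measure: the fraction of frames whose predicted box overlaps the ground truth by more than a threshold. It is an illustrative approximation, not the logic of eval.py. The function names, the `(x, y, w, h)` box format, and the 0.5 threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose IoU exceeds `threshold`."""
    overlaps = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    if not overlaps:
        return 0.0
    return sum(o > threshold for o in overlaps) / len(overlaps)


if __name__ == "__main__":
    # Toy boxes for illustration only, not real tracking results.
    predicted = [(0, 0, 10, 10), (0, 0, 10, 10)]
    ground_truth = [(0, 0, 10, 10), (20, 20, 5, 5)]
    print(success_rate(predicted, ground_truth))
```

Benchmark toolkits usually sweep the threshold over a range and report the area under the resulting curve rather than a single value.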

Contact

If you have any questions, please contact me.

Guangze Zheng

Email: [email protected]

Acknowledgement

The code is implemented based on pysot and SiamAPN. We would like to express our sincere thanks to the contributors.
Besides, we would like to thank Ziang Cao for his advice on the code.
For the UAMT100 benchmark, we appreciate the help of Fuling Lin, Haobo Zuo, and Liangliang Yao.
We would like to thank Kunhan Lu for his advice on TensorRT acceleration.

Owner

Intelligent Vision for Robotics in Complex Environment
