Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

Last update: Dec 21, 2022

Related tags

Admin Panels VidVRD-tracklets

Overview

VidVRD-tracklets

This repository contains codes for Video Visual Relation Detection (VidVRD) tracklets generation based on MEGA and deepSORT. These tracklets are also suitable for ACM MM Visual Relation Understanding (VRU) Grand Challenge (which is base on the VidOR dataset).

If you are only interested in the generated tracklets, you can ignore these codes and download them directly from here

Download generated tracklets directly

We release the object tracklets for VidOR train/validation/test set. You can download the tracklets here, and put them in the following folder as

├── deepSORT
│   ├── ...
│   ├── tracking_results
│   │   ├── VidORtrain_freq1_m60s0.3_part01
│   │   ├── ...
│   │   ├── VidORtrain_freq1_m60s0.3_part14
│   │   ├── VidORval_freq1_m60s0.3
│   │   ├── VidORtest_freq1_m60s0.3
│   │   ├── readme.md
│   │   └── format_demo.py
│   └── ...
├── MEGA
│   ├── ... 
│   └── ...

Please refer to deepSORT/tracking_results/readme.md for more details

Evaluate the tracklets mAP

Run python deepSORT/eval_traj_mAP.py to evaluate the tracklets mAP. (you might need to change some args in deepSORT/eval_traj_mAP.py)

Generate object tracklets by yourself

The object tracklets generation pipeline mainly consists of two parts: MEGA (for video object detection), and deepSORT (for video object tracking).

Quick Start

Install MEGA as the official instructions MEGA/INSTALL.md (Note that the folder path may be different when installing).
- If you have any trouble when installing MEGA, you can try to clone the official MEGA repository and install it, and then replace the official mega.pytorch/mega_core with our modified MEGA/mega_core. Refer to MEGA/modification_details.md for the details of our modifications.
Download the VidOR dataset and the pre-trained weight of MEGA. Put them in the following folder as

├── deepSORT/
│   ├── ...
├── MEGA/
│   ├── ... 
│   ├── datasets/
│   │   ├── COCOdataset/        # used for MEGA training
│   │   ├── COCOinVidOR/        # used for MEGA training
│   │   ├── vidor-dataset/
│   │   │   ├── annotation/
│   │   │   │   ├── training/
│   │   │   │   └── validation/
│   │   │   ├── img_index/ 
│   │   │   │   ├── VidORval_freq1_0024.txt
│   │   │   │   ├── ...
│   │   │   ├── val_frames/
│   │   │   │   ├── 0001_2793806282/
│   │   │   │   │   ├── 000000.JPEG
│   │   │   │   │   ├── ...
│   │   │   │   ├── ...
│   │   │   ├── val_videos/
│   │   │   │   ├── 0001/
│   │   │   │   │   ├── 2793806282.mp4
│   │   │   │   │   ├── ...
│   │   │   │   ├── ...
│   │   │   ├── train_frames/
│   │   │   ├── train_videos/
│   │   │   ├── test_frames/
│   │   │   ├── test_videos/
│   │   │   └── video2img_vidor.py
│   │   └── construct_img_idx.py
│   ├── training_dir/
│   │   ├── COCO34ORfreq32_4gpu/
│   │   │   ├── inference/
│   │   │   │   ├── VidORval_freq1_0024/
│   │   │   │   │   ├── predictions.pth
│   │   │   │   │   └── result.txt
│   │   │   │   ├── ...
│   │   │   └── model_0180000.pth
│   │   ├── ...

Run python MEGA/datasets/vidor-dataset/video2img_vidor.py (note that you may need to change some args) to extract frames from videos (This causes a lot of data redundancy, but we have to do this, because MEGA takes image data as input).
Run python MEGA/datasets/construct_img_idx.py (note that you may need to change some args) to generate the img_index used in MEGA inference.
- The generated .txt files will be saved in MEGA/datasets/vidor-dataset/img_index/. You can use VidORval_freq1_0024.txt as a demo for the following commands.
Run the following command to detect frame-level object proposals with bbox features (RoI pooled features).
```
CUDA_VISIBLE_DEVICES=0   python  \
    MEGA/tools/test_net.py \
    --config-file MEGA/configs/MEGA/inference/VidORval_freq1_0024.yaml \
    MODEL.WEIGHT MEGA/training_dir/COCO34ORfreq32_4gpu/model_0180000.pth \
    OUTPUT_DIR MEGA/training_dir/COCO34ORfreq32_4gpu/inference
```
- The above command will generate a predictions.pth file for this VidORval_freq1_0024 demo. We also release this predictions.pth here.
- the config files for VidOR train set are in MEGA/configs/MEGA/partxx
- The predictions.pth contains frame-level box positions and features (RoI features) for each object. For RoI features, they can be accessed through roifeats = boxlist.get_field("roi_feats"), if you are familiar with MEGA or maskrcnn-benchmark
Run python MEGA/mega_boxfeatures/cvt_proposal_result.py (note that you may need to change some args) to convert predictions.pth to a .pkl file for the following deepSORT stage.
- We also provide VidORval_freq1_0024.pkl here
Run python deepSORT/deepSORT_tracking_v2.py (note that you may need to change some args) to perform deepSORT tracking. The results will be saved in deepSORT/tracking_results/

Train MEGA for VidOR by yourself

Download MS-COCO and put them as shown in above.
Run python MEGA/tools/extract_coco.py to extract annotations for COCO in VidOR, which results in COCO_train_34classes.pkl and COCO_valmini_34classes.pkl
train MEGA by the following commands:

    python -m torch.distributed.launch \
        --nproc_per_node=4 \
        tools/train_net.py \
        --master_port=$((RANDOM + 10000)) \
        --config-file MEGA/configs/MEGA/vidor_R_101_C4_MEGA_1x_4gpu.yaml \
        OUTPUT_DIR MEGA/training_dir/COCO34ORfreq32_4gpu

More detailed training instructions will be updated soon...

Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

Related tags

Overview

VidVRD-tracklets

Download generated tracklets directly

Evaluate the tracklets mAP

Generate object tracklets by yourself

Quick Start

Train MEGA for VidOR by yourself

Owner

FastAPI Admin Dashboard based on FastAPI and Tortoise ORM.

PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)

Manuskript is an open-source tool for writers.

Sandwich Batch Normalization

Extendable, adaptable rewrite of django.contrib.admin

PyMMO is a Python-based MMO game framework using sockets and PyGame.

A jazzy skin for the Django Admin-Interface (official repository).

Jazzy theme for Django

EOD (Easy and Efficient Object Detection) is a general object detection model production framework.

Modern responsive template for the Django admin interface with improved functionality. We are proud to announce completely new Jet. Please check out Live Demo

"Log in as user" for the Django admin.

:honey_pot: A fake Django admin login screen page.

spider-admin-pro

Allow foreign key attributes in list_display with '__'

A new style for Django admin

WordPress look and feel for Django administration panel

A Django admin theme using Twitter Bootstrap. It doesn't need any kind of modification on your side, just add it to the installed apps.

A user-friendly JSON editing form for Django admin

Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

Passhunt is a simple tool for searching of default credentials for network devices, web applications and more. Search through 523 vendors and their 2084 default passwords.