🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

Last update: Dec 29, 2022

Related tags

Deep Learning AIC2021-T5-CLV

Overview

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

We have two codebases. For the final submission, we conduct the feature ensemble, where features are from two codebases.

Part One is at here: https://github.com/ShuaiBai623/AIC2021-T5-CLV

Part Two is at here: https://github.com/layumi/NLP-AICity2021

Prepare

Preprocess the dataset to prepare frames, motion maps, NLP augmentation

scripts/extract_vdo_frms.py is a Python script that is used to extract frames.

scripts/get_motion_maps.py is a Python script that is used to get motion maps.

scripts/deal_nlpaug.py is a Python script that is used for NLP augmentation.

Download the pretrained models of Part One to checkpoints. The checkpoints can be found here. The best score of a single model on TestA is 0.1927 from motion_effb3_NOCLS_nlpaug_320.pth.

The directory structures in data and checkpoints are as follows：

.
├── checkpoints
│   ├── motion_effb2_1CLS_nlpaug_288.pth
│   ├── motion_effb3_NOCLS_nlpaug_320.pth
│   ├── motion_SE_3CLS_nonlpaug_288.pth
│   ├── motion_SE_NOCLS_nlpaug_288.pth
│   └── motion_SE_NOCLS_nonlpaug_288.pth
└── data
    ├── AIC21_Track5_NL_Retrieval
    │   ├── train
    │   └── validation
    ├── motion_map 
    ├── test-queries.json
    ├── test-queries_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── test-tracks.json
    ├── train.json
    ├── train_nlpaug.json
    ├── train-tracks.json
    ├── train-tracks_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── val.json
    └── val_nlpaug.json             ## NLP augmentation (Refer to scripts/deal_nlpaug.py)

Part One

Modify the data paths in config.py

Train

The configuration files are in configs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py --name your_experiment_name --config your_config_file |tee log

Test

Change the RESTORE_FROM in your configuration file.

python -u test.py --config your_config_file

Extract the visual and text embeddings. The extracted embeddings can be found here.

python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nlpaug_288.yaml
python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_3CLS_nonlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nonlpaug_288.yaml

Part Two

Link

Submission

During the inference, we average all the frame features of the target in each track as track features, the embeddings of text descriptions are also averaged as the query features. The cosine distance is used for ranking as the final result.

Reproduce the best submission. ALL extracted embeddings are in the folder output:

python scripts/get_submit.py

🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

Related tags

Overview

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

Prepare

Part One

Train

Test

Part Two

Submission

Friend Links：

Owner

Joint Unsupervised Learning (JULE) of Deep Representations and Image Clusters.

Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper.

SwinTrack: A Simple and Strong Baseline for Transformer Tracking

Task-based end-to-end model learning in stochastic optimization

PyTorch Implementation of Backbone of PicoDet

Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Learning to Reach Goals via Iterated Supervised Learning

Machine learning algorithms for many-body quantum systems

The code of NeurIPS 2021 paper "Scalable Rule-Based Representation Learning for Interpretable Classification".

This repo is official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices(CVPRW 2021).

Group project for MFIN7036. Our goal is to predict firm profitability with text-based competition measures.

LEAP: Learning Articulated Occupancy of People

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction"

Blender Add-on that sets a Material's Base Color to one of Pantone's Colors of the Year

It helps user to learn Pick-up lines and share if he has a better one

Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets"

HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite