Revisiting spatio-temporal layouts for compositional action recognition

Codebase for Revisiting spatio-temporal layouts for compositional action recognition.

Dependencies

If you use Poetry, running poetry install inside the project should suffice.

Preparing the data

Something-Something and Something-Else

You need to download the data splits and labels, the annotations, and the video sizes. Make sure that the annotations for the split you want to create datasets for are in a single directory. Then, use create_something_datasets.py to create the training and test datasets as:

python src/create_something_datasets.py --train_data_path "data/path-to-the-train-file.json" \
                                        --val_data_path "data/path-to-the-val-file.json" \
                                        --annotations_path "data/all-annotations-for-the-split/"
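
If the annotations for a split were downloaded into several directories, a small helper can gather them into the single directory that --annotations_path expects. The following is only a sketch: the function name gather_annotations is hypothetical and not part of this codebase, and it assumes the annotations are stored as .json files, which you should adjust if your download differs.

```python
import shutil
from pathlib import Path


def gather_annotations(source_dirs, target_dir, pattern="*.json"):
    """Copy annotation files from several directories into one directory.

    Hypothetical helper (not part of the repository). Assumes annotation
    files match `pattern`. Returns the sorted names of the matching files
    that are present in `target_dir` afterwards.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for source in source_dirs:
        for file_path in sorted(Path(source).glob(pattern)):
            destination = target / file_path.name
            if destination.exists():
                # Refuse to silently overwrite a file with the same name.
                raise FileExistsError(f"{destination} already exists")
            shutil.copy2(file_path, destination)
    return sorted(p.name for p in target.glob(pattern))
```

The returned list makes it easy to confirm that every expected annotation file ended up in the directory before running create_something_datasets.py.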

Action-Genome

You need to download the Action Genome data and the Charades data. Then, use create_action_genome_datasets.py to create the training and test datasets as:

python src/create_action_genome_datasets.py --action_genome_path "data/path-to-action-genome" \
                                            --charades_path "data/path-to-charades" \
                                            --save_datasets_path "data/directory-where-the-data-will-be-saved"

Model Zoo

Trained models are currently available for the Something-Else and Action Genome datasets. We are still in the process of releasing the models (including those for Something-Something V2), so if a model you need is not yet available, feel free to reach out.

Model	Dataset	Download
STLT	Something-Else Compositional Split Detections	Link
LCF	Something-Else Compositional Split Detections	Link
CAF	Something-Else Compositional Split Detections	Link
CACNF	Something-Else Compositional Split Detections	Link
STLT	Action Genome Oracle	Link
STLT	Action Genome Detections	Link

Training and Inference

The codebase currently supports training and inference of the STLT, LCF, CAF, and CACNF models. Refer to the train.py and inference.py scripts. Additionally, you need to download the Resnet3D, pretrained on Kinetics and similar datasets, from here, and place it in models/. To run inference with a trained model, e.g., STLT on the Something-Else Compositional split, you can do the following:

poetry run python src/inference.py --checkpoint_path "models/comp_split_detect_stlt.pt" \
                                   --test_dataset_path "data/something-something/comp_split_detect/val_dataset.json" \
                                   --labels_path "data/something-something/comp_split_detect/something-something-v2-labels.json" \
                                   --videoid2size_path "data/something-something/videoid2size.json" \
                                   --dataset_type "layout" \
                                   --model_name "stlt" \
                                   --dataset_name "something"

To run inference with a pre-trained CACNF model, you can do the following:

poetry run python src/inference.py --checkpoint_path "models/something-something/comp_split_detect_cacnf.pt" \
                                   --test_dataset_path "data/something-something/comp_split_detect/val_dataset.json" \
                                   --labels_path "data/something-something/comp_split_detect/something-something-v2-labels.json" \
                                   --videoid2size_path "data/something-something/videoid2size.json" \
                                   --batch_size 4 \
                                   --dataset_type "multimodal" \
                                   --model_name "cacnf" \
                                   --dataset_name "something" \
                                   --videos_path "data/something-something/dataset.hdf5" \
                                   --resnet_model_path "models/something-something/r3d50_KMS_200ep.pth"

For both examples, make sure to provide your local paths to the dataset files and the pre-trained checkpoints.
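
Because the commands above depend on several local files, it can help to check the paths before launching inference. The sketch below is not part of the codebase: build_inference_command and missing_paths are hypothetical helpers, and the only flags used are the ones shown in the STLT example. It treats every flag whose name ends in _path as a file that must exist, which is an assumption of this sketch.

```python
from pathlib import Path


def build_inference_command(arguments, script="src/inference.py"):
    """Turn a mapping of flag names to values into an argument list.

    Hypothetical helper: it only formats flags, it does not validate them.
    """
    command = ["poetry", "run", "python", script]
    for name, value in arguments.items():
        command += [f"--{name}", str(value)]
    return command


def missing_paths(arguments):
    """Return the flags ending in '_path' whose value does not exist on disk."""
    return [
        name
        for name, value in arguments.items()
        if name.endswith("_path") and not Path(value).exists()
    ]


# Placeholder paths taken from the STLT example; replace with your local paths.
arguments = {
    "checkpoint_path": "models/comp_split_detect_stlt.pt",
    "test_dataset_path": "data/something-something/comp_split_detect/val_dataset.json",
    "labels_path": "data/something-something/comp_split_detect/something-something-v2-labels.json",
    "videoid2size_path": "data/something-something/videoid2size.json",
    "dataset_type": "layout",
    "model_name": "stlt",
    "dataset_name": "something",
}

missing = missing_paths(arguments)
if missing:
    print("Missing local files for:", ", ".join(missing))
else:
    print(" ".join(build_inference_command(arguments)))
```

Once no paths are reported missing, the returned list can be passed to subprocess.run(command, check=True) from the project root, which avoids shell quoting and line-continuation issues.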

Citation

If you find our code useful for your own research, please use the following BibTeX entry.

@article{radevski2021revisiting,
  title={Revisiting spatio-temporal layouts for compositional action recognition},
  author={Radevski, Gorjan and Moens, Marie-Francine and Tuytelaars, Tinne},
  journal={arXiv preprint arXiv:2111.01936},
  year={2021}
}
