Multi-modal Content Creation Model Training Infrastructure including the FACT model (AI Choreographer) implementation.

Last update: Dec 30, 2022

Related tags

Deep Learning mint

Overview

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ [ICCV-2021].

Overview

This package contains the model implementation and training infrastructure of our AI Choreographer.

Get started

Pull the code

git clone https://github.com/liruilong940607/mint --recursive

Note here --recursive is important as it will automatically clone the submodule (orbit) as well.

Install dependencies

conda create -n mint python=3.7
conda activate mint
conda install protobuf numpy
pip install tensorflow absl-py tensorflow-datasets librosa

sudo apt-get install libopenexr-dev
pip install --upgrade OpenEXR
pip install tensorflow-graphics tensorflow-graphics-gpu

git clone https://github.com/arogozhnikov/einops /tmp/einops
cd /tmp/einops/ && pip install . -U

git clone https://github.com/google/aistplusplus_api /tmp/aistplusplus_api
cd /tmp/aistplusplus_api && pip install -r requirements.txt && pip install . -U

Note if you meet environment conflicts about numpy, you can try with pip install numpy==1.20.

Get the data

See the website

Get the checkpoint

Download from google drive here, and put them to the folder ./checkpoints/

Run the code

complie protocols

protoc ./mint/protos/*.proto

preprocess dataset into tfrecord

python tools/preprocessing.py \
    --anno_dir="/mnt/data/aist_plusplus_final/" \
    --audio_dir="/mnt/data/AIST/music/" \
    --split=train
python tools/preprocessing.py \
    --anno_dir="/mnt/data/aist_plusplus_final/" \
    --audio_dir="/mnt/data/AIST/music/" \
    --split=testval

run training

python trainer.py --config_path ./configs/fact_v5_deeper_t10_cm12.config --model_dir ./checkpoints

Note you might want to change the batch_size in the config file if you meet OUT-OF-MEMORY issue.

run testing and evaluation

# caching the generated motions (seed included) to `./outputs`
python evaluator.py --config_path ./configs/fact_v5_deeper_t10_cm12.config --model_dir ./checkpoints
# calculate FIDs
python tools/calculate_scores.py

Citation

@inproceedings{li2021dance,
  title={AI Choreographer: Music Conditioned 3D Dance Generation with AIST++},
  author={Ruilong Li and Shan Yang and David A. Ross and Angjoo Kanazawa},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year = {2021}
}

Multi-modal Content Creation Model Training Infrastructure including the FACT model (AI Choreographer) implementation.

Related tags

Overview

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ [ICCV-2021].

Overview

Get started

Pull the code

Install dependencies

Get the data

Get the checkpoint

Run the code

Citation

Owner

Google Research

(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."

Investigating Attention Mechanism in 3D Point Cloud Object Detection (arXiv 2021)

A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution.

[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games

Keep CALM and Improve Visual Feature Attribution

1st-in-MICCAI2020-CPM - Combined Radiology and Pathology Classification

Codes accompanying the paper "Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning" (NeurIPS 2021 Spotlight

Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

The final project of "Applying AI to 2D Medical Imaging Data" of "AI for Healthcare" nanodegree - Udacity.

Code for "On Memorization in Probabilistic Deep Generative Models"

Simulation-based inference for the Galactic Center Excess

Source code of our work: "Benchmarking Deep Models for Salient Object Detection"

[NeurIPS 2020] Official repository for the project "Listening to Sound of Silence for Speech Denoising"

Reporting and Visualization for Hazardous Events

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

OSLO: Open Source framework for Large-scale transformer Optimization

ML for NLP and Computer Vision.

Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Code for the paper Learning the Predictability of the Future