[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

Last update: Jan 04, 2023

Related tags

Deep Learning energy-based-model

Overview

Learning to Compose Visual Relations

This is the pytorch codebase for the NeurIPS 2021 Spotlight paper Learning to Compose Visual Relations.

Demo

Image Generation Demo

Please use the following command to generate images on the CLEVR dataset. Please use --num_rels to control the input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode generation

GIF	Final Generated Image

Image Editing Demo

Please use the following command to edit images on the CLEVR dataset. Please use --num_rels to control the input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing

Input Image	GIF	Final Edited Image

Training

Data Preparation

Please utilize the following data link to download the CLEVR data utilized in our experiments. Then place all data files under ./data folder. Downloads for additional datasets and precomputed feature files will be posted soon. Feel free to raise an issue if there is a particular dataset you would like to download.

Model Training

To train your own model, please run following command. Please use --dataset to train your model on different datasets, e.g. --dataset clevr.

python -u train.py --cond --dataset=${dataset} --exp=${dataset} --batch_size=10 --step_lr=300 \
--num_steps=60 --kl --gpus=1 --nodes=1 --filter_dim=128 --im_size=128 --self_attn \
--multiscale --norm --spec_norm --slurm --lr=1e-4 --cuda --replay_batch \
--numpy_data_path ./data/clevr_training_data.npz

Evaluation

To evaluate our model, you can use your own trained models or download the pre-trained models model_best.pth from ${dataset}_model folder from link and put it under the project folder ./checkpoints/${dataset}. Only clevr_model is currently available. More pretrained-models will be posted soon.

Evaluate Image Generation Results Using the Pretrained Classifiers

Please use the following command to generate images on the test set first. Please use --dataset and --num_rels to control the dataset and the number of input relational descriptions. Note that 1 <= num_rels <= 3.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_gen_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels ${num_rels} --data_folder ./data --mode generation

In order to evaluate the binary classification scores of the generated images, you can train one binary classifier or download a pretrained one from link under the binary_classifier folder.

To train your own binary classifier, please use following command:

python train_classifier.py --train --spec_norm --norm \
--dataset ${dataset} --lr 3e-4 --checkpoint_dir ./binary_classifier

Please use following command to evaluate on generated images conditioned on selected number of relations. Please use --num_rels to specify the number of relations.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_gen_images/num_rels_${num_rels} \
--mode generation --num_rels ${num_rels}

Evaluate Image Editing Results Using the Pretrained Classifiers

Please use the following command to edit images on the test set first. Please use --dataset and --num_rels to select the dataset and the number of input relational descriptions.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_edit_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing

To evaluate classification scores of image editing results, please change the --mode to editing.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_edit_images/num_rels_${num_rels} \
--mode editing --num_rels ${num_rels}

Acknowledgements

The code for training EBMs is from https://github.com/yilundu/improved_contrastive_divergence.

Citation

Please consider citing our papers if you use this code in your research:

@article{liu2021learning,
  title={Learning to Compose Visual Relations},
  author={Liu, Nan and Li, Shuang and Du, Yilun and Tenenbaum, Josh and Torralba, Antonio},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

Related tags

Overview

Learning to Compose Visual Relations

Demo

Image Generation Demo

Image Editing Demo

Training

Data Preparation

Model Training

Evaluation

Evaluate Image Generation Results Using the Pretrained Classifiers

Evaluate Image Editing Results Using the Pretrained Classifiers

Acknowledgements

Citation

Owner

Nan Liu

Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.

A general python framework for visual object tracking and video object segmentation, based on PyTorch

Network Compression via Central Filter

Source code for the paper "SEPP: Similarity Estimation of Predicted Probabilities for Defending and Detecting Adversarial Text" PACLIC 2021

[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

PyTorch implementation of Higher Order Recurrent Space-Time Transformer

Pytorch Implementation of the paper "Cross-domain Correspondence Learning for Exemplar-based Image Translation"

Compact Bilinear Pooling for PyTorch

Puzzle-CAM: Improved localization via matching partial and full features.

QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place — so you can focus on building the next big thing.

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

All-in-one Docker container that allows a user to explore Nautobot in a lab environment.

Python lib to talk to pylontech lithium batteries (US2000, US3000, ...) using RS485

Hack Camera, Microphone, Location, Clipboard With Just a Link. Also, Get Many Details About Victim's Device. And So On...

thundernet ncnn

A GOOD REPRESENTATION DETECTS NOISY LABELS

Auto White-Balance Correction for Mixed-Illuminant Scenes

School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

Have you ever wondered how cool it would be to have your own A.I

Dual Attention Network for Scene Segmentation (CVPR2019)