Code for "Unsupervised Layered Image Decomposition into Object Prototypes" paper

Last update: Dec 22, 2022

Overview

DTI-Sprites

PyTorch implementation of the paper "Unsupervised Layered Image Decomposition into Object Prototypes".

Check out our paper and webpage for details!

If you find this code useful in your research, please cite:

@article{monnier2021dtisprites,
  title={{Unsupervised Layered Image Decomposition into Object Prototypes}},
  author={Monnier, Tom and Vincent, Elliot and Ponce, Jean and Aubry, Mathieu},
  journal={arXiv},
  year={2021},
}

Installation 👷

1. Create conda environment

conda env create -f environment.yml
conda activate dti-sprites

Optional: some monitoring routines are implemented. To use them, specify the visdom port in the config file. You will need to install visdom from source beforehand:

git clone https://github.com/facebookresearch/visdom
cd visdom && pip install -e .

2. Download non-torchvision datasets

./download_data.sh

This command will download the following datasets:

Tetrominoes, Multi-dSprites and CLEVR6 (link to the original multi-object datasets repo with raw tfrecords)
GTSRB (link to the original dataset page)
Weizmann Horse database (link to the original dataset page)
Instagram collections associated with #santaphoto and #weddingkiss (link to the original repo with dataset links and descriptions)

NB: gdown may occasionally hang. If so, download the datasets by hand with the following gdrive links, then unzip them and move them to the datasets folder:

How to use 🚀

1. Launch a training

cuda=gpu_id config=filename.yml tag=run_tag ./pipeline.sh

where:

gpu_id is a target cuda device id,
filename.yml is a YAML config located in configs folder,
run_tag is a tag for the experiment.

Results are saved at runs/${DATASET}/${DATE}_${run_tag}, where DATASET is the dataset name specified in filename.yml and DATE is the current date in mmdd format. Visual results from training, such as sprites evolution and reconstruction examples, are saved there as well. Here is an example from the Tetrominoes dataset:

Reconstruction examples

Sprites evolution and final

More visual results are available at https://imagine.enpc.fr/~monniert/DTI-Sprites/extra_results/.
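
As a rough sketch of the naming scheme described in step 1, the run directory could be reconstructed in the shell as follows. The function name `run_dir` and the dataset name `tetrominoes` are hypothetical and used for illustration only: the actual dataset name is whatever is set in the chosen config file. The optional third argument simply lets you pass a date other than today.

```shell
# Hypothetical helper (not part of the repo): prints the directory where a run
# is expected to be saved, following runs/${DATASET}/${DATE}_${run_tag}.
run_dir() {
  local dataset="$1"                      # dataset name as specified in the YAML config
  local run_tag="$2"                      # the tag passed to pipeline.sh
  local date_mmdd="${3:-$(date +%m%d)}"   # date in mmdd format, defaults to today
  printf 'runs/%s/%s_%s\n' "$dataset" "$date_mmdd" "$run_tag"
}

# Example: a run tagged "default" launched on April 9th
run_dir tetrominoes default 0409   # prints runs/tetrominoes/0409_default
```

Calling `run_dir` with only two arguments uses the current date, which matches a run launched today.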

2. Reproduce our quantitative results

To launch 5 runs on the Tetrominoes benchmark and reproduce our results:

cuda=gpu_id config=tetro.yml tag=default ./multi_pipeline.sh

Available configs are:

Multi-object benchmarks: tetro.yml, dsprites_gray.yml, clevr6.yml
Clustering benchmarks: gtsrb8.yml, svhn.yml
Cosegmentation dataset: horse.yml
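
To queue several benchmarks one after the other, a small wrapper around the command above may help. This is only a sketch: `print_benchmark_cmds` is a hypothetical helper that is not part of the repo, and it assumes `multi_pipeline.sh` accepts the same `cuda`, `config` and `tag` variables for every config listed here. It prints the commands instead of running them, so they can be checked first.

```shell
# Hypothetical helper (not part of the repo): prints one multi_pipeline.sh
# command per config so the list can be reviewed before running anything.
print_benchmark_cmds() {
  local gpu_id="$1"; shift        # first argument: target cuda device id
  local cfg
  for cfg in "$@"; do             # remaining arguments: config file names
    printf 'cuda=%s config=%s tag=default ./multi_pipeline.sh\n' "$gpu_id" "$cfg"
  done
}

# Dry run for the two clustering benchmarks on GPU 0
print_benchmark_cmds 0 gtsrb8.yml svhn.yml
```

Piping the output to `sh` from the repository root would then launch the runs sequentially.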

3. Reproduce our qualitative results on Instagram collections

(Skip if already downloaded with the script above.) Create a santaphoto dataset by running the process_insta_santa.sh script. Scraping the 10k posts from Instagram can take a while.
Launch a training with cuda=gpu_id config=instagram.yml tag=santaphoto ./pipeline.sh

That's it!

Top 8 sprites discovered

Decomposition examples

Further information

If you like this project, please check out related works on deep transformations from our group:
