PyTorch implementation of MulMON

Overview

MulMON

This repository contains a PyTorch implementation of the paper:
Learning Object-Centric Representations of Multi-object Scenes from Multiple Views

Li Nanbo, Cian Eastwood, Robert B. Fisher
NeurIPS 2020 (Spotlight)

Working examples

Check our video presentation for more: https://youtu.be/Og2ic2L77Pw.

Requirements

Hardware:

  • GPU. Currently, at least one GPU device is required to run this code, however, we will consider adding CPU demo code in the future.
  • Disk space: we do NOT have any hard requirement for the disk space, this is totally data-dependent. To use all the datasets we provide, you will need ~9GB disk space. However, it is not necessary to use all of our datasets (or even our datasets), see Data section for more details.

Python Environement:

  1. We use Anaconda to manage our python environment. Check conda installation guide here: https://docs.anaconda.com/anaconda/install/linux/.

  2. Open a new terminal, direct to the MulMON directory:

cd <YOUR-PATH-TO-MulMON>/MulMON/

create a new conda environment called "mulmon" and then activate it:

conda env create -f ./conda-env-spec.yml  
conda activate mulmon
  1. Install a gpu-supported PyTorch (tested with PyTorch 1.1, 1.2 and 1.7). It is very likely that there exists a PyTorch installer that is compatible with both your CUDA and this code. Go find it on PyTorch official site, and install it with one line of command.

  2. Install additional packages:

pip install tensorboard  
pip install scikit-image

If pytorch <=1.2 is used, you will also need to execute: pip install tensorboardX and import it in the ./trainer/base_trainer.py file. This can be done by commenting the 4th line AND uncommenting the 5th line of that file.

Data

  • Data structure (important):
    We use a data structure as follows:

    <YOUR-PATH>                                          
        ├── ...
        └── mulmon_datasets
              ├── clevr                                   # place your own CLEVR-MV under this directory if you go the fun way
              │    ├── ...
              │    ├── clevr_mv            
              │    │    └── ... (omit)                    # see clevr_<xxx> for subdirectory details
              │    ├── clevr_aug           
              │    │    └── ... (omit)                    # see clevr_<xxx> for subdirectory details
              │    └── clevr_<xxx>
              │         ├── ...
              │         ├── data                          # contains a list of scene files
              │         │    ├── CLEVR_new_#.npy          # one .npy --> one scene sample
              │         │    ├── CLEVR_new_#.npy       
              │         │    └── ...
              │         ├── clevr_<xxx>_train.json        # meta information of the training scenes
              │         └── clevr_<xxx>_test.json         # meta information of the testing scenes  
              └── GQN  
                   ├── ...
                   └── gqn-jaco                 
                        ├── gqn_jaco_train.h5
                        └── gqn_jaco_test.h5
    

    We recommend one to get the necessary data folders ready before downloading/generating the data files:

    mkdir <YOUR-PATH>/mulmon_datasets  
    mkdir <YOUR-PATH>/mulmon_datasets/clevr  
    mkdir <YOUR-PATH>/mulmon_datasets/GQN
    
  • Get Datasets

    • Easy way:
      Download our datasets:

      • clevr_mv.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/clevr/ directory (~1.8GB when extracted).
      • clevr_aug.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/clevr/ directory (~3.8GB when extracted).
      • gqn_jaco.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/GQN/ directory (~3.2GB when extracted).

      and extract them in places. For example, the command for extracting clevr_mv.tar.gz:

      tar -zxvf <YOUR-PATH>/mulmon_datasets/clevr/clevr_mv.tar.gz -C <YOUR-PATH>/mulmon_datasets/clevr/
      

      Note that: 1) we used only a subset of the DeepMind GQN-Jaco dataset, more available at deepmind/gqn-datasets, and 2) the published clevr_aug dataset differs slightly from the CLE-Aug used in the paper---we added more shapes (such as dolphins) into the dataset to make the dataset more interesting (also more complex).

    • Fun way :
      Customise your own multi-view CLEVR data. (available soon...)

Pre-trained models

Download the pretrained models (← click) and place it under `MulMON/', i.e. the root directory of this repository, then extract it by executing: tar -zxvf ./logs.tar.gz. Note that some of them are slightly under-trained, so one could train them further to achieve better results (How to train?).

Usage

Configure data path
To run the code, the data path, i.e. the <YOUR-PATH> in a script, needs to be correctly configured. For example, we store the MulMON dataset folder mulmon_datasets in ../myDatasets/, to train a MulMON on GQN-Jaco dataset using a single GPU, the 4th line of the ./scripts/train_jaco.sh script should look like: data_path=../myDatasets/mulmon_datasets/GQN.

  • Demo (Environment Test)
    Before running the below code, make sure the pretrained models are downloaded and saved first:

    . scripts/demo.sh  
    

    Check ./logs folder for the generated demos.

    • Notes for disentanglement demos: we randomly pick one object for each scene to create the disentanglement demo, so for scene samples where an empty object slot is picked, you won't see any object manipulation effect in the corresponding gifs (especially for the GQN-Jaco scenes). To create a demo like the shown one, one needs to specify (hard-coding) an object slot of interest and traverse informative latent dimensions (as some dimensions are redundant---capture no object property).
  • Train

    • On a single gpu (e.g. using the GQN-Jaco dataset):
    . scripts/train_jaco.sh  
    
    • On multiple GPUs (e.g. using the GQN-Jaco dataset):
    . scripts/train_jaco_parallel.sh  
    
    • To resume training from a stopped session, i.e. saved weights checkpoint-epoch<#number>.pth, simply append a flag --resume_epoch <#number> to one of the flags in the script files.
      For example, to resume previous training (saved as checkpoint-epoch2000.pth) on GQN-Jaco data, we just need to reconfigure the 10th line of the ./scripts/train_jaco.sh as:
      --input_dir ${data_path} --output_dir ${log_path} --resume_epoch 2000 \.
  • Evaluation

    • On a single gpu (e.g. using the Clevr_MV dataset):
    . scripts/eval_clevr.sh  
    
    • Here is a list of imporant evaluation settings which one might wants to play with
      --resume_epoch specify a model to evaluate --test_batch how many batches of test data one uses for evaluation.
      --vis_batch how many batches of output one visualises (save) while evaluation. (note: <= --test_batch)
      --analyse_batch how many batches of latent codes one saves for a post analysis, e.g. disentanglement. (note: <= --test_batch)
      --eval_all (boolean) set True for all [--eval_recon, --eval_seg, --eval_qry_obs, --eval_qry_seg] items, one could also use each of the four independently.
      --eval_dist (boolean) save latent codes for disentanglement analysis. (note: not controlled by --eval_all)
    • For the disentanglement evaluation, run the scripts/eval_clevr.sh script with --eval_dist flag set to True and set the --analyse_batch variable (which controls how many scenes of latent codes one wants to analyse) to be greater than 0. This saves the ouptut latent codes and ground-truth information that allows you to conduct disentanglement quantification using the QEDR framework.
    • You might observe that the evaluation results on the CLE-Aug dataset differ form those on the original paper, this is because the CLE-Aug here is slightly different the one we used for the paper (see more details).

Contact

We constantly respond to the raised ''issues'' in terms of running the code. For further inquiries and discussions (e.g. questions about the paper), email: [email protected].

Cite

Please cite our paper if you find this code useful.

@inproceedings{nanbo2020mulmon,
  title={Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views},
  author={Nanbo, Li and Eastwood, Cian and Fisher, Robert B},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
Owner
NanboLi
PhD Student, University of Edinburgh
NanboLi
RID-Noise: Towards Robust Inverse Design under Noisy Environments

This is code of RID-Noise. Reproduce RID-Noise Results Toy tasks Please refer to the notebook ridnoise.ipynb to view experiments on three toy tasks. B

Thyrix 2 Nov 23, 2022
ECAENet (TensorFlow and Keras)

ECAENet: EfficientNet with Efficient Channel Attention for Plant Species Recognition (SCI:Q3) (Journal of Intelligent & Fuzzy Systems)

4 Dec 22, 2022
Run object detection model on the Raspberry Pi

Using TensorFlow Lite with Python is great for embedded devices based on Linux, such as Raspberry Pi.

Dimitri Yanovsky 6 Oct 08, 2022
Exemplo de implementação do padrão circuit breaker em python

fast-circuit-breaker Circuit breakers existem para permitir que uma parte do seu sistema falhe sem destruir todo seu ecossistema de serviços. Michael

James G Silva 17 Nov 10, 2022
Pytorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification"

[AAAI22] Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification We point out the overlooked unbiasedness in long-tailed clas

PatatiPatata 28 Oct 18, 2022
a grammar based feedback fuzzer

Nautilus NOTE: THIS IS AN OUTDATE REPOSITORY, THE CURRENT RELEASE IS AVAILABLE HERE. THIS REPO ONLY SERVES AS A REFERENCE FOR THE PAPER Nautilus is a

Chair for Sys­tems Se­cu­ri­ty 158 Dec 28, 2022
Official Repo of my work for SREC Nandyal Machine Learning Bootcamp

About the Bootcamp A 3-day Machine Learning Bootcamp organised by Department of Electronics and Communication Engineering, Santhiram Engineering Colle

MS 1 Nov 29, 2021
Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution

unfoldedVBA Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution This repository contains the Pytorch implementation of the unrolled

Yunshi HUANG 2 Jul 10, 2022
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu

ZJUNLP 68 Dec 28, 2022
PyTorch - Python + Nim

Master Release Pytorch - Py + Nim A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen. Because Nim compiles to C+

Giovanni Petrantoni 425 Dec 22, 2022
Simple tutorials using Google's TensorFlow Framework

TensorFlow-Tutorials Introduction to deep learning based on Google's TensorFlow framework. These tutorials are direct ports of Newmu's Theano Tutorial

Nathan Lintz 6k Jan 06, 2023
ML From Scratch

ML from Scratch MACHINE LEARNING TOPICS COVERED - FROM SCRATCH Linear Regression Logistic Regression K Means Clustering K Nearest Neighbours Decision

Tanishq Gautam 66 Nov 02, 2022
A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks A Transformer-based library for SocialNLP classification tasks. Currently

298 Jan 07, 2023
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation Input Image Initial CAM Successive Maps with adversar

Jungbeom Lee 110 Dec 07, 2022
Mask-invariant Face Recognition through Template-level Knowledge Distillation

Mask-invariant Face Recognition through Template-level Knowledge Distillation This is the official repository of "Mask-invariant Face Recognition thro

Fadi Boutros 35 Dec 06, 2022
D2Go is a toolkit for efficient deep learning

D2Go D2Go is a production ready software system from FacebookResearch, which supports end-to-end model training and deployment for mobile platforms. W

Facebook Research 744 Jan 04, 2023
TensorFlow implementation of Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently.

Adversarial Chess TensorFlow implementation of Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently. Requirements To run

Muthu Chidambaram 30 Sep 07, 2021
Code for NeurIPS2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints"

This repository is the code for NeurIPS 2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints". Edit 2021/

10 Dec 20, 2022
Leaderboard and Visualization for RLCard

RLCard Showdown This is the GUI support for the RLCard project and DouZero project. RLCard-Showdown provides evaluation and visualization tools to hel

Data Analytics Lab at Texas A&M University 246 Dec 26, 2022
Face and Body Tracking for VRM 3D models on the web.

Kalidoface 3D - Face and Full-Body tracking for Vtubing on the web! A sequal to Kalidoface which supports Live2D avatars, Kalidoface 3D is a web app t

Rich 257 Jan 02, 2023