Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

Overview

End-to-End Optimization of Scene Layout

Code release for: End-to-End Optimization of Scene Layout, CVPR 2020 (Oral)

Project site, Bibtex

For help contact afluo [a.t] andrew.cmu.edu or open an issue

  • Requirements

    • PyTorch 1.2 (for everything)
    • Neural 3D Mesh Renderer - daniilidis version (for scene refinement only). For numerical stability, please modify projection.py to remove the multiplication by 0. After the change, lines 33-34 of projection.py should read:
    x__ = x_
    y__ = y_ 
    
    • Blender 2.79 (for 3D rendering of rooms only)
      • Please install numpy in Blender
    • matplotlib
    • numpy
    • skimage (for SPADE based shading)
    • imageio (for SPADE based shading)
    • shapely (eval only)
    • PyWavefront (for scene refinement only, loading of 3d meshes)
    • PyMesh (for scene refinement only, remeshing of SUNCG objects)
    • 1 Nvidia GPU

Download checkpoints here, download metadata here

Project structure
|-3d_SLN
  |-data
    |-suncg_dataset.py
      # Actual definition for the dataset object, makes batches of scene graphs
  |-metadata
    # SUNCG meta data goes here
    |-30_size_info_many.json
      # data about object size/volume, for 30/70 cutoff
    |-data_rot_train.json
      # Normalized object positions & rotations for training
    |-data_rot_val.json
      # For testing
    |-size_info_many.json
      # data about object size/volume, different cutoff
    |-valid_types.json
      # What object types we should use for making the scene graph
      # Use caution when editing this; quite a bit is hard-coded elsewhere
  |-models
    |-diff_render.py
      # Uses the Neural Mesh Renderer (Pytorch Version) to refine object positions
    |-graph.py
      # Graph network building blocks
    |-misc.py
      # Misc helper functions for the diff renderer
    |-Sg2ScVAE_model.py
      # Code to construct the VAE-graph network
    |-SPADE_related.py
      # Tools to construct SPADE VAE GAN (inference only)
  |-options
    # Global options
  |-render
    # Contains various "profiles" for Blender rendering
  |-testing
    # You must call batch_gen in test.py at least once
    # It will call into get_layouts_from_network in test_VAE.py
    # This will compute the posterior mean & std and cache it
    |-test_acc_mean_std.py
      # Contains helper functions to measure acc/l1/std 
    |-test_heatmap.py
      # Contains the functions *produce_heatmap* and *plot_heatmap*
      # The first function takes as input a verbally defined scene graph
        # If not provided, it uses a default scene graph with 5 objects
        # It will load weights for a VAE-graph network
        # Then load the computed posterior mean & std
        # And repeatedly sample from the given scene graph
        # Saves the results to a .pkl file
      # The second function will load a .pkl and plot them as heatmaps
    |-test_plot2d.py
      # Contains a function that uses matplotlib
      # Does NOT require SUNCG
      # Plots the objects using colors provided by ScanNet
    |-test_plot3d.py
      # Calls into the blender code in the ../render folder
      # Requires the SUNCG meshes
      # Requires Blender 2.79
      # Either uses the CPU (Blender renderer)
      # Or uses the GPU (Cycles renderer)
      # Loads a HDR texture (from HDRI Haven) for background
    |-test_SPADE_shade.py
      # Loads semantic maps & depth map, and produces RGB images using SPADE
    |-test_utils.py
      # Contains helper functions for testing
        # Of interest is the *get_sg_from_words* function
    |-test_VAE.py
  |-build_dataset_model.py
     # Constructs dataset & dataloader objects
     # Also constructs the VAE-graph network
  |-test.py
     # Provides functions which perform the following:
       # generate layouts from scene graphs under the *batch_gen* argument
       # measure the L1 loss, accuracy, and STD under the *measure_acc_l1_std* argument
       # draw the top-down heatmaps of layouts for a single scene graph under the *heat_map* argument
       # plot the top-down boxes of layouts under the *draw_2d* argument
       # plot the viewer-centric layouts using SUNCG meshes under the *draw_3d* argument
       # perform SPADE-based shading of semantic+depth maps under the *gan_shade* argument
  |-train.py
     # Contains the training loop for the VAE-graph network
  |-utils.py
     # Contains various helper functions for:
       # managing network losses
       # making scene graphs from bounding boxes
       # loading/writing JSONs
       # other miscellaneous utilities
  • Training the VAE-graph network (limited to 1 GPU):
    python train.py

  • Testing the VAE-graph network:
    First run python test.py --batch_gen at least once. This computes and caches a posterior (mean & std) over the training set for future sampling, and also generates a set of layouts from the test set.
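    Below is a minimal sketch (not the repository's actual code) of what this caching step conceptually does: encode the training scenes, then store the aggregate latent mean and std for later sampling. The names encoder and loader and the (mu, logvar) return signature are assumptions for illustration.

      import pickle
      import torch

      def cache_posterior(encoder, loader, out_path="posterior_stats.pkl"):
          # encoder/loader are hypothetical stand-ins for the VAE-graph
          # encoder and the SUNCG training dataloader
          mus = []
          with torch.no_grad():
              for batch in loader:
                  mu, _logvar = encoder(batch)  # assumed return: (mu, logvar)
                  mus.append(mu.cpu())
          mus = torch.cat(mus, dim=0)
          stats = {"mean": mus.mean(dim=0), "std": mus.std(dim=0)}
          with open(out_path, "wb") as f:
              pickle.dump(stats, f)
          return stats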

  • To generate a heatmap:
    python test.py --heat_map
    You can define your own scene graph (see the produce_heatmap function in testing/test_heatmap.py); if you do not provide one, a default scene graph is used. The function converts scene graphs defined using words into a format usable by the network.
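    For orientation, here is a hedged sketch of what get_sg_from_words conceptually does: map object and relation words to integer ids that the graph network can consume. The vocabularies and tensor layout below are assumptions, not the repository's actual ones (see testing/test_utils.py for the real implementation).

      import torch

      # assumed toy vocabularies; the real labels come from valid_types.json
      OBJ_VOCAB = {"bed": 0, "nightstand": 1, "wardrobe": 2, "lamp": 3}
      REL_VOCAB = {"left of": 0, "right of": 1, "on": 2}

      def sg_from_words(objects, triples):
          # objects: list of object names; triples: (subject idx, predicate word, object idx)
          objs = torch.tensor([OBJ_VOCAB[o] for o in objects], dtype=torch.long)
          edges = torch.tensor(
              [(s, REL_VOCAB[p], o) for (s, p, o) in triples], dtype=torch.long
          )
          return objs, edges

      objs, edges = sg_from_words(
          ["bed", "nightstand", "lamp"],
          [(1, "left of", 0), (2, "on", 1)],
      )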

  • To compute STD/L1/Acc:
    python test.py --measure_acc_l1_std

  • To plot the scene from a top-down view with ScanNet colors (doesn't require SUNCG):
    python test.py --draw_2d
    Please provide an (O+1, 6) tensor of bounding boxes and an (O+1,) tensor of rotations; the last object should be the bounding box of the room.
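    A minimal sketch of inputs with the shapes described above, assuming PyTorch tensors; the exact box parameterization (corners vs. center/size) is an assumption here and should be checked against testing/test_plot2d.py.

      import torch

      O = 4                          # number of objects in the scene
      boxes = torch.rand(O + 1, 6)   # one 6-d bounding box per object, plus the room
      angles = torch.zeros(O + 1)    # one rotation per object, plus the room
      boxes[-1] = torch.tensor([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # last row: the room box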

  • To plot 3D
    python test.py --draw_3d
    This calls into test_plot3d.py, which in turn launches Blender and executes render_caller.py; you can render specific rooms by editing this file. The full rendering function is located in render_room_color.py.

  • To use a neural renderer to refine a room
    python test.py --fine_tune
    Please select the indices of the rooms in test.py. This calls into test_render_refine.py, which uses the differentiable renderer located in diff_render.py. The learning rate and loss types/weightings can be set in test_render_refine.py.
    We set a manual seed for demonstration purposes; in practice, please remove it.
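    The loop below is a schematic of differentiable-renderer-based refinement, not the repository's code: object positions are optimized by gradient descent against image-space losses. render_scene and target are hypothetical placeholders for the NMR-based renderer in diff_render.py and its reference targets.

      import torch

      positions = torch.randn(5, 3, requires_grad=True)   # (x, y, z) per object
      optimizer = torch.optim.Adam([positions], lr=1e-2)   # learning rate is configurable

      def render_scene(pos):
          # placeholder "renderer" so the sketch runs end to end
          return pos.sum()

      target = torch.tensor(0.0)
      for step in range(100):
          optimizer.zero_grad()
          loss = (render_scene(positions) - target) ** 2   # loss types/weights are configurable
          loss.backward()
          optimizer.step()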

  • To use SPADE to generate texture/shading/lighting for a room from semantic + depth
    python test.py --gan_shade
    This first calls into semantic_depth_caller.py to produce the semantic and depth maps, then uses SPADE to generate RGB images.
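    Schematically, the pipeline is: semantic + depth maps in, RGB image out. The generator below is a hypothetical placeholder for the SPADE network in models/SPADE_related.py, shown only to illustrate the input/output shapes.

      import torch
      import torch.nn as nn

      semantic = torch.randint(0, 40, (1, 1, 256, 256)).float()  # per-pixel class ids
      depth = torch.rand(1, 1, 256, 256)                          # per-pixel depth
      conditioning = torch.cat([semantic, depth], dim=1)          # SPADE-style conditioning input

      generator = nn.Sequential(nn.Conv2d(2, 3, 3, padding=1), nn.Tanh())  # placeholder network
      rgb = generator(conditioning)                                # (1, 3, 256, 256) RGB image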

Citation

If you find this repo useful for your research, please consider citing the paper:

@inproceedings{luo2020end,
  title={End-to-End Optimization of Scene Layout},
  author={Luo, Andrew and Zhang, Zhoutong and Wu, Jiajun and Tenenbaum, Joshua B},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3754--3763},
  year={2020}
}