Official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"

Overview

Light Field Networks

Project Page | Paper | Data | Pretrained Models

Vincent Sitzmann*, Semon Rezchikov*, William Freeman, Joshua Tenenbaum, Frédo Durand
MIT, *denotes equal contribution

This is the official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering".

lfns_video

Get started

You can set up a conda environment with all dependencies like so:

conda env create -f environment.yml
conda activate siren

High-Level structure

The code is organized as follows:

  • multiclass_dataio.py and dataio.py contain the dataio for mutliclass- and single-class experiments respectively.
  • models.py contains the code for light field networks.
  • training.py contains a generic training routine.
  • ./experiment_scripts/ contains scripts to reproduce experiments in the paper.

Reproducing experiments

The directory experiment_scripts contains one script per experiment in the paper.

train_single_class.py trains a model on classes in the Scene Representation Networks format, such as cars or chairs. Note that since these datasets have a resolution of 128, this model starts with a lower resolution (64) and then increases the resolution to 128 (see line 43 in the script).

train_nmr.py trains a model on the NMR dataset. An example call is:

python experiment_scripts/train_nmr.py --data_root=path_to_nmr_dataset
python experiment_scripts/train_single_class.py --data_root=path_to_single_class

To reconstruct test objects, use the scripts "rec_single_class.py" and "rec_nmr.py". In addition to the data root, you have to point these scripts to the checkpoint from the training run. Note that the rec_nmr.py script uses the viewlist under ./experiment_scripts/viewlists/src_dvr.txt to pick which views to reconstruct the objects from, while rec_single_class.py per default reconstructs from the view with index 64.

python experiment_scripts/rec_nmr.py --data_root=path_to_nmr_dataset --checkpoint=path_to_training_checkpoint
python experiment_scripts/rec_single_class.py --data_root=path_to_single_class_TEST_SET --checkpoint=path_to_training_checkpoint

Finally, you may test the models on the test set with the test.py script. This script is used for testing all the models. You have to pass it as a parameter which dataset you are reconstructing ("NMR" or no). For the NMR dataset, you need to pass the "viewlist" again to make sure that the model is not evaluated on the context view.

python experiment_scripts/test.py --data_root=path_to_nmr_dataset --dataset=NMR --checkpoint=path_to_rec_checkpoint
python experiment_scripts/test.py --data_root=path_to_single_class_TEST_SET --dataset=single --checkpoint=path_to_rec_checkpoint

To monitor progress, both the training and reconstruction scripts write tensorboard summaries into a "summaries" subdirectory in the logging_root.

Bells & whistles

This code has a bunch of options that were not discussed in the paper.

  • switch between a ReLU network and a SIREN to better fit high-frequency content with the flag --network (see the init of model.py for options).
  • switch between a hypernetwork, conditioning via concatenation, and low-rank concditioning with the flag --conditioning
  • there is an implementation of encoder-based inference in models.py (LFEncoder) which uses a ResNet18 with global conditioning to generate the latent codes z.

Data

We use two types of datasets: class-specific ones and multi-class ones.

Coordinate and camera parameter conventions

This code uses an "OpenCV" style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in a "camera2world" format, i.e., they denote the matrix transform that transforms camera coordinates to world coordinates.

Misc

Citation

If you find our work useful in your research, please cite:

@inproceedings{sitzmann2021lfns,
               author = {Sitzmann, Vincent
                         and Rezchikov, Semon
                         and Freeman, William T.
                         and Tenenbaum, Joshua B.
                         and Durand, Fredo},
               title = {Light Field Networks: Neural Scene Representations
                        with Single-Evaluation Rendering},
               booktitle = {Proc. NeurIPS},
               year={2021}
            }

Contact

If you have any questions, please email Vincent Sitzmann at [email protected].

Owner
Vincent Sitzmann
Incoming Assistant Professor @mit EECS. I'm researching neural scene representations - the way neural networks learn to represent information on our world.
Vincent Sitzmann
An extremely simple, intuitive, hardware-friendly, and well-performing network structure for LiDAR semantic segmentation on 2D range image. IROS21

FIDNet_SemanticKITTI Motivation Implementing complicated network modules with only one or two points improvement on hardware is tedious. So here we pr

YimingZhao 54 Dec 12, 2022
Can we learn gradients by Hamiltonian Neural Networks?

Can we learn gradients by Hamiltonian Neural Networks? This project was carried out as part of the Optimization for Machine Learning course (CS-439) a

2 Aug 22, 2022
It helps user to learn Pick-up lines and share if he has a better one

Pick-up-Lines-Generator(Open Source) It helps user to learn Pick-up lines Share and Add one or many to the DataBase Unique SQLite DataBase AI Undercon

knock_nott 0 May 04, 2022
Parsing, analyzing, and comparing source code across many languages

Semantic semantic is a Haskell library and command line tool for parsing, analyzing, and comparing source code. In a hurry? Check out our documentatio

GitHub 8.6k Dec 28, 2022
modelvshuman is a Python library to benchmark the gap between human and machine vision

modelvshuman is a Python library to benchmark the gap between human and machine vision. Using this library, both PyTorch and TensorFlow models can be evaluated on 17 out-of-distribution datasets with

Bethge Lab 244 Jan 03, 2023
PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

Zixuan Ke 176 Jan 05, 2023
A robust pointcloud registration pipeline based on correlation.

PHASER: A Robust and Correspondence-Free Global Pointcloud Registration Ubuntu 18.04+ROS Melodic: Overview Pointcloud registration using correspondenc

ETHZ ASL 101 Dec 01, 2022
Details about the wide minima density hypothesis and metrics to compute width of a minima

wide-minima-density-hypothesis Details about the wide minima density hypothesis and metrics to compute width of a minima This repo presents the wide m

Nikhil Iyer 9 Dec 27, 2022
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 28 Nov 25, 2022
Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates

Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates Installation Clone the repository: git clone https://github.com/Zengyi-Qi

Zengyi Qin 3 Oct 18, 2022
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

EdiTTS: Score-based Editing for Controllable Text-to-Speech Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech. Au

Neosapience 98 Dec 25, 2022
MolRep: A Deep Representation Learning Library for Molecular Property Prediction

MolRep: A Deep Representation Learning Library for Molecular Property Prediction Summary MolRep is a Python package for fairly measuring algorithmic p

AI-Health @NSCC-gz 83 Dec 24, 2022
The official implementation for "FQ-ViT: Fully Quantized Vision Transformer without Retraining".

FQ-ViT [arXiv] This repo contains the official implementation of "FQ-ViT: Fully Quantized Vision Transformer without Retraining". Table of Contents In

132 Jan 08, 2023
Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision

MLP Mixer Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision. Give us a star if you like this repo. Author: Github: bangoc123 Emai

Ngoc Nguyen Ba 86 Dec 10, 2022
Cookiecutter PyTorch Lightning

Cookiecutter PyTorch Lightning Instructions # install cookiecutter pip install cookiecutter

Mazen 8 Nov 06, 2022
A python package to perform same transformation to coco-annotation as performed on the image.

coco-transform-util A python package to perform same transformation to coco-annotation as performed on the image. Installation Way 1 $ git clone https

1 Jan 14, 2022
ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Zongdai 107 Dec 20, 2022
NBEATSx: Neural basis expansion analysis with exogenous variables

NBEATSx: Neural basis expansion analysis with exogenous variables We extend the NBEATS model to incorporate exogenous factors. The resulting method, c

Cristian Challu 100 Dec 31, 2022
The end-to-end platform for building voice products at scale

Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog

Picovoice 318 Jan 07, 2023
PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

Don’t be Contradicted with Anything!CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System This repository contains the PyTorch im

Libo Qin 25 Sep 06, 2022