GRF: Learning a General Radiance Field for 3D Representation and Rendering


[Paper] [Video]

GRF: Learning a General Radiance Field for 3D Representation and Rendering
Alex Trevithick¹,² and Bo Yang²,³
¹Williams College, ²University of Oxford, ³The Hong Kong Polytechnic University. In ICCV 2021.

This codebase is currently a work in progress.

Overview of GRF

GRF is a powerful implicit neural function that can represent and render arbitrarily complex 3D scenes in a single network, using only 2D observations. GRF takes a set of posed 2D images as input, constructs an internal representation for each 3D point of the scene, and renders the corresponding appearance and geometry of any 3D point viewed from an arbitrary angle. The key to our approach is to explicitly integrate the principle of multi-view geometry to obtain features representative of an entire ray from a given viewpoint. Thus, in a single forward pass to render a scene from a novel view, GRF takes some views of that scene as input, computes per-pixel pose-aware features for the ray cast from each given viewpoint through the image plane at that pixel, and then uses those features to predict the volumetric density and RGB values of points in 3D space. Volumetric rendering is then applied to produce the final image.
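For intuition, below is a minimal NumPy sketch of the standard volumetric rendering step mentioned above, in which per-point densities and colours predicted by the network are alpha-composited along each ray. This is not the repository's code; the function and variable names are illustrative only.

import numpy as np

def composite_ray(sigmas, rgbs, z_vals):
    """Alpha-composite per-sample density/colour predictions along one ray.

    sigmas: (N,)   predicted volume density at each sample
    rgbs:   (N, 3) predicted RGB at each sample
    z_vals: (N,)   depth of each sample along the ray
    """
    # Spacing between consecutive samples (last segment padded with a large value).
    deltas = np.diff(z_vals, append=z_vals[-1] + 1e10)
    # Opacity contributed by each segment.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1] + 1e-10]))
    weights = alphas * trans
    # Expected pixel colour under the predicted density field.
    rgb = (weights[:, None] * rgbs).sum(axis=0)
    return rgb, weights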

Setting Up the Environment

Use conda to set up an environment as follows:

conda env create -f environment.yml
conda activate grf

Data

  • SRN cars and chairs datasets can be downloaded from the paper's drive link
  • NeRF-Synthetic and LLFF datasets can be downloaded from the NeRF drive link
  • MultiShapenet dataset can be downloaded from the DISN drive link

Training and Rendering from the Model

To train and render from the model, use the run.py script:

python run.py --data_root [path to directory with dataset] \
    --expname [experiment name] \
    --basedir [where to store ckpts and logs] \
    --datadir [input data directory] \
    --netdepth [layers in network] \
    --netwidth [channels per layer] \
    --netdepth_fine [layers in fine network] \
    --netwidth_fine [channels per layer in fine network] \
    --N_rand [batch size (number of random rays per gradient step)] \
    --lrate [learning rate] \
    --lrate_decay [exponential learning rate decay (in 1000s)] \
    --chunk [number of rays processed in parallel, decrease if running out of memory] \
    --netchunk [number of pts sent through network in parallel, decrease if running out of memory] \
    --no_batching [only take random rays from 1 image at a time] \
    --no_reload [do not reload weights from saved ckpt] \
    --ft_path [specific weights npy file to reload for coarse network] \
    --random_seed [fix random seed for repeatability] \
    --precrop_iters [number of steps to train on central crops] \
    --precrop_frac [fraction of img taken for central crops] \
    --N_samples [number of coarse samples per ray] \
    --N_importance [number of additional fine samples per ray] \
    --perturb [set to 0. for no jitter, 1. for jitter] \
    --use_viewdirs [use full 5D input instead of 3D] \
    --i_embed [set 0 for default positional encoding, -1 for none] \
    --multires [log2 of max freq for positional encoding (3D location)] \
    --multires_views [log2 of max freq for positional encoding (2D direction)] \
    --raw_noise_std [std dev of noise added to regularize sigma_a output, 1e0 recommended] \
    --render_only [do not optimize, reload weights and render out render_poses path] \
    --dataset_type [options: llff / blender / shapenet / multishapenet] \
    --testskip [will load 1/N images from test/val sets, useful for large datasets like deepvoxels] \
    --white_bkgd [set to render synthetic data on a white bkgd (always use for dvoxels)] \
    --half_res [load blender synthetic data at 400x400 instead of 800x800] \
    --no_ndc [do not use normalized device coordinates (set for non-forward facing scenes)] \
    --lindisp [sampling linearly in disparity rather than depth] \
    --spherify [set for spherical 360 scenes] \
    --llffhold [will take every 1/N images as LLFF test set, paper uses 8] \
    --i_print [frequency of console printout and metric logging] \
    --i_img [frequency of tensorboard image logging] \
    --i_weights [frequency of weight ckpt saving] \
    --i_testset [frequency of testset saving] \
    --i_video [frequency of render_poses video saving] \
    --attention_direction_multires [frequency of embedding for value] \
    --attention_view_multires [frequency of embedding for direction] \
    --training_recon [whether to render images from the test set or not during final evaluation] \
    --use_quaternion [append input pose as quaternion to input to unet] \
    --no_globl [do not use global vector in middle of unet] \
    --no_render_pose [do not append render pose to input to unet] \
    --use_attsets [use attsets, otherwise use slot attention]
In particular, to render and test from a trained model, set render_only to True in the config.

Configs

The current configs are for the blender, LLFF, and shapenet datasets and can be found in the configs directory.

After setting the parameters of the model, run it with:

python run.py --config configs/config_DATATYPE
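The config files set the same parameters as the command-line flags listed above. As a rough sketch (keys taken from that flag list; the values are purely illustrative and not the shipped settings), a blender config might contain:

expname = grf_blender_lego
basedir = ./logs
datadir = ./data/nerf_synthetic/lego
dataset_type = blender
N_rand = 1024
N_samples = 64
N_importance = 128
use_viewdirs = True
white_bkgd = True
half_res = True
render_only = False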

Practical Concerns

The models were tested on 32 GB GPUs, and higher-resolution images require very large amounts of memory. The shapenet experiments should run on 16 GB GPUs.
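If you run out of memory, one option suggested by the flag descriptions above is to shrink the ray and point batch sizes, for example (values illustrative only):

python run.py --config configs/config_DATATYPE --chunk 16384 --netchunk 32768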

Acknowledgements

The code is built upon the original NeRF implementation. Thanks to LucidRains for the PyTorch implementation of slot attention on which the current version is based.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{grf2020,
  title={GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering},
  author={Trevithick, Alex and Yang, Bo},
  booktitle={arXiv:2010.04595},
  year={2020}
}