[ICCV 2021 Oral] Our method jointly estimates camera poses and neural radiance fields when the cameras are initialized at random poses, even in complex scenarios (outside-in scenes, including low-texture or heavily noisy ones).

Overview


This repository contains the official code for the ICCV 2021 paper GNeRF: GAN-based Neural Radiance Field without Posed Camera. The implementation is written in PyTorch.

(Figure: GNeRF architecture overview)

Abstract

We introduce GNeRF, a framework that marries Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for complex scenarios with unknown and even randomly initialized camera poses. Recent NeRF-based advances have gained popularity for remarkably realistic novel view synthesis. However, most of them rely heavily on accurate camera pose estimation, and the few recent methods that handle unknown poses can only optimize them in roughly forward-facing scenes with relatively short camera trajectories, and still require a rough pose initialization. In contrast, GNeRF uses only randomly initialized poses for complex outside-in scenarios. We propose a novel two-phase end-to-end framework. The first phase brings GANs into this new realm to jointly optimize coarse camera poses and the radiance field, while the second phase refines both with an additional photometric loss. We overcome local minima using a hybrid and iterative optimization scheme. Extensive experiments on a variety of synthetic and natural scenes demonstrate the effectiveness of GNeRF. More impressively, our approach outperforms the baselines in scenes with repeated patterns or even low texture, which were previously regarded as extremely challenging.

Installation

We recommend using Anaconda to set up the environment. Run the following commands:

# Create a conda environment named 'gnerf'
conda create --name gnerf python=3.7
# Activate the environment
conda activate gnerf
# Install requirements
pip install -r requirements.txt

Data

Blender

Download the data from the official NeRF Google Drive: download and unzip nerf_synthetic.zip.

DTU

Download the preprocessed DTU training data from the original MVSNet repo and unzip it. We also provide a few DTU examples for fast testing.

Your own data

We share some advice on preparing your own dataset and setting the related parameters:

  • The pose sampling space should be close to the data: Our method requires a reasonable prior pose distribution (see the sketch after this list).
  • Training may fail to converge on symmetric scenes: The inversion network cannot map one image to multiple distinct poses.
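
Below is a minimal sketch (plain NumPy, not code from this repo; all names are illustrative) of the kind of outside-in pose prior described above: camera positions sampled uniformly over an upper hemisphere, always looking at the scene origin. The radius and elevation range should roughly match how your images were actually captured.

import numpy as np

def sample_hemisphere_pose(radius=4.0, min_elev_deg=0.0, max_elev_deg=80.0):
    """Sample a camera-to-world matrix from a simple outside-in pose prior."""
    azim = np.random.uniform(0.0, 2.0 * np.pi)
    elev = np.deg2rad(np.random.uniform(min_elev_deg, max_elev_deg))
    # Camera position on an upper hemisphere (z is up here; conventions vary)
    cam_pos = radius * np.array([np.cos(elev) * np.cos(azim),
                                 np.cos(elev) * np.sin(azim),
                                 np.sin(elev)])
    # Look-at rotation: the camera's forward axis points towards the origin
    forward = -cam_pos / np.linalg.norm(cam_pos)
    right = np.cross(np.array([0.0, 0.0, 1.0]), forward)
    right /= np.linalg.norm(right)
    up = np.cross(forward, right)
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, forward, cam_pos
    return c2w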

Running

python train.py ./config/CONFIG.yaml --data_dir PATH/TO/DATASET

where you replace CONFIG.yaml with your config file (blender.yaml for the Blender dataset, dtu.yaml for the DTU dataset). You can optionally monitor the training process with TensorBoard by adding the --open_tensorboard argument. The default setting takes around 13 GB of GPU memory. After 40k iterations, you should obtain a rendered video of novel views.
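
For example, a concrete invocation on the Blender lego scene might look like the following (the data path is illustrative and depends on where you unzipped nerf_synthetic):

# Train on the Blender 'lego' scene and monitor with TensorBoard (path illustrative)
python train.py ./config/blender.yaml --data_dir ./data/nerf_synthetic/lego --open_tensorboard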

Evaluation

python eval.py --ckpt PATH/TO/CKPT.pt --gt PATH/TO/GT.json 

where you replace PATH/TO/CKPT.pt with your trained model checkpoint and PATH/TO/GT.json with the JSON file from the NeRF-Synthetic dataset that contains the ground-truth poses. Then run the ATE toolbox on the evaluation directory.
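
For reference, the translation part of the pose metric can be summarized as follows. This is only a minimal sketch of what the metric measures (estimated camera centers aligned to the ground truth with a similarity transform, then RMSE of the residuals), not the ATE toolbox itself, and the function name is illustrative.

import numpy as np

def absolute_trajectory_error(est_centers, gt_centers):
    """est_centers, gt_centers: (N, 3) arrays of corresponding camera positions."""
    mu_e, mu_g = est_centers.mean(0), gt_centers.mean(0)
    e, g = est_centers - mu_e, gt_centers - mu_g
    # Umeyama closed-form similarity alignment (rotation, scale, translation)
    U, D, Vt = np.linalg.svd(g.T @ e / len(e))
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    scale = np.trace(np.diag(D) @ S) / e.var(0).sum()
    t = mu_g - scale * (R @ mu_e)
    aligned = (scale * (R @ est_centers.T)).T + t
    # RMSE of the aligned camera centers
    return np.sqrt(np.mean(np.sum((aligned - gt_centers) ** 2, axis=1)))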

List of Possible Improvements

For future work, we recommend the following directions to further improve performance and stability:

  • Replace the single NeRF network with a mip-NeRF network: The use of separate coarse and fine MLPs in the original NeRF is a key detail for representing thin structures in the scene; if you retrain the original NeRF with only one MLP, you will see a drop in performance. In our work, however, a single MLP is necessary to keep the coarse image and the fine image aligned. The cone casting and integrated positional encoding (IPE) of mip-NeRF explicitly encode scale into the input features, which lets a single MLP learn a multiscale representation of the scene.

  • Combine with BARF to further overcome local minima: BARF shows that the susceptibility of positional encoding to noise affects the basin of attraction for registration, and presents a coarse-to-fine registration strategy (see the sketch after this list).

  • Combine with NeRF++ to represent the background in real scenes with complex backgrounds.
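
As a pointer for the BARF-style coarse-to-fine idea mentioned above, here is a minimal NumPy sketch (not part of this repo; names are illustrative) of the frequency weighting it applies to the positional encoding: low frequencies are used from the start, and higher frequencies are smoothly ramped in as training progresses, which widens the basin of attraction for pose registration.

import numpy as np

def coarse_to_fine_encoding(x, num_freqs, alpha):
    """x: (..., D) coordinates; alpha grows from 0 to num_freqs during training."""
    feats = []
    for k in range(num_freqs):
        # Weight for frequency band k: 0 before it is activated, a cosine ramp
        # while alpha passes k, and 1 afterwards.
        w = np.clip(alpha - k, 0.0, 1.0)
        w = (1.0 - np.cos(w * np.pi)) / 2.0
        feats.append(w * np.sin((2.0 ** k) * np.pi * x))
        feats.append(w * np.cos((2.0 ** k) * np.pi * x))
    return np.concatenate(feats, axis=-1)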

Citation

If you find our code or paper useful, please consider citing:

@InProceedings{meng2021gnerf,
    author = {Meng, Quan and Chen, Anpei and Luo, Haimin and Wu, Minye and Su, Hao and Xu, Lan and He, Xuming and Yu, Jingyi},
    title = {{G}{N}e{R}{F}: {G}{A}{N}-based {N}eural {R}adiance {F}ield without {P}osed {C}amera},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year = {2021}
}

Some code snippets are borrowed from GRAF and nerf_pl. Thanks to these great projects.
