This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis"

Overview

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis

Random Sample

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis
Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt

Project Page | Video | Demo | Paper | Data

Abstract: We propose StyleNeRF, a 3D-aware generative model for photo-realistic high-resolution image synthesis with high multi-view consistency, which can be trained on unstructured 2D images. Existing approaches either cannot synthesize high-resolution images with fine details or yield noticeable 3D-inconsistent artifacts. In addition, many of them lack control over style attributes and explicit 3D camera poses. StyleNeRF integrates the neural radiance field (NeRF) into a style-based generator to tackle the aforementioned challenges, i.e., improving rendering efficiency and 3D consistency for high-resolution image generation. We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue. To mitigate the inconsistencies caused by 2D upsampling, we propose multiple designs, including a better upsampler and a new regularization loss. With these designs, StyleNeRF can synthesize high-resolution images at interactive rates while preserving 3D consistency at high quality. StyleNeRF also enables control of camera poses and different levels of styles, which can generalize to unseen views. It also supports challenging tasks, including zoom-in and-out, style mixing, inversion, and semantic editing.

Requirements

The codebase is tested on

  • Python 3.7
  • PyTorch 1.7.1
  • 8 Nvidia GPU (Tesla V100 32GB) with CUDA version 11.0

For additional python libraries, please install by:

pip install -r requirements.txt

Please refer to https://github.com/NVlabs/stylegan2-ada-pytorch for additional software/hardware requirements.

Dataset

We follow the same dataset format as StyleGAN2-ADA supported, which can be either an image folder, or a zipped file.

Pretrained Checkpoints

You can download the pre-trained checkpoints (used in our paper) and some recent variants trained with current codebase as follows:

Dataset Resolution #Params(M) Config Download
FFHQ 256 128 Default Hugging Face 🤗
FFHQ 512 148 Default Hugging Face 🤗
FFHQ 1024 184 Default Hugging Face 🤗

(I am slowly adding more checkpoints. Thanks for your very kind patience!)

Train a new StyleNeRF model

python run_train.py outdir=${OUTDIR} data=${DATASET} spec=paper512 model=stylenerf_ffhq

It will automatically detect all usable GPUs.

Please check configuration files at conf/model and conf/spec. You can always add your own model config. More details on how to use hydra configuration please follow https://hydra.cc/docs/intro/.

Render the pretrained model

python generate.py --outdir=${OUTDIR} --trunc=0.7 --seeds=${SEEDS} --network=${CHECKPOINT_PATH} --render-program="rotation_camera"

It supports different rotation trajectories for rendering new videos.

Run a demo page

python web_demo.py 21111

It will in default run a Gradio-powered demo on https://localhost:21111

[NEW] The demo is also integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo: Hugging Face Spaces

Web demo

Run a GUI visualizer

python visualizer.py

An interative application will show up for users to play with. GUI demo

Citation

@inproceedings{
    gu2022stylenerf,
    title={StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis},
    author={Jiatao Gu and Lingjie Liu and Peng Wang and Christian Theobalt},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=iUuzzTMUw9K}
}

License

Copyright © Facebook, Inc. All Rights Reserved.

The majority of StyleNeRF is licensed under CC-BY-NC, however, portions of this project are available under a separate license terms: all codes used or modified from stylegan2-ada-pytorch is under the Nvidia Source Code License.

Owner
Meta Research
Meta Research
Detect and fix skew in images containing text

Alyn Skew detection and correction in images containing text Image with skew Image after deskew Install and use via pip! Recommended way(using virtual

Kakul 230 Dec 21, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 03, 2023
TextField: Learning A Deep Direction Field for Irregular Scene Text Detection (TIP 2019)

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection Introduction The code and trained models of: TextField: Learning A Deep

Yukang Wang 101 Dec 12, 2022
This tool will help you convert your text to handwriting xD

So your teacher asked you to upload written assignments? Hate writing assigments? This tool will help you convert your text to handwriting xD

Saurabh Daware 4.2k Jan 07, 2023
Extract tables from scanned image PDFs using Optical Character Recognition.

ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt

Abhijeet Singh 209 Dec 06, 2022
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrieval.

Dual Encoding for Video Retrieval by Text Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding

81 Dec 01, 2022
An OCR evaluation tool

dinglehopper dinglehopper is an OCR evaluation tool and reads ALTO, PAGE and text files. It compares a ground truth (GT) document page with a OCR resu

QURATOR-SPK 40 Dec 20, 2022
EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

DeepCam Shenzhen 141 Dec 31, 2022
Converts an image into funny, smaller amongus characters

SussyImage Converts an image into funny, smaller amongus characters Demo Mona Lisa | Lona Misa (Made up of AmongUs characters) API I've also added an

Dhravya Shah 14 Aug 18, 2022
Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications

Security camera running OpenCV for object and motion detection. The camera will send email with image of any objects it detects. It also runs a server that provides web interface with live stream vid

Peace 10 Jun 30, 2021
How to detect objects in real time by using Jupyter Notebook and Neural Networks , by using Yolo3

Real Time Object Recognition From your Screen Desktop . In this post, I will explain how to build a simply program to detect objects from you desktop

Ruslan Magana Vsevolodovna 2 Sep 28, 2022
A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

tooraj taraz 3 Feb 10, 2022
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

This is the official implementation of "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation". For more details, please

Pengyuan Lyu 309 Dec 06, 2022
An organized collection of tutorials and projects created for aspriring computer vision students.

A repository created with the purpose of teaching students in BME lab 308A- Hanoi University of Science and Technology

Givralnguyen 5 Nov 24, 2021
Automatically remove the mosaics in images and videos, or add mosaics to them.

Automatically remove the mosaics in images and videos, or add mosaics to them.

Hypo 1.4k Dec 30, 2022
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf # it's a scriptable c

jbarlow83 7.9k Jan 03, 2023
Geometric Augmentation for Text Image

Text Image Augmentation A general geometric augmentation tool for text images in the CVPR 2020 paper "Learn to Augment: Joint Data Augmentation and Ne

Canjie Luo 440 Jan 05, 2023
基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化

SimpleRPA 基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化 简介 SimpleRPA是一款python语言编写的开源RPA工具(桌面自动控制工具),用户可以通过配置yaml格式的文件,来实现桌面软件的自动化控制,简化繁杂重复的工作,比如运营人员给用户发消息,

Song Hui 7 Jun 26, 2022