an implementation of 3D Ken Burns Effect from a Single Image using PyTorch

Overview

3d-ken-burns

This is a reference implementation of 3D Ken Burns Effect from a Single Image [1] using PyTorch. Given a single input image, it animates this still image with a virtual camera scan and zoom subject to motion parallax. Should you be making use of our work, please cite our paper [1].

Paper

setup

Several functions are implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository. Please also make sure to have the CUDA_HOME environment variable configured.

In order to generate the video results, please also make sure to have pip install moviepy installed.

usage

To run it on an image and generate the 3D Ken Burns effect fully automatically, use the following command.

python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4

To start the interface that allows you to manually adjust the camera path, use the following command. You can then navigate to http://localhost:8080/ and load an image using the button on the bottom right corner. Please be patient when loading an image and saving the result, there is a bit of background processing going on.

python interface.py

To run the depth estimation to obtain the raw depth estimate, use the following command. Please note that this script does not perform the depth adjustment, see #22 for information on how to add it.

python depthestim.py --in ./images/doublestrike.jpg --out ./depthestim.npy

To benchmark the depth estimation, run python benchmark-ibims.py or python benchmark-nyu.py. You can use it to easily verify that the provided implementation runs as expected.

colab

If you do not have a suitable environment to run this projects then you could give Colab a try. It allows you to run the project in the cloud, free of charge. There are several people who provide Colab notebooks that should get you started. A few that I am aware of include one from Arnaldo Gabriel, one from Vlad Alex, and one from Ahmed Harmouche.

dataset

This dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0) and may only be used for non-commercial purposes. Please see the LICENSE file for more information.

scene mode color depth normal
asdf flying 3.7 GB 1.0 GB 2.9 GB
asdf walking 3.6 GB 0.9 GB 2.7 GB
blank flying 3.2 GB 1.0 GB 2.8 GB
blank walking 3.0 GB 0.9 GB 2.7 GB
chill flying 5.4 GB 1.1 GB 10.8 GB
chill walking 5.2 GB 1.0 GB 10.5 GB
city flying 0.8 GB 0.2 GB 0.9 GB
city walking 0.7 GB 0.2 GB 0.8 GB
environment flying 1.9 GB 0.5 GB 3.5 GB
environment walking 1.8 GB 0.5 GB 3.3 GB
fort flying 5.0 GB 1.1 GB 9.2 GB
fort walking 4.9 GB 1.1 GB 9.3 GB
grass flying 1.1 GB 0.2 GB 1.9 GB
grass walking 1.1 GB 0.2 GB 1.6 GB
ice flying 1.2 GB 0.2 GB 2.1 GB
ice walking 1.2 GB 0.2 GB 2.0 GB
knights flying 0.8 GB 0.2 GB 1.0 GB
knights walking 0.8 GB 0.2 GB 0.9 GB
outpost flying 4.8 GB 1.1 GB 7.9 GB
outpost walking 4.6 GB 1.0 GB 7.4 GB
pirates flying 0.8 GB 0.2 GB 0.8 GB
pirates walking 0.7 GB 0.2 GB 0.8 GB
shooter flying 0.9 GB 0.2 GB 1.1 GB
shooter walking 0.9 GB 0.2 GB 1.0 GB
shops flying 0.2 GB 0.1 GB 0.2 GB
shops walking 0.2 GB 0.1 GB 0.2 GB
slums flying 0.5 GB 0.1 GB 0.8 GB
slums walking 0.5 GB 0.1 GB 0.7 GB
subway flying 0.5 GB 0.1 GB 0.9 GB
subway walking 0.5 GB 0.1 GB 0.9 GB
temple flying 1.7 GB 0.4 GB 3.1 GB
temple walking 1.7 GB 0.3 GB 2.8 GB
titan flying 6.2 GB 1.1 GB 11.5 GB
titan walking 6.0 GB 1.1 GB 11.3 GB
town flying 1.7 GB 0.3 GB 3.0 GB
town walking 1.8 GB 0.3 GB 3.0 GB
underland flying 5.4 GB 1.2 GB 12.1 GB
underland walking 5.1 GB 1.2 GB 11.4 GB
victorian flying 0.5 GB 0.1 GB 0.8 GB
victorian walking 0.4 GB 0.1 GB 0.7 GB
village flying 1.6 GB 0.3 GB 2.8 GB
village walking 1.6 GB 0.3 GB 2.7 GB
warehouse flying 0.9 GB 0.2 GB 1.5 GB
warehouse walking 0.8 GB 0.2 GB 1.4 GB
western flying 0.8 GB 0.2 GB 0.9 GB
western walking 0.7 GB 0.2 GB 0.8 GB

Please note that this is an updated version of the dataset that we have used in our paper. So while it has fewer scenes in total, each sample capture now has a varying focal length which should help with generalizability. Furthermore, some examples are either over- or under-exposed and it would be a good idea to remove these outliers. Please see #37, #39, and #40 for supplementary discussions.

video

Video

license

This is a project by Adobe Research. It is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0) and may only be used for non-commercial purposes. Please see the LICENSE file for more information.

references

[1]  @article{Niklaus_TOG_2019,
         author = {Simon Niklaus and Long Mai and Jimei Yang and Feng Liu},
         title = {3D Ken Burns Effect from a Single Image},
         journal = {ACM Transactions on Graphics},
         volume = {38},
         number = {6},
         pages = {184:1--184:15},
         year = {2019}
     }

acknowledgment

The video above uses materials under a Creative Common license or with the owner's permission, as detailed at the end.

Owner
Simon Niklaus
Research Scientist at Adobe
Simon Niklaus
Memory-Augmented Model Predictive Control

Memory-Augmented Model Predictive Control This repository hosts the source code for the journal article "Composing MPC with LQR and Neural Networks fo

Fangyu Wu 1 Jun 19, 2022
Bringing Computer Vision and Flutter together , to build an awesome app !!

Bringing Computer Vision and Flutter together , to build an awesome app !! Explore the Directories Flutter · Machine Learning Table of Contents About

Padmanabha Banerjee 14 Apr 07, 2022
Official repository for the paper "Instance-Conditioned GAN"

Official repository for the paper "Instance-Conditioned GAN" by Arantxa Casanova, Marlene Careil, Jakob Verbeek, Michał Drożdżal, Adriana Romero-Soriano.

Facebook Research 510 Dec 30, 2022
2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

TFill arXiv | Project This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity I

Chuanxia Zheng 111 Jan 08, 2023
'Solving the sampling problem of the Sycamore quantum supremacy circuits

solve_sycamore This repo contains data, contraction code, and contraction order for the paper ''Solving the sampling problem of the Sycamore quantum s

Feng Pan 29 Nov 28, 2022
Neural models of common sense. 🤖

Unicorn on Rainbow Neural models of common sense. This repository is for the paper: Unicorn on Rainbow: A Universal Commonsense Reasoning Model on a N

AI2 60 Jan 05, 2023
SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

SEOVER-Master This code is the implementation of paper: SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

4 Feb 24, 2022
This repository is based on Ultralytics/yolov5, with adjustments to enable rotate prediction boxes.

Rotate-Yolov5 This repository is based on Ultralytics/yolov5, with adjustments to enable rotate prediction boxes. Section I. Description The codes are

xinzelee 90 Dec 13, 2022
StellarGraph - Machine Learning on Graphs

StellarGraph Machine Learning Library StellarGraph is a Python library for machine learning on graphs and networks. Table of Contents Introduction Get

S T E L L A R 2.6k Jan 05, 2023
MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b

Living with Machines 25 Dec 26, 2022
Weakly Supervised Segmentation with Tensorflow. Implements instance segmentation as described in Simple Does It: Weakly Supervised Instance and Semantic Segmentation, by Khoreva et al. (CVPR 2017).

Weakly Supervised Segmentation with TensorFlow This repo contains a TensorFlow implementation of weakly supervised instance segmentation as described

Phil Ferriere 220 Dec 13, 2022
Robust & Reliable Route Recommendation on Road Networks

NeuroMLR: Robust & Reliable Route Recommendation on Road Networks This repository is the official implementation of NeuroMLR: Robust & Reliable Route

4 Dec 20, 2022
Image Fusion Transformer

Image-Fusion-Transformer Platform Python 3.7 Pytorch =1.0 Training Dataset MS-COCO 2014 (T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ram

Vibashan VS 68 Dec 23, 2022
N-RPG - Novel role playing game da turfu

N-RPG Ce README sera la page de garde du projet. Contenu Il contiendra la présen

4 Mar 15, 2022
Implementation for our AAAI2021 paper (Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).

SSAN Introduction This is the pytorch implementation of the SSAN model (see our AAAI2021 paper: Entity Structure Within and Throughout: Modeling Menti

benfeng 69 Nov 15, 2022
HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep.

HODEmu HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep. and emulates satellite abundance as a function of co

Antonio Ragagnin 1 Oct 13, 2021
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer

HV-plane reconstruction from a single 360 image Code for our paper in CVPR 2021: Indoor Panorama Planar 3D Reconstruction via Divide and Conquer (pape

sunset 36 Jan 03, 2023
AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022
Rest API Written In Python To Classify NSFW Images.

Rest API Written In Python To Classify NSFW Images.

Wahyusaputra 2 Dec 23, 2021
DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time Introduction This is official implementation for DR-GAN (IEEE TCS

Kang Liao 18 Dec 23, 2022