Tensorflow 2 implementation of our high quality frame interpolation neural network

Overview

FILM: Frame Interpolation for Large Scene Motion

Project | Paper | YouTube | Benchmark Scores

Tensorflow 2 implementation of our high quality frame interpolation neural network. We present a unified single-network approach that doesn't use additional pre-trained networks, like optical flow or depth, and yet achieve state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Google Research
Technical Report 2022.

A sample 2 seconds moment. FILM transforms near-duplicate photos into a slow motion footage that look like it is shot with a video camera.

Installation

  • Get Frame Interpolation source codes
> git clone https://github.com/google-research/frame-interpolation frame_interpolation
  • Optionally, pull the recommended Docker base image
> docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
  • Install dependencies
> pip install -r frame_interpolation/requirements.txt
> apt-get install ffmpeg

Pre-trained Models

  • Create a directory where you can keep large files. Ideally, not in this directory.
> mkdir 
   

   
  • Download pre-trained TF2 Saved Models from google drive and put into .

The downloaded folder should have the following structure:

pretrained_models/
├── film_net/
│   ├── L1/
│   ├── VGG/
│   ├── Style/
├── vgg/
│   ├── imagenet-vgg-verydeep-19.mat

Running the Codes

The following instructions run the interpolator on the photos provided in frame_interpolation/photos.

One mid-frame interpolation

To generate an intermediate photo from the input near-duplicate photos, simply run:

> python3 -m frame_interpolation.eval.interpolator_test \
     --frame1 frame_interpolation/photos/one.png \
     --frame2 frame_interpolation/photos/two.png \
     --model_path 
   
    /film_net/Style/saved_model \
     --output_frame frame_interpolation/photos/middle.png \

   

This will produce the sub-frame at t=0.5 and save as 'frame_interpolation/photos/middle.png'.

Many in-between frames interpolation

Takes in a set of directories identified by a glob (--pattern). Each directory is expected to contain at least two input frames, with each contiguous frame pair treated as an input to generate in-between frames.

/film_net/Style/saved_model \ --times_to_interpolate 6 \ --output_video">
> python3 -m frame_interpolation.eval.interpolator_cli \
     --pattern "frame_interpolation/photos" \
     --model_path 
   
    /film_net/Style/saved_model \
     --times_to_interpolate 6 \
     --output_video

   

You will find the interpolated frames (including the input frames) in 'frame_interpolation/photos/interpolated_frames/', and the interpolated video at 'frame_interpolation/photos/interpolated.mp4'.

The number of frames is determined by --times_to_interpolate, which controls the number of times the frame interpolator is invoked. When the number of frames in a directory is 2, the number of output frames will be 2^times_to_interpolate+1.

Datasets

We use Vimeo-90K as our main training dataset. For quantitative evaluations, we rely on commonly used benchmark datasets, specifically:

Creating a TFRecord

The training and benchmark evaluation scripts expect the frame triplets in the TFRecord storage format.

We have included scripts that encode the relevant frame triplets into a tf.train.Example data format, and export to a TFRecord file.

You can use the commands python3 -m frame_interpolation.datasets.create_ _tfrecord --help for more information.

For example, run the command below to create a TFRecord for the Middlebury-other dataset. Download the images and point --input_dir to the unzipped folder path.

> python3 -m frame_interpolation.datasets.create_middlebury_tfrecord \
    --input_dir=
   
     \
    --output_tfrecord_filepath=
    

   

Training

Below are our training gin configuration files for the different loss function:

frame_interpolation/training/
├── config/
│   ├── film_net-L1.gin
│   ├── film_net-VGG.gin
│   ├── film_net-Style.gin

To launch a training, simply pass the configuration filepath to the desired experiment.
By default, it uses all visible GPUs for training. To debug or train on a CPU, append --mode cpu.

> python3 -m frame_interpolation.training.train \
     --gin_config frame_interpolation/training/config/
   
    .gin \
     --base_folder 
     \
     --label 
    

    
   
  • When training finishes, the folder structure will look like this:

   
    /
├── 
    
   

Build a SavedModel

Optionally, to build a SavedModel format from a trained checkpoints folder, you can use this command:

> python3 -m frame_interpolation.training.build_saved_model_cli \
     --base_folder  \
     --label 
   

   
  • By default, a SavedModel is created when the training loop ends, and it will be saved at / .

Evaluation on Benchmarks

Below, we provided the evaluation gin configuration files for the benchmarks we have considered:

frame_interpolation/eval/
├── config/
│   ├── middlebury.gin
│   ├── ucf101.gin
│   ├── vimeo_90K.gin
│   ├── xiph_2K.gin
│   ├── xiph_4K.gin

To run an evaluation, simply pass the configuration file of the desired evaluation dataset.
If a GPU is visible, it runs on it.

> python3 -m frame_interpolation.eval.eval_cli -- \
     --gin_config frame_interpolation/eval/config/
   
    .gin \
     --model_path 
    
     /film_net/L1/saved_model

    
   

The above command will produce the PSNR and SSIM scores presented in the paper.

Citation

If you find this implementation useful in your works, please acknowledge it appropriately by citing:

@inproceedings{reda2022film,
 title = {Frame Interpolation for Large Motion},
 author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
 booktitle = {arXiv},
 year = {2022}
}
@misc{film-tf,
  title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Scene Motion"},
  author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}

Contact: Fitsum Reda ([email protected])

Acknowledgments

We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also like to thank Jamie Aspinall for the imagery included in this repository.

Coding style

  • 2 spaces for indentation
  • 80 character line length
  • PEP8 formatting

Disclaimer

This is not an officially supported Google product.

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

Xi Yang 92 Jan 04, 2023
Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective

Does-MAML-Only-Work-via-Feature-Re-use-A-Data-Set-Centric-Perspective Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective Installin

2 Nov 07, 2022
NumPy로 구현한 딥러닝 라이브러리입니다. (자동 미분 지원)

Deep Learning Library only using NumPy 본 레포지토리는 NumPy 만으로 구현한 딥러닝 라이브러리입니다. 자동 미분이 구현되어 있습니다. 자동 미분 자동 미분은 미분을 자동으로 계산해주는 기능입니다. 아래 코드는 자동 미분을 활용해 역전파

조준희 17 Aug 16, 2022
Lightweight mmm - Lightweight (Bayesian) Media Mix Model

Lightweight (Bayesian) Media Mix Model This is not an official Google product. L

Google 342 Jan 03, 2023
ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
Let Python optimize the best stop loss and take profits for your TradingView strategy.

TradingView Machine Learning TradeView is a free and open source Trading View bot written in Python. It is designed to support all major exchanges. It

Robert Roman 473 Jan 09, 2023
YOLOX-RMPOLY

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。 基于旷视科技YOLOX,实现对不规则四边形的目标检测 TODO 修改onnx推理模型 更改/添加标注: 1.yolox/models/yolox_polyhead.py: 1.1继承yolox/models/yolo_

3 Feb 25, 2022
Capstone-Project-2 - A game program written in the Python language

Capstone-Project-2 My Pygame Game Information: Description This Pygame project i

Nhlakanipho Khulekani Hlophe 1 Jan 04, 2022
Atomistic Line Graph Neural Network

Table of Contents Introduction Installation Examples Pre-trained models Quick start using colab JARVIS-ALIGNN webapp Peformances on a few datasets Use

National Institute of Standards and Technology 91 Dec 30, 2022
The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks This folder contains the code to reproduce the data in "The Implicit Bias o

Samuel Lippl 0 Feb 05, 2022
PyTorch and Tensorflow functional model definitions

functional-zoo Model definitions and pretrained weights for PyTorch and Tensorflow PyTorch, unlike lua torch, has autograd in it's core, so using modu

Sergey Zagoruyko 590 Dec 22, 2022
Code for the paper Language as a Cognitive Tool to Imagine Goals in Curiosity Driven Exploration

IMAGINE: Language as a Cognitive Tool to Imagine Goals in Curiosity Driven Exploration This repo contains the code base of the paper Language as a Cog

Flowers Team 26 Dec 22, 2022
LBBA-boosted WSOD

LBBA-boosted WSOD Summary Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN Sincerely thanks for your resources. Newer version of our code

Martin Dong 20 Sep 19, 2022
Learning Skeletal Articulations with Neural Blend Shapes

This repository provides an end-to-end library for automatic character rigging and blend shapes generation as well as a visualization tool. It is based on our work Learning Skeletal Articulations wit

Peizhuo 504 Dec 30, 2022
Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021)

Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021) Paper Video Instance Segmentation using Inter-Frame Communicat

Sukjun Hwang 81 Dec 29, 2022
Implementation of the Swin Transformer in PyTorch.

Swin Transformer - PyTorch Implementation of the Swin Transformer architecture. This paper presents a new vision Transformer, called Swin Transformer,

597 Jan 03, 2023
The fastai book, published as Jupyter Notebooks

English / Spanish / Korean / Chinese / Bengali / Indonesian The fastai book These notebooks cover an introduction to deep learning, fastai, and PyTorc

fast.ai 17k Jan 07, 2023
Transformer model implemented with Pytorch

transformer-pytorch Transformer model implemented with Pytorch Attention is all you need-[Paper] Architecture Self-Attention self_attention.py class

Mingu Kang 12 Sep 03, 2022
ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

63 Nov 18, 2022
Compares various time-series feature sets on computational performance, within-set structure, and between-set relationships.

feature-set-comp Compares various time-series feature sets on computational performance, within-set structure, and between-set relationships. Reposito

Trent Henderson 7 May 25, 2022