RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Overview

RIFE - Real Time Video Interpolation

arXiv | YouTube | Colab | Tutorial | Demo

Table of Contents

  1. Introduction
  2. Software
  3. CLI Usage
  4. Evaluation
  5. Training and Reproduction
  6. Citation
  7. Reference
  8. Sponsor

Introduction

This project is the implementation of RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation. If you are a developer, welcome to follow Practical-RIFE, which aims to make RIFE more practical for users by adding various features and designing new models.

Currently, our model can run at 30+ FPS for 2X 720p interpolation on a 2080Ti GPU. It supports 2X, 4X, 8X, ... interpolation, as well as multi-frame interpolation between a pair of images.
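The 2X/4X/8X options come from recursive midpoint estimation: each pass synthesizes the frame at t=0.5 between a pair of frames, doubling the frame count. A minimal sketch of the recursion (assuming a model.inference(I0, I1) call that returns the midpoint frame, like the one inference_video.py uses):

def make_inference(model, I0, I1, exp):
    # returns the 2**exp - 1 intermediate frames between I0 and I1, in order
    if exp == 0:
        return []
    middle = model.inference(I0, I1)  # frame at t = 0.5
    left = make_inference(model, I0, middle, exp - 1)
    right = make_inference(model, middle, I1, exp - 1)
    return left + [middle] + right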

16X interpolation results from two input images:


Software

Squirrel-RIFE (Chinese software) | Waifu2x-Extension-GUI | Flowframes | RIFE-ncnn-vulkan | RIFE-App (paid) | Autodesk Flame | SVP

CLI Usage

Installation

git clone git@github.com:hzwer/arXiv2020-RIFE.git
cd arXiv2020-RIFE
pip3 install -r requirements.txt

Download the pretrained models and unpack them into train_log/; the Docker and Evaluation sections below assume the *.pkl files are already there.

Run

Video Frame Interpolation

You can use our demo video or your own video.

python3 inference_video.py --exp=1 --video=video.mp4 

(generate video_2X_xxfps.mp4)

python3 inference_video.py --exp=2 --video=video.mp4

(for 4X interpolation)

python3 inference_video.py --exp=1 --video=video.mp4 --scale=0.5

(If your video has a very high resolution, such as 4K, we recommend setting --scale=0.5 (default: 1.0). If you get disordered patterns in the output, try --scale=2.0. This parameter controls the processing resolution of the optical flow model.)
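A rough sketch of the idea behind --scale (an assumption about the mechanism, not the repo's exact code): estimate the flow on a resized copy of the frames, then rescale the flow field back, trading flow detail for robustness at high input resolutions.

import torch.nn.functional as F

def estimate_flow_scaled(flownet, I0, I1, scale=1.0):
    # flownet: hypothetical callable returning an (N, 2, H, W) flow tensor
    if scale == 1.0:
        return flownet(I0, I1)
    I0s = F.interpolate(I0, scale_factor=scale, mode="bilinear", align_corners=False)
    I1s = F.interpolate(I1, scale_factor=scale, mode="bilinear", align_corners=False)
    flow = flownet(I0s, I1s)
    # resize back to the input resolution; flow vectors scale with the image
    return F.interpolate(flow, scale_factor=1.0 / scale, mode="bilinear",
                         align_corners=False) / scale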

python3 inference_video.py --exp=2 --img=input/

(to read the video from PNGs, e.g. input/0.png ... input/612.png; make sure the PNG filenames are numbers)
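Note that sorting the filenames as plain strings would misorder them (10.png before 2.png); a small helper one might use to collect the frames in the intended order (hypothetical, not part of the repo):

import os

def list_frames(folder):
    names = [n for n in os.listdir(folder) if n.endswith(".png")]
    names.sort(key=lambda n: int(os.path.splitext(n)[0]))  # numeric, not lexicographic
    return [os.path.join(folder, n) for n in names]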

python3 inference_video.py --exp=2 --video=video.mp4 --fps=60

(adds a slow-motion effect; the audio will be removed)

python3 inference_video.py --video=video.mp4 --montage --png

(use this if you want to montage the original video alongside the output, skip static frames, and save the output in PNG format)

The warning 'Warning: Your video has *** static frames, it may change the duration of the generated video.' means that the source video's frame rate was changed by duplicating frames; this is common if a 25 FPS video has been converted to 30 FPS.
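Presumably the detector behind this warning compares consecutive frames and flags near-duplicates; a minimal sketch of such a check (assumed metric and threshold, the script's exact criterion may differ):

import numpy as np

def is_static(frame_a, frame_b, threshold=1e-3):
    # frame_a, frame_b: float arrays scaled to [0, 1]
    return np.abs(frame_a - frame_b).mean() < threshold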

Image Interpolation

python3 inference_img.py --img img0.png img1.png --exp=4

(2^4 = 16X interpolation) After that, you can use the PNGs to generate an mp4:

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -c:v libx264 -pix_fmt yuv420p output/slomo.mp4 -q:v 0 -q:a 0

You can also use the PNGs to generate a gif:

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -vf "split[s0][s1];[s0]palettegen=stats_mode=single[p];[s1][p]paletteuse=new=1" output/slomo.gif

Run in docker

Place the pre-trained models in train_log/*.pkl (as above)

Building the container:

docker build -t rife -f docker/Dockerfile .

Running the container:

docker run --rm -it -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4
docker run --rm -it -v $PWD:/host rife:latest inference_img --img img0.png img1.png --exp=4

Using GPU acceleration (requires proper GPU drivers for Docker):

docker run --rm -it --gpus all -v /dev/dri:/dev/dri -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4

Evaluation

Download the RIFE model reported by our paper.

UCF101: Download the UCF101 dataset to ./UCF101/ucf101_interp_ours/

Vimeo90K: Download the Vimeo90K dataset to ./vimeo_interp_test

MiddleBury: Download the MiddleBury OTHER dataset to ./other-data and ./other-gt-interp

HD: Download the HD dataset to ./HD_dataset. We also provide a Google Drive download link.

# RIFE
python3 benchmark/UCF101.py
# "PSNR: 35.282 SSIM: 0.9688"
python3 benchmark/Vimeo90K.py
# "PSNR: 35.615 SSIM: 0.9779"
python3 benchmark/MiddleBury_Other.py
# "IE: 1.956"
python3 benchmark/HD.py
# "PSNR: 32.14"
python3 benchmark/HD_multi.py
# "PSNR: 18.60(544*1280), 29.02(720p), 24.73(1080p)"

Training and Reproduction

Download Vimeo90K dataset.

We use 16 CPUs, 4 GPUs, and 20 GB of memory for training:

python3 -m torch.distributed.launch --nproc_per_node=4 train.py --world_size=4
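On recent PyTorch versions, torch.distributed.launch is deprecated in favor of torchrun; the equivalent launch would presumably be:

torchrun --nproc_per_node=4 train.py --world_size=4

Note this is untested here: torchrun passes the local rank via the LOCAL_RANK environment variable instead of a --local_rank argument, so train.py would need a small adjustment (see the deprecation warning quoted in the comments below).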

Citation

@article{huang2020rife,
  title={RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation},
  author={Huang, Zhewei and Zhang, Tianyuan and Heng, Wen and Shi, Boxin and Zhou, Shuchang},
  journal={arXiv preprint arXiv:2011.06294},
  year={2020}
}

Reference

Optical Flow: ARFlow pytorch-liteflownet RAFT pytorch-PWCNet

Video Interpolation: DVF TOflow SepConv DAIN CAIN MEMC-Net SoftSplat BMBC EDSC EQVI

Sponsor

Thanks for your support! PayPal sponsor: https://www.paypal.com/paypalme/hzwer


Comments
  • Welcome to try v3.8 model


    Based on evaluation across dozens of videos, the v3.8 model achieves more than a 2X speedup while surpassing the quality of the RIFE v2.4 model, and it handles 2D scenes better. We also welcome you to submit bad cases to help guide future model improvements.

    v3.8 model: https://github.com/hzwer/Practical-RIFE#model-list

    opened by hzwer 23
  • 24 to 60 fps?


    Hi there,

    RIFE looks fantastic, but as far as I know I can only enter integer numbers as the scale factor, correct? So when I want to interpolate 24 fps to 60 (by far the most common case, I suppose), I know no other way than interpolating to 120 (factor 5) and then dropping every other frame to get 60.

    But even that doesn't seem to be possible, as the supported scale factors are only 2x, 4x, 8x (no 5x option).

    So, is RIFE able to make 24 fps movie content run smoothly on 60 Hz displays?
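    For what it's worth, the --fps flag shown in the CLI Usage section above targets an arbitrary output frame rate, which would sidestep the integer-factor limitation (assuming it behaves as documented there): python3 inference_video.py --exp=2 --video=video.mp4 --fps=60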

    opened by spyro2000 22
  • can't train because torch incompatible with python version


    /home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
    and will be removed in future. Use torch.distributed.run.
    Note that --use_env is set by default in torch.distributed.run.
    If your script expects `--local_rank` argument to be set, please
    change it to read from `os.environ['LOCAL_RANK']` instead. See 
    https://pytorch.org/docs/stable/distributed.html#launch-utility for 
    further instructions
    
      warnings.warn(
    WARNING:torch.distributed.run:*****************************************
    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
    *****************************************
    Traceback (most recent call last):
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch._C._cuda_setDevice(device)
        RuntimeErrortorch._C._cuda_setDevice(device): 
    CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
        torch._C._cuda_setDevice(device)
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 651166) of binary: /usr/bin/python3
    Traceback (most recent call last):
      File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
        main()
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
        launch(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
        run(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 689, in run
        elastic_launch(
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
        raise ChildFailedError(
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
    ***************************************
                train.py FAILED            
    =======================================
    Root Cause:
    [0]:
      time: 2021-09-30_17:35:27
      rank: 1 (local_rank: 1)
      exitcode: 1 (pid: 651166)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    =======================================
    Other Failures:
    [1]:
      time: 2021-09-30_17:35:27
      rank: 2 (local_rank: 2)
      exitcode: 1 (pid: 651167)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    [2]:
      time: 2021-09-30_17:35:27
      rank: 3 (local_rank: 3)
      exitcode: 1 (pid: 651168)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    ***************************************
    
    
    opened by arch-user-france1 19
  • Transparent PNG support


    Seeing that recently EXR support was added, is it possible to support transparency (alpha channel) for PNG input and output (using --img --png) for inference_video.py?

    This would enable interpolation of transparent GIFs.

    opened by n00mkrad 19
  • Image sequence and input


    Thanks for adding the PNG output function. Can you make the output names consistent with ffmpeg, i.e. 0000.png 0001.png ... 7821.png? Then we could use ffmpeg to deal with the image sequence.
    Adding image-sequence input would also be great.
    
    opened by Michaelwhite34 15
  • Question: how should a dataset with multiple frames be prepared?

    Hi, first of all, thank you very much for sharing this repo on GitHub!

    After running inference with your released model, I want to try training on a custom dataset.

    Each video in my dataset has 24 frames, so the goal is to generate the 22 intermediate frames.

    I looked at the VimeoDataset class in dataset.py and found that, when preparing the data, since each video in that dataset has only 3 frames, it always returns the first and last frames and asks the model to interpolate the middle frame.

    • Is it possible to interpolate multiple frames this way?

    I have already started training with a slight adjustment: the input is the first and last frames, and the model is asked to predict the middle frame (the 12th). For the model, I chose RIFE.py / IFNet.py, which should produce the most robust model (42.9 MB). Since my dataset is fairly homogeneous, to prevent overfitting I first loaded your released model and then continued training from it.

    • However, after a few dozen epochs the loss exploded, and at inference time the resulting model could not generate the 22 intermediate frames at all (all black images), far worse than the results I initially got with your released model...

    Later I tried training with another model (RIFE_HDv3.py / IFNet_HDv3.py, which produces a 12.2 MB model), but PyTorch kept reporting an error:

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). The error originates from flow, mask, merged = self.flownet(imgs, scale=[4,2,1]) inside update() in RIFE_HDv3.py. From what I found, this error occurs when some variables in the output of forward() are not used to calculate the loss. I went through forward() in IFNet_HDv3.py very carefully, but still no luck...
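    As the message itself suggests, the first remedy is enabling unused-parameter detection; a minimal sketch (hypothetical wrap_ddp helper; whether it is the right fix depends on why some forward() outputs end up unused):

    import torch

    def wrap_ddp(model, local_rank):
        # enable unused-parameter detection, as the DDP error message suggests
        return torch.nn.parallel.DistributedDataParallel(
            model, device_ids=[local_rank], find_unused_parameters=True)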

    Any suggestions would be greatly appreciated!

    opened by chenyuZha 13
  • Not the fastest for multi-frame interpolation


    Hi,

    Thanks for open sourcing the code and contributing to the video frame interpolation community.

    In the paper, it is mentioned: "Coupled with the large complexity in the bi-directional flow estimation, none of these methods can achieve real-time speed"

    I believe that may be inappropriate to say, as a recently published paper (https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720103.pdf) targets efficient multi-frame interpolation.

    It utilizes bi-directional flow estimation as well, but it generates 7 frames in 0.12 seconds, whereas your method requires 0.036 * 7 = 0.252 seconds.

    Also, the model from that paper is compact, consisting of only ~2M parameters, whereas your fast model has ~10M parameters.

    opened by mrchizx 13
  • Reproducing the hdv2 and hdv3 models

    Hello, I have reproduced the hdv2 and hdv3 models, but there is always some gap from the results you provide. Hyperparameter configuration: weight_decay=1e-4, learning rate 3e-4 * mul, VGG loss, plus the data augmentation mentioned in #146, patch size 256, trained for 300 epochs. In the hdv2 and hdv3 versions you released, the model structure looks unchanged; what exactly differs between them? Besides the configuration above, what else am I missing to reach your results?

    opened by tqyunwuxin 12
  • Model v2 update log


    We show some hard-case results for every model version. v2 Google Drive download link: https://drive.google.com/file/d/1wsQIhHZ3Eg4_AfCXItFKqqyDMB4NS0Yd/view

    v1.1 2020.11.16 Link: https://pan.baidu.com/s/1SPRw_u3zjaufn7egMr19Eg Password: orkd

    opened by hzwer 12
  • Training with other datasets


    Has anyone trained RIFE_HDv2 with a training set other than the Vimeo dataset, such as the HD dataset?

    And were they able to get better visual quality for HD content?

    opened by rsjjdesj 11
  • replicating benchmarks


    Thank you for sharing your code! I was trying to replicate the numbers you stated in your paper using this implementation but have unfortunately been unsuccessful so far. Would you be able to share a script that can be used to replicate the Vimeo-90k metrics you quoted? Also, I think the following padding has some issues.

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L26-L28

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L45

    The pw - w and [:h, :w] indicate that pw > w (and ph > h). However, pw = 340 // 32 * 32 = 320 for w = 340, which violates this condition. Thanks for looking into this and thanks again for sharing your code!
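    For what it's worth, a minimal sketch of the fix being pointed at — pad up to the next multiple of 32 instead of truncating down, so that pw >= w and ph >= h always hold (a hypothetical helper, not the repo's code):

    import torch.nn.functional as F

    def pad_to_multiple(x, multiple=32):
        # x: (N, C, H, W); pad right/bottom up to the next multiple of `multiple`
        h, w = x.shape[2], x.shape[3]
        ph = ((h - 1) // multiple + 1) * multiple
        pw = ((w - 1) // multiple + 1) * multiple
        return F.pad(x, (0, pw - w, 0, ph - h)), (h, w)

    After inference, the output can be cropped back with [..., :h, :w].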

    opened by sniklaus 11
  • Reproducibility results


    Hi,

    I checked if I can reproduce the results similar to those in the paper to make sure I am training the model properly. These are the results that I got on Vimeo triplets:

    [figure: interpolation and flow predictions on Vimeo triplets]

    The model prediction is shown for t=1 (predicting between t=0 and t=2); the second row corresponds to the interpolation ("Interpol"), and the last row to the flow ("Flow pred"). I see that the interpolation results are very good; however, I expected the flow to be a bit more accurate. In section 6.2 of the appendix you mention that "IFNet produces clear motion boundaries", which is also what can be seen in Figure 10. Therefore I wanted to ask if there are any other training steps I need to add to make the flow prediction more accurate. I can of course share more of the prediction examples that I got.

    The training was done for 300 epochs using the reconstruction losses and the distillation loss (the latter with coeff. 0.01) as described in the paper; I didn't change anything in the code to train and obtain these results. [figure: validation loss over 300 epochs]

    I would appreciate if you can confirm that you trained the model the same way and that your flow predictions look similar. I used IFNet for training (self.flownet = IFNet()).

    opened by HamidGadirov 4
  • How to visualize flow that model inferenced ?


    flow, mask, merged = self.flownet(imgs, scale_list) — flow must be the flow between the two images; its shape is (bs, 4, H, W). How can I visualize it (e.g. as a color-coded flow image), and how is the flow ground truth generated? Thanks!
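    A minimal sketch (not part of the repo) of the standard HSV color-wheel visualization; RIFE's 4-channel flow stacks two 2-channel fields, so each (H, W, 2) slice can be visualized separately:

    import cv2
    import numpy as np

    def flow_to_color(flow_uv):
        # flow_uv: (H, W, 2) array, e.g. flow[0, :2].permute(1, 2, 0).cpu().numpy()
        u = flow_uv[..., 0].astype(np.float32)
        v = flow_uv[..., 1].astype(np.float32)
        mag, ang = cv2.cartToPolar(u, v)  # per-pixel magnitude and angle
        hsv = np.zeros((*u.shape, 3), dtype=np.uint8)
        hsv[..., 0] = (ang * 180 / np.pi / 2).astype(np.uint8)  # hue encodes direction
        hsv[..., 1] = 255                                       # full saturation
        hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)  # value encodes magnitude
        return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)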

    opened by zhishao 12
  • RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1


    Hi, I'm trying to interpolate a PNG sequence; the files are properly numbered. Here is my console output:

    conda run -n RIFE py D:\Development\RIFE\inference_video.py --img "D:\Game Assets\Super Outrun Rush\Animations\Pixelated\WadeCharge" --exp=4
    Loaded v3.x HD model.
    
      0%|          | 0/16 [00:00<?, ?it/s]Traceback (most recent call last):
      File "D:\Development\RIFE\inference_video.py", line 259, in <module>
        output = make_inference(I0, I1, 2**args.exp-1) if args.exp else []
      File "D:\Development\RIFE\inference_video.py", line 180, in make_inference
        middle = model.inference(I0, I1, args.scale)
      File "D:\Development\RIFE\train_log\RIFE_HDv3.py", line 58, in inference
        flow, mask, merged = self.flownet(imgs, scale_list)
      File "C:\Users\Jackson\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "D:\Development\RIFE\train_log\IFNet_HDv3.py", line 113, in forward
        merged[i] = merged[i][0] * mask_list[i] + merged[i][1] * (1 - mask_list[i])
    RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
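    A guess at the cause (not confirmed in this thread): a channel mismatch of 3 vs 4 at dimension 1 is consistent with RGBA PNGs. A minimal sketch that strips the alpha channel before inference (hypothetical filename):

    import cv2

    img = cv2.imread("frame_0000.png", cv2.IMREAD_UNCHANGED)  # hypothetical frame name
    if img is not None and img.ndim == 3 and img.shape[2] == 4:
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)  # drop alpha -> 3 channels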
    
    opened by Apple-Fritter-Money-Entertainment 0