Deep Sketch-guided Cartoon Video Inbetweening

Last update: Dec 22, 2022

Cartoon Video Inbetweening

Paper | DOI | Video

This repository contains the source code for "Deep Sketch-guided Cartoon Video Inbetweening" by Xiaoyu Li, Bo Zhang, Jing Liao, and Pedro V. Sander, published in IEEE Transactions on Visualization and Computer Graphics, 2021.

Prerequisites

Linux or Windows
Python 3
CPU, or NVIDIA GPU with CUDA and cuDNN

Use the Pre-trained Models

You can download the pre-trained model here.

Run the following commands to evaluate the frame synthesis model and the full model:

python eval_synthesis.py
python eval_full.py

The frame synthesis model takes img_0, img_1, and ske_t as inputs and synthesizes img_t. The full model takes the same inputs and interpolates five frames between img_0 and img_1.
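As a rough illustration of the full model's output, the five inbetween frames can be thought of as sitting at intermediate time positions between img_0 (t = 0) and img_1 (t = 1). The sketch below assumes the frames are uniformly spaced, which this README does not state. The helper name `inbetween_times` is hypothetical and is not part of the repository.

```python
from fractions import Fraction


def inbetween_times(num_frames=5):
    """Return evenly spaced time positions strictly between 0 and 1.

    Assumption: inbetween frames are uniformly spaced in time.
    The actual scripts in the repository may space them differently.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    # Splitting [0, 1] into num_frames + 1 equal intervals leaves
    # num_frames interior points.
    return [Fraction(k, num_frames + 1) for k in range(1, num_frames + 1)]


print([str(t) for t in inbetween_times()])
# ['1/6', '1/3', '1/2', '2/3', '5/6']
```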

Datasets

A dataset is a directory with the following structure:

dataset
    ├── frame
    │   └── ${clip_id}
    │       └──${image_id}.png
    ├── sketch
    │   └── ${clip_id}
    │       └──${image_id}.png
    └── dismap
        └── ${clip_id}
            └──${image_id}.npy
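Before training or evaluating on your own data, it may help to confirm that every frame has a matching sketch and distance map. The following is a minimal sketch that relies only on the layout shown above. The function name `find_incomplete_samples` is hypothetical and is not part of the repository.

```python
from pathlib import Path


def find_incomplete_samples(root):
    """List (clip_id, image_id) pairs whose sketch or distance map is missing.

    Assumes the layout above: frame/<clip_id>/<image_id>.png with matching
    sketch/<clip_id>/<image_id>.png and dismap/<clip_id>/<image_id>.npy.
    """
    root = Path(root)
    missing = []
    # Every frame image defines one sample; look up its two companions.
    for frame_path in sorted((root / "frame").glob("*/*.png")):
        clip_id = frame_path.parent.name
        image_id = frame_path.stem
        sketch = root / "sketch" / clip_id / f"{image_id}.png"
        dismap = root / "dismap" / clip_id / f"{image_id}.npy"
        if not (sketch.is_file() and dismap.is_file()):
            missing.append((clip_id, image_id))
    return missing
```

Calling `find_incomplete_samples("dataset")` returns an empty list when the three subdirectories are consistent.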

The sketch images can be generated with the script "sketch.py", and the distance maps with "dismap.py". Because the movie Spirited Away is copyrighted, we cannot release our training dataset. If you are interested, you can generate your own dataset with these scripts.
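This README does not describe how "dismap.py" computes its distance maps, so the following is only an illustration of the general idea: each pixel stores its distance to the nearest sketch stroke. The example uses city-block distance and plain Python lists to stay dependency-free. The function name `cityblock_distance_map` and the choice of metric are assumptions, and the repository's script may well use a different metric and save the result as a `.npy` array.

```python
from collections import deque


def cityblock_distance_map(stroke_mask):
    """Distance from every pixel to the nearest stroke pixel (4-connected steps).

    stroke_mask: list of rows of booleans, True where a sketch line is drawn.
    Returns a grid of ints; every entry is None if the mask has no strokes.
    This is an illustrative stand-in, not the repository's dismap.py.
    """
    height = len(stroke_mask)
    width = len(stroke_mask[0]) if height else 0
    dist = [[None] * width for _ in range(height)]
    queue = deque()
    # Seed the breadth-first search with all stroke pixels at distance 0.
    for y in range(height):
        for x in range(width):
            if stroke_mask[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    # Expand outward one step at a time; the first visit is the shortest.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist
```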

Training

Run the following commands to train the frame synthesis model and the full model:

python train_synthesis.py
python train_full.py

Train the frame synthesis model before the full model: the full model is initialized with the parameters of the trained frame synthesis model.
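The required ordering can be enforced with a small wrapper that starts the second stage only if the first one exits successfully. This is a minimal sketch: the helper `run_in_order` is hypothetical, and it does not handle how the full model locates the synthesis parameters, which this README does not describe.

```python
import subprocess
import sys


def run_in_order(scripts, run=subprocess.run):
    """Run each script with the current interpreter, stopping at the first failure."""
    for script in scripts:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later scripts never start after a failure.
        run([sys.executable, script], check=True)
```

From the repository root, `run_in_order(["train_synthesis.py", "train_full.py"])` runs the two training stages back to back.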

Citing

If you find our work useful, please consider citing:

@article{li2021deep,
  title     = {Deep Sketch-guided Cartoon Video Inbetweening},
  author    = {Li, Xiaoyu and Zhang, Bo and Liao, Jing and Sander, Pedro},
  journal   = {IEEE Transactions on Visualization and Computer Graphics},
  year      = {2021},
  publisher = {IEEE}
}

Owner

Xiaoyu Li
