Official PyTorch implementation for Deep Contextual Video Compression, NeurIPS 2021

Last update: Dec 03, 2022

Introduction

This repository is the official PyTorch implementation of Deep Contextual Video Compression, NeurIPS 2021.

Prerequisites

Python 3.8 and Conda (install Conda first if you do not have it)
CUDA 11.0

Environment

conda create -n $YOUR_PY38_ENV_NAME python=3.8
conda activate $YOUR_PY38_ENV_NAME

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
python -m pip install -r requirements.txt

Test dataset

Currently, the spatial resolution of each video must be cropped to an integer multiple of 64 in both width and height.

The dataset format can be seen in dataset_config_example.json.

For example, one video of HEVC Class B can be prepared as:

Crop the original YUV via ffmpeg:

ffmpeg -pix_fmt yuv420p -s 1920x1080 -i BasketballDrive_1920x1080_50.yuv -vf crop=1920:1024:0:0 BasketballDrive_1920x1024_50.yuv
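
The 1920x1024 size in the command above comes from rounding each dimension down to a multiple of 64. The following is a minimal sketch of that arithmetic for other resolutions. The helper names `crop_to_multiple` and `ffmpeg_crop_filter` are hypothetical and not part of this repository, and the sketch assumes the crop keeps the top-left region, as the `crop=...:0:0` offsets above do.

```python
def crop_to_multiple(width, height, multiple=64):
    """Round each dimension down to the nearest multiple of `multiple`."""
    if width < multiple or height < multiple:
        raise ValueError("frame is smaller than the required multiple")
    return (width // multiple) * multiple, (height // multiple) * multiple


def ffmpeg_crop_filter(width, height, multiple=64):
    """Build an ffmpeg crop filter string that keeps the top-left region."""
    new_w, new_h = crop_to_multiple(width, height, multiple)
    return f"crop={new_w}:{new_h}:0:0"


print(crop_to_multiple(1920, 1080))    # (1920, 1024)
print(ffmpeg_crop_filter(1920, 1080))  # crop=1920:1024:0:0
```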

Create the output folder for the video:
```
mkdir BasketballDrive_1920x1024_50
```

Convert YUV to PNG:

ffmpeg -pix_fmt yuv420p -s 1920x1024 -i BasketballDrive_1920x1024_50.yuv -f image2 BasketballDrive_1920x1024_50/im%05d.png
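
To check that the conversion produced every frame, the expected frame count can be derived from the size of the raw YUV file. This is a minimal sketch that assumes 8-bit `yuv420p` input, where each frame takes `width * height * 3 / 2` bytes. The function names are hypothetical and not part of this repository.

```python
import os


def yuv420p_frame_bytes(width, height):
    """Bytes per 8-bit 4:2:0 frame: one luma plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def expected_frame_count(yuv_path, width, height):
    """Return the number of whole frames stored in a raw 8-bit yuv420p file."""
    size = os.path.getsize(yuv_path)
    frame_bytes = yuv420p_frame_bytes(width, height)
    if size % frame_bytes != 0:
        raise ValueError("file size is not a whole number of frames; check the resolution")
    return size // frame_bytes


print(yuv420p_frame_bytes(1920, 1024))  # 2949120
```

Comparing `expected_frame_count("BasketballDrive_1920x1024_50.yuv", 1920, 1024)` with the number of PNG files in the output folder gives a quick sanity check.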

Finally, the dataset folder structure should look like this:

/media/data/HEVC_B/
    * BQTerrace_1920x1024_60/
        - im00001.png
        - im00002.png
        - im00003.png
        - ...
    * BasketballDrive_1920x1024_50/
        - im00001.png
        - im00002.png
        - im00003.png
        - ...
    * ...
/media/data/HEVC_D/
/media/data/HEVC_C/
...
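
Before running the test, it can help to confirm that every sequence folder matches the layout above. This is a minimal sketch that assumes frames are named `im00001.png`, `im00002.png`, and so on, as produced by the `im%05d.png` pattern. The helper names are hypothetical and not part of this repository.

```python
import re
from pathlib import Path

# Matches the im%05d.png naming used by the ffmpeg conversion step.
FRAME_PATTERN = re.compile(r"^im(\d{5})\.png$")


def frame_numbers(sequence_dir):
    """Return the sorted frame indices found in one sequence folder."""
    numbers = []
    for entry in Path(sequence_dir).iterdir():
        match = FRAME_PATTERN.match(entry.name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def check_sequence(sequence_dir):
    """Return (frame_count, is_contiguous) where contiguous means 1..N with no gaps."""
    numbers = frame_numbers(sequence_dir)
    contiguous = numbers == list(range(1, len(numbers) + 1))
    return len(numbers), contiguous


def summarize_dataset(root):
    """Map each sequence folder name under `root` to its check_sequence result."""
    return {
        child.name: check_sequence(child)
        for child in sorted(Path(root).iterdir())
        if child.is_dir()
    }
```

For example, `summarize_dataset("/media/data/HEVC_B")` returns one entry per sequence folder, so a missing or misnamed frame shows up as `is_contiguous` being `False`.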

Pretrained models

Download CompressAI models

cd checkpoints/
python download_compressai_models.py
cd ..

Download the DCVC models and put them into the checkpoints folder.
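
A missing checkpoint is an easy mistake to make after these two download steps. The following is a minimal sketch that lists which of the files named in the PSNR test command of this README are absent. The function name `missing_checkpoints` is hypothetical and not part of this repository, and the sketch assumes it runs from the repository root.

```python
from pathlib import Path

# File names taken from the PSNR test command in this README.
PSNR_CHECKPOINTS = [
    "cheng2020-anchor-3-e49be189.pth.tar",
    "cheng2020-anchor-4-98b0b468.pth.tar",
    "cheng2020-anchor-5-23852949.pth.tar",
    "cheng2020-anchor-6-4c052b1a.pth.tar",
    "model_dcvc_quality_0_psnr.pth",
    "model_dcvc_quality_1_psnr.pth",
    "model_dcvc_quality_2_psnr.pth",
    "model_dcvc_quality_3_psnr.pth",
]


def missing_checkpoints(names, folder="checkpoints"):
    """Return the names from `names` that are not regular files inside `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]
```

An empty list from `missing_checkpoints(PSNR_CHECKPOINTS)` means all files for the PSNR test are in place.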

Test DCVC

Example of testing the PSNR model:

python test_video.py --i_frame_model_name cheng2020-anchor --i_frame_model_path checkpoints/cheng2020-anchor-3-e49be189.pth.tar checkpoints/cheng2020-anchor-4-98b0b468.pth.tar checkpoints/cheng2020-anchor-5-23852949.pth.tar checkpoints/cheng2020-anchor-6-4c052b1a.pth.tar --test_config dataset_config_example.json --cuda true --cuda_device 0,1,2,3 --worker 4 --output_json_result_path DCVC_result_psnr.json --model_type psnr --recon_bin_path recon_bin_folder_psnr --model_path checkpoints/model_dcvc_quality_0_psnr.pth checkpoints/model_dcvc_quality_1_psnr.pth checkpoints/model_dcvc_quality_2_psnr.pth checkpoints/model_dcvc_quality_3_psnr.pth

Example of testing the MSSSIM model:

python test_video.py --i_frame_model_name bmshj2018-hyperprior --i_frame_model_path checkpoints/bmshj2018-hyperprior-ms-ssim-3-92dd7878.pth.tar checkpoints/bmshj2018-hyperprior-ms-ssim-4-4377354e.pth.tar checkpoints/bmshj2018-hyperprior-ms-ssim-5-c34afc8d.pth.tar checkpoints/bmshj2018-hyperprior-ms-ssim-6-3a6d8229.pth.tar --test_config dataset_config_example.json --cuda true --cuda_device 0,1,2,3 --worker 4 --output_json_result_path DCVC_result_msssim.json --model_type msssim --recon_bin_path recon_bin_folder_msssim --model_path checkpoints/model_dcvc_quality_0_msssim.pth checkpoints/model_dcvc_quality_1_msssim.pth checkpoints/model_dcvc_quality_2_msssim.pth checkpoints/model_dcvc_quality_3_msssim.pth

It is recommended to set --worker to the number of GPUs listed in --cuda_device.
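
Because the test commands are long, it may be convenient to assemble them in a script. The following is a minimal sketch that builds the argument list using only the flags shown in the examples above and derives `--worker` from the number of GPU ids. The function `build_test_command` is hypothetical and not part of this repository.

```python
import shlex


def build_test_command(i_frame_model_name, i_frame_model_paths, model_paths,
                       model_type, test_config, cuda_devices,
                       output_json, recon_bin_path):
    """Assemble a test_video.py argument list with one worker per listed GPU."""
    devices = [str(d) for d in cuda_devices]
    return [
        "python", "test_video.py",
        "--i_frame_model_name", i_frame_model_name,
        "--i_frame_model_path", *i_frame_model_paths,
        "--test_config", test_config,
        "--cuda", "true",
        "--cuda_device", ",".join(devices),
        "--worker", str(len(devices)),  # follows the one-worker-per-GPU recommendation
        "--output_json_result_path", output_json,
        "--model_type", model_type,
        "--recon_bin_path", recon_bin_path,
        "--model_path", *model_paths,
    ]


cmd = build_test_command(
    i_frame_model_name="cheng2020-anchor",
    i_frame_model_paths=[
        "checkpoints/cheng2020-anchor-3-e49be189.pth.tar",
        "checkpoints/cheng2020-anchor-4-98b0b468.pth.tar",
        "checkpoints/cheng2020-anchor-5-23852949.pth.tar",
        "checkpoints/cheng2020-anchor-6-4c052b1a.pth.tar",
    ],
    model_paths=[f"checkpoints/model_dcvc_quality_{q}_psnr.pth" for q in range(4)],
    model_type="psnr",
    test_config="dataset_config_example.json",
    cuda_devices=[0, 1, 2, 3],
    output_json="DCVC_result_psnr.json",
    recon_bin_path="recon_bin_folder_psnr",
)
print(shlex.join(cmd))  # prints the command as a single shell-quoted line
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.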

Acknowledgement

The implementation is based on CompressAI and PyTorchVideoCompression. The model weights of intra coding come from CompressAI.

Citation

If you find this work useful for your research, please cite:

@article{li2021deep,
  title={Deep Contextual Video Compression},
  author={Li, Jiahao and Li, Bin and Lu, Yan},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
