A Pythonic library for Nvidia Codec.

The project is still in active development; expect breaking changes.

Why another Python library for Nvidia Codec?

Comparison to Video-Processing-Framework

Methodologies

VPF is written entirely in C++ and uses pybind11 to expose Python interfaces. PNC is written entirely in Python and uses ctypes to access the Nvidia C interfaces. Our code tends to be more concise, less duplicative, and easier to read and write.
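
To illustrate the ctypes approach, here is a minimal sketch of loading the NVDEC driver library and declaring one C prototype from Python. It is not taken from PNC's source: the helper names `load_nvcuvid` and `declare_decoder_caps` are hypothetical, the list of library file names is an assumption about common driver installs, and the struct argument is left as an opaque pointer rather than a full `CUVIDDECODECAPS` definition.

```python
import ctypes


def load_nvcuvid():
    """Try common file names for the NVDEC library; return None if absent."""
    # Assumed names: the Linux driver ships libnvcuvid.so.1, Windows nvcuvid.dll.
    for name in ("libnvcuvid.so.1", "libnvcuvid.so", "nvcuvid.dll"):
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None


def declare_decoder_caps(lib):
    """Attach a C prototype to cuvidGetDecoderCaps, if the symbol exists."""
    func = getattr(lib, "cuvidGetDecoderCaps", None)
    if func is None:
        return None
    func.restype = ctypes.c_int          # CUresult is a C enum
    func.argtypes = [ctypes.c_void_p]    # pointer to CUVIDDECODECAPS (opaque here)
    return func


lib = load_nvcuvid()
if lib is None:
    print("nvcuvid not found; an Nvidia driver is required")
else:
    print("cuvidGetDecoderCaps declared:", declare_decoder_caps(lib) is not None)
```

No compilation step is involved: the binding is plain Python that runs against whatever driver library is installed on the machine.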

Performance

Preliminary tests show little to no difference in performance, because the heavy lifting is done on the GPU anyway. Both libraries can saturate the GPU decoder. PNC uses more CPU than VPF, as expected from Python vs. C++, but the overhead is still negligible (less than 10% of a single Ryzen 3100 core for 8K*4K HEVC).

Resource Management

In VPF, Surfaces given to the user are not owned by the user: they are overwritten by new frames, which is counter-intuitive. Pictures are not exposed to the user at all; they are always mapped (post-processed and copied) to a Surface so that the Picture can be reused for new frames. The latter is inefficient when only a subset of Pictures is needed (e.g. screenshots).
This is because VPF allocates the bare minimum of resources needed for most decoding tasks. PNC allows the user to specify the amount of resources to be allocated for advanced applications. Users own the resources and decide when and whether to deal with them.
Managing resources is not painful: similar to pycuda, we shift the burden of managing host/device resources to the Python garbage collector. Resources (such as Picture and Surface) are automatically freed when the user drops the reference.
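
The idea of handing cleanup to the garbage collector can be sketched as follows. This is an illustration only, not PNC's implementation: the `Surface` class here is a hypothetical stand-in, and `fake_alloc` / `fake_free` replace real device allocation calls so the example runs without a GPU.

```python
import gc
import weakref

live = {}    # handle -> size, stands in for device memory
freed = []   # handles released so far


def fake_alloc(size):
    """Hypothetical allocator: hands out increasing integer handles."""
    handle = len(live) + len(freed) + 1
    live[handle] = size
    return handle


def fake_free(handle):
    """Hypothetical deallocator: records that the handle was released."""
    del live[handle]
    freed.append(handle)


class Surface:
    """Stand-in for a device buffer that frees itself when unreferenced."""

    def __init__(self, size):
        self.handle = fake_alloc(size)
        # The callback must not hold a reference to self, so only the
        # handle is passed; it runs when the object is collected.
        self._finalizer = weakref.finalize(self, fake_free, self.handle)


surface = Surface(1024)
print("live before:", live)
del surface       # user drops the only reference
gc.collect()      # not needed on CPython, harmless elsewhere
print("freed after:", freed)
```

`weakref.finalize` is used here instead of `__del__` because it also runs at interpreter exit and cannot resurrect the object; either mechanism gives the same user-facing behavior.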

Things to come

TODO Cropping and scaling support in postprocessing
TODO Color space conversion from YUV (BT.601/709, full-range/limited-range) to RGB using pycuda
Encoder
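
For reference, the planned color space conversion follows standard formulas. The sketch below is a per-pixel, CPU-only illustration in plain Python for 8-bit samples, not the planned pycuda kernel; the function name `yuv_to_rgb` and its parameters are hypothetical. The luma coefficients are those defined by BT.601 and BT.709, and the full-range branch assumes the common convention of chroma centered at 128.

```python
# Luma coefficients (Kr, Kb) defined by each standard.
LUMA_COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}


def yuv_to_rgb(y, cb, cr, standard="bt709", full_range=False):
    """Convert one 8-bit Y'CbCr sample to an 8-bit (R, G, B) tuple."""
    kr, kb = LUMA_COEFFS[standard]
    kg = 1.0 - kr - kb
    if full_range:
        # Full range: luma spans 0..255, chroma is centered at 128.
        luma, pb, pr = y / 255.0, (cb - 128) / 255.0, (cr - 128) / 255.0
    else:
        # Limited range: luma spans 16..235, chroma spans 16..240.
        luma, pb, pr = (y - 16) / 219.0, (cb - 128) / 224.0, (cr - 128) / 224.0
    r = luma + 2.0 * (1.0 - kr) * pr
    b = luma + 2.0 * (1.0 - kb) * pb
    g = (luma - kr * r - kb * b) / kg
    # Clamp to [0, 1] before scaling back to 8 bits.
    return tuple(round(255 * min(max(c, 0.0), 1.0)) for c in (r, g, b))


print(yuv_to_rgb(16, 128, 128))                     # limited-range black
print(yuv_to_rgb(235, 128, 128))                    # limited-range white
print(yuv_to_rgb(255, 128, 128, full_range=True))   # full-range white
```

A GPU version would apply the same arithmetic per pixel, with the extra step of upsampling the subsampled chroma planes of formats such as NV12.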

Acknowledgements

Many thanks to @rarzumanyan for all the help and explanations!

Owner

Zesen Qian
