OpenIVA

OpenIVA is an end-to-end intelligent video analytics development toolkit that runs on several inference backends. It is designed to help individual users and start-ups quickly launch their own video AI services.
OpenIVA implements a range of mainstream facial recognition, object detection, segmentation, and landmark detection algorithms. It also provides an efficient, lightweight service deployment framework with a modular design, so users only need to swap in the algorithm model for their own task.

Features

Common mainstream algorithms

Provides the latest fast, accurate pre-trained models for facial recognition, object detection, segmentation, and landmark detection tasks

Multiple inference backends

Supports TensorLayerX, TensorRT, and onnxruntime

High performance

Achieves high performance on CPU, GPU, and Ascend platforms, with inference speeds above 3000 it/s

Asynchronous & multithreading

Uses multithreading and queues to achieve high device utilization for inference and pre/post-processing

Lightweight service

Uses Flask to provide lightweight intelligent application services

Modular design

You can quickly start your own intelligent analysis service; you only need to replace the AI models

GUI visualization tools

Start analysis tasks with the click of a button and view visualized results in GUI windows; suitable for multiple tasks
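
As a rough illustration of the "Asynchronous & multithreading" and "Modular design" points above, the sketch below connects pre-processing, inference, and post-processing stages with threads and bounded queues. It is not OpenIVA's actual code: `run_pipeline`, `_stage`, and the callables passed in are hypothetical names chosen for this example, and the "model" is simply a function that can be swapped out. Error handling inside the stages is omitted for brevity.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel that tells a stage to shut down


def _stage(fn: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Pull items from inbox, apply fn, and push results to outbox until _STOP arrives."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames: Iterable[Any],
                 preprocess: Callable[[Any], Any],
                 infer: Callable[[Any], Any],
                 postprocess: Callable[[Any], Any],
                 maxsize: int = 8) -> List[Any]:
    """Run three stages in their own threads, connected by bounded queues."""
    q_in, q_pre, q_inf, q_out = (queue.Queue(maxsize) for _ in range(4))
    stages = [(preprocess, q_in, q_pre), (infer, q_pre, q_inf), (postprocess, q_inf, q_out)]
    threads = [threading.Thread(target=_stage, args=s, daemon=True) for s in stages]

    def feed() -> None:
        # Feed frames from a separate thread so the caller can drain results concurrently.
        for frame in frames:
            q_in.put(frame)
        q_in.put(_STOP)

    threads.append(threading.Thread(target=feed, daemon=True))
    for t in threads:
        t.start()

    results = []
    while True:
        item = q_out.get()
        if item is _STOP:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Stand-in "model": replacing this function is the only change needed for a new task.
    def dummy_model(x: int) -> int:
        return x * 2

    print(run_pipeline(range(5), lambda x: x + 1, dummy_model, str))
    # prints ['2', '4', '6', '8', '10']
```

Because each stage is a single thread reading from a FIFO queue, results come out in input order. Swapping the model in this sketch means passing a different `infer` callable; the queues and threads stay unchanged.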

Performance benchmark

Testing environments

CPU: Intel i5-10400 (6 cores, 12 threads)
GPU: RTX 3060
OS: Ubuntu 18.04
CUDA 11.1
TensorRT 7.2.3.4
onnxruntime with execution providers (EPs):
- CPU (default)
- CUDA (manually compiled)
- OpenVINO (manually compiled)
- TensorRT (manually compiled)
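
For readers unfamiliar with onnxruntime execution providers, the following sketch shows one way to pick the fastest provider available in a given onnxruntime build and fall back to the CPU. The helper names `pick_providers` and `create_session` are hypothetical and not part of OpenIVA; the priority order is an assumption based on the benchmark results in this README. The provider strings and the `get_available_providers` / `InferenceSession(..., providers=...)` calls are standard onnxruntime API.

```python
from typing import List, Sequence

# onnxruntime provider names, ordered fastest-first (ordering is an assumption
# taken from the benchmark tables in this README).
PREFERRED = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = PREFERRED) -> List[str]:
    """Return the preferred providers that are available, always ending with the CPU fallback."""
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def create_session(model_path: str):
    """Create an inference session using the best providers this build supports."""
    import onnxruntime as ort  # imported lazily so pick_providers works without onnxruntime

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
    # prints ['CUDAExecutionProvider', 'CPUExecutionProvider']
```

The CUDA, OpenVINO, and TensorRT providers only appear in `get_available_providers()` when onnxruntime was built or installed with them, which is why the environment above lists them as manually compiled.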

Performance

Facial recognition

Run
python test_landmark.py
batchsize=8, top_k=68, 67 faces in the image

Face detection
Model: face_detector_640_dy_sim

onnxruntime EPs	FPS	faces per sec
CPU	32	2075
OpenVINO	81	5374
CUDA	105	7074
TensorRT(FP32)	124	7948
TensorRT(FP16)	128	8527

Face landmark
Model: landmarks_68_pfld_dy_sim

onnxruntime EPs	faces per sec
CPU	69
OpenVINO	890
CUDA	2061
TensorRT(FP32)	2639
TensorRT(FP16)	3131

Run
python test_face.py
batchsize=8

Face embedding
Model: arc_mbv2_ccrop_sim

onnxruntime EPs	faces per sec
CPU	212
OpenVINO	865
CUDA	1790
TensorRT(FP32)	2132
TensorRT(FP16)	2812

Object detection

Run
python test_yolo.py
batchsize=8, 4 objects in the image

YOLOX object detection
Model: yolox_s (ms_coco)

onnxruntime EPs	FPS	Objects per sec
CPU	9.3	37.2
OpenVINO	13	52
CUDA	77	307
TensorRT(FP32)	95	380
TensorRT(FP16)	128	512

Model: yolox_m (ms_coco)

onnxruntime EPs	FPS	Objects per sec
CPU	4	16
OpenVINO	5.5	22
CUDA	46.8	187
TensorRT(FP32)	64	259
TensorRT(FP16)	119	478

Model: yolox_nano (ms_coco)

onnxruntime EPs	FPS	Objects per sec
CPU	47	188
OpenVINO	80	320
CUDA	210	842
TensorRT(FP32)	244	977
TensorRT(FP16)	269	1079

Model: yolox_tiny (ms_coco)

onnxruntime EPs	FPS	Objects per sec
CPU	33	133
OpenVINO	43	175
CUDA	209	839
TensorRT(FP32)	248	995
TensorRT(FP16)	327	1310
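
The README does not describe how these figures were measured, so the following is only a sketch of how such throughput numbers could be reproduced. `measure_fps` and `items_per_sec` are hypothetical helpers, and the warm-up and iteration counts are arbitrary assumptions. In the YOLOX tables, "Objects per sec" is roughly FPS multiplied by the 4 objects in the test image (for example, 9.3 × 4 = 37.2 and 128 × 4 = 512).

```python
import time
from typing import Callable


def measure_fps(run_batch: Callable[[], None], batch_size: int,
                iters: int = 20, warmup: int = 3) -> float:
    """Time `iters` calls of run_batch (each processing batch_size images) and return images/sec."""
    for _ in range(warmup):
        run_batch()  # warm-up runs are excluded from timing
    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def items_per_sec(fps: float, items_per_image: int) -> float:
    """Convert image throughput into per-item throughput (faces or objects per second)."""
    return fps * items_per_image


if __name__ == "__main__":
    # Stand-in workload; replace with a real batched inference call.
    fps = measure_fps(lambda: time.sleep(0.01), batch_size=8)
    print(f"{fps:.1f} FPS, {items_per_sec(fps, 4):.1f} objects per sec")
```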

Owner

Quantum Liu
