AI pipelines for Nvidia Jetson Platform

Last update: Dec 23, 2022

Overview

Jetson Multicamera Pipelines

Easy-to-use realtime CV/AI pipelines for Nvidia Jetson Platform. This project:

Builds a typical multi-camera pipeline, i.e. N×(capture)->preprocess->batch->DNN-> <<your application logic here>> ->encode->file I/O + display. Uses GStreamer and DeepStream under the hood.
Gives programmatic access to configure the pipeline in Python via the jetmulticam package.
Utilizes Nvidia HW acceleration for minimal CPU usage. For example, you can perform object detection in real-time on 6 camera streams using as little as 16.5% CPU. See benchmarks below for details.
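To make the stage order above more concrete, here is a minimal sketch that assembles a GStreamer pipeline description for N cameras. It is not how jetmulticam builds its pipeline internally: the helper name build_pipeline_description, the 1080p/30 fps caps and the config_infer.txt path are assumptions for illustration. The element names (nvarguscamerasrc, nvstreammux, nvinfer, nvmultistreamtiler, nvdsosd) are standard GStreamer/DeepStream elements on Jetson. The encode and file I/O stages are left out for brevity.

```python
def build_pipeline_description(sensor_ids, infer_config="config_infer.txt"):
    """Return a pipeline string: N x capture -> batch -> DNN -> overlay -> display.

    Hypothetical helper for illustration only; jetmulticam constructs its
    pipeline programmatically and may use different elements and settings.
    """
    if not sensor_ids:
        raise ValueError("at least one camera is required")

    n = len(sensor_ids)
    # Batching stage: nvstreammux collects one frame per camera into a batch,
    # which nvinfer then runs through the DNN.
    parts = [
        f"nvstreammux name=mux batch-size={n} width=1920 height=1080 "
        f"! nvinfer config-file-path={infer_config} "
        f"! nvmultistreamtiler rows=1 columns={n} "
        "! nvdsosd ! nvegltransform ! nveglglessink"
    ]
    # Capture stage: one source branch per camera, linked to a mux sink pad.
    for pad, sensor_id in enumerate(sensor_ids):
        parts.append(
            f"nvarguscamerasrc sensor-id={sensor_id} "
            "! video/x-raw(memory:NVMM),width=1920,height=1080,framerate=30/1 "
            f"! mux.sink_{pad}"
        )
    return " ".join(parts)
```

A string like this could be handed to Gst.parse_launch on a Jetson with DeepStream installed; the function itself only does string formatting and runs anywhere.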

Demos

You can easily build your custom logic in Python by accessing image data (via np.array) as well as object detection results. See the person-following examples below:
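As a sketch of such custom logic, the function below picks the largest person detection and turns its horizontal position into a steering value in [-1, 1]. The detection format (dicts with "left", "top", "width" and "height" in pixels) and the function name are assumptions for illustration; check the actual structure of pipeline.detections before adapting it.

```python
def steering_from_detections(detections, image_width=1920):
    """Return a steering value in [-1, 1], or None if nothing was detected.

    Assumes (hypothetically) that each detection is a dict with pixel
    coordinates under the keys "left", "top", "width" and "height".
    """
    if not detections:
        return None

    # Follow the largest box, on the assumption that it is the closest person.
    target = max(detections, key=lambda d: d["width"] * d["height"])

    # Horizontal centre of the box, mapped from [0, image_width] to [-1, 1].
    center_x = target["left"] + target["width"] / 2
    return 2 * center_x / image_width - 1
```

A negative value means the person is left of the image centre, a positive value means right; a motor controller could use it as a proportional turn command.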

DashCamNet (DLA0) + PeopleNet (DLA1) on 3 camera streams.

We have 3 independent cameras covering a ~270° field of view. Red boxes correspond to DashCamNet detections, green ones to PeopleNet. The PeopleNet detections are used to perform person-following logic.
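With several cameras, a detection can be turned into a bearing relative to the robot. The sketch below assumes, purely for illustration, three cameras that each cover 90° and tile the ~270° field of view without overlap, plus a linear mapping from pixel column to angle (no lens distortion). The real mounting geometry is not described here, so treat the constants and the function name bearing_deg as placeholders.

```python
def bearing_deg(camera_index, center_x, image_width=1920,
                fov_per_camera=90.0, num_cameras=3):
    """Map a pixel column in one camera to a bearing in degrees.

    0 degrees is straight ahead, negative is left, positive is right.
    Assumes cameras are indexed left to right and do not overlap.
    """
    if not 0 <= camera_index < num_cameras:
        raise ValueError("camera_index out of range")

    total_fov = fov_per_camera * num_cameras
    # Angle of the left edge of this camera's view.
    left_edge = -total_fov / 2 + camera_index * fov_per_camera
    # Linear interpolation across the image width.
    return left_edge + (center_x / image_width) * fov_per_camera
```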

demo_8_follow_me.mp4

PeopleNet (GPU) on 3 camera streams.

Robot is operated in manual mode.

demo_9_security_nvidia.mp4

DashCamNet (GPU) on 3 camera streams.

Robot is operated in manual mode.

demo_1_fedex_driver.mp4

(All demos run in real time onboard an Nvidia Jetson Xavier NX.)

Quickstart

Install:

git clone https://github.com/NVIDIA-AI-IOT/jetson-multicamera-pipelines.git
cd jetson-multicamera-pipelines
bash scripts/install-dependencies.sh
pip3 install .

Run example with your cameras:

source scripts/env_vars.sh 
cd examples
python3 example.py

Usage example

import time
from jetmulticam import CameraPipelineDNN
from jetmulticam.models import PeopleNet, DashCamNet

if __name__ == "__main__":

    pipeline = CameraPipelineDNN(
        cameras=[2, 5, 8],
        models=[
            PeopleNet.DLA1,
            DashCamNet.DLA0,
            # PeopleNet.GPU
        ],
        save_video=True,
        save_video_folder="/home/nx/logs/videos",
        display=True,
    )

    while pipeline.running():
        arr = pipeline.images[0] # np.array with shape (1080, 1920, 3), i.e. (1080p RGB image)
        dets = pipeline.detections[0] # Detections from the DNNs
        time.sleep(1/30)
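Note that time.sleep(1/30) adds a fixed delay on top of whatever time the loop body takes, so the loop runs somewhat slower than 30 Hz. A minimal sketch of a rate limiter that compensates for processing time follows; the Rate class is a hypothetical helper, not part of jetmulticam.

```python
import time


class Rate:
    """Sleep just long enough to keep a loop at a target frequency."""

    def __init__(self, hz):
        self.period = 1.0 / hz
        self.next_tick = time.monotonic() + self.period

    def sleep(self):
        """Wait until the next tick; return the time slept in seconds."""
        now = time.monotonic()
        remaining = self.next_tick - now
        if remaining > 0:
            time.sleep(remaining)
        else:
            # The loop body overran: resynchronise instead of trying to catch up.
            self.next_tick = now
            remaining = 0.0
        self.next_tick += self.period
        return remaining
```

In the loop above, this would mean creating rate = Rate(30) before the while statement and calling rate.sleep() in place of time.sleep(1/30).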

Benchmarks

#	Scenario	# cams	CPU util. (jetmulticam)	CPU util. (nvargus-daemon)	CPU total	GPU %	EMC util %	Power draw	Inference Hardware
1.	1xGMSL -> 2xDNNs + disp + encode	1	5.3%	4%	9.3%	<3%	57%	8.5W	DLA0: PeopleNet DLA1: DashCamNet
2.	2xGMSL -> 2xDNNs + disp + encode	2	7.2%	7.7%	14.9%	<3%	62%	9.4W	DLA0: PeopleNet DLA1: DashCamNet
3.	3xGMSL -> 2xDNNs + disp + encode	3	9.2%	11.3%	20.5%	<3%	68%	10.1W	DLA0: PeopleNet DLA1: DashCamNet
4.	Same as #3 with CPU @ 1.9GHz	3	7.5%	9.0%	16.5%	<3%	68%	10.4W	DLA0: PeopleNet DLA1: DashCamNet
5.	3xGMSL+2xV4L -> 2xDNNs + disp + encode	5	9.5%	11.3%	20.8%	<3%	45%	9.1W	DLA0: PeopleNet (interval=1) DLA1: DashCamNet (interval=1)
6.	3xGMSL+2xV4L -> 2xDNNs + disp + encode	5	8.3%	11.3%	19.6%	<3%	25%	7.5W	DLA0: PeopleNet (interval=6) DLA1: DashCamNet (interval=6)
7.	3xGMSL -> DNN + disp + encode	5	10.3%	12.8%	23.1%	99%	25%	15W	GPU: PeopleNet
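The "CPU total" column is the sum of the two per-process figures (the jetmulticam process plus nvargus-daemon). The short sketch below recomputes it for the DLA scenarios (rows 1 to 6) and derives a rough per-camera cost. The numbers are copied from the table; the variable and function names are illustrative.

```python
# (scenario number, cameras, jetmulticam CPU %, nvargus-daemon CPU %)
rows = [
    (1, 1, 5.3, 4.0),
    (2, 2, 7.2, 7.7),
    (3, 3, 9.2, 11.3),
    (4, 3, 7.5, 9.0),
    (5, 5, 9.5, 11.3),
    (6, 5, 8.3, 11.3),
]


def cpu_total(app_pct, daemon_pct):
    """Total CPU utilisation: pipeline process plus camera daemon."""
    return round(app_pct + daemon_pct, 1)


for number, cams, app_pct, daemon_pct in rows:
    total = cpu_total(app_pct, daemon_pct)
    print(f"scenario {number}: {total}% total, {total / cams:.1f}% per camera")
```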

Notes:

All figures were measured in 15W 6-core mode. To reproduce, run: sudo nvpmodel -m 2; sudo jetson_clocks
Test platform: Jetson Xavier NX and XNX Box running JetPack v4.5.1
The residual GPU usage in DLA-accelerated nets is caused by Sigmoid activations being computed with the CUDA backend. The remaining layers are computed on the DLA.
CPU usage will vary depending on factors such as camera resolution, framerate, available video formats and driver implementation.

Supported models / accelerators

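from jetmulticam import CameraPipelineDNN, models

pipeline = CameraPipelineDNN(
    cameras=[0, 1, 2],  # same keyword as in the usage example above
    models=[
        # Each model can run on either DLA core or on the GPU
        models.PeopleNet.DLA0,
        models.PeopleNet.DLA1,
        models.PeopleNet.GPU,
        models.DashCamNet.DLA0,
        models.DashCamNet.DLA1,
        models.DashCamNet.GPU,
    ],
    # ...
)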

Owner

NVIDIA AI IOT