Serving PyTorch 1.0 Models as a Web Server in C++

Last update: Jan 04, 2023

Serving PyTorch Models in C++

This repository contains various examples of performing inference with the PyTorch C++ API.
Run git clone https://github.com/Wizaron/pytorch-cpp-inference to clone the repository.

Environment

1. Dockerfiles are in the docker directory. There are two: one for CPU and one for CUDA 10. To build a Docker image, go to the docker/cpu or docker/cuda10 directory and run docker build -t <docker-image-name> . (the trailing dot sets the build context to the current directory).
2. After the image is built, create a container with docker run -v <directory-that-this-repository-resides>:<target-directory-in-docker-container> -p 8181:8181 -it <docker-image-name>. Port 8181 is published because the PyTorch C++ model will be served on it.
3. Inside the container, go to the directory where this repository is mounted.
4. Download libtorch from the PyTorch website (CPU : https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip - CUDA 10 : https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.1.zip).
5. Extract the archive with unzip. This creates a libtorch directory containing the torch shared libraries and headers.
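
The download and extraction steps can also be scripted. The following is a minimal Python sketch, using only the standard library, that maps a build variant to the libtorch URL listed above and unpacks the archive into a target directory. The names libtorch_url and download_and_extract and the variant keys "cpu" and "cuda10" are illustrative assumptions, not part of the repository.

```python
import os
import tempfile
import urllib.request
import zipfile

# URLs copied from the Environment section; the variant keys are illustrative.
LIBTORCH_URLS = {
    "cpu": "https://download.pytorch.org/libtorch/cpu/"
           "libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip",
    "cuda10": "https://download.pytorch.org/libtorch/cu101/"
              "libtorch-cxx11-abi-shared-with-deps-1.3.1.zip",
}


def libtorch_url(variant):
    """Return the libtorch download URL for 'cpu' or 'cuda10'."""
    try:
        return LIBTORCH_URLS[variant]
    except KeyError:
        raise ValueError(
            "unknown variant: %r (expected 'cpu' or 'cuda10')" % variant
        ) from None


def download_and_extract(variant, dest_dir="."):
    """Download the libtorch archive and extract it into dest_dir.

    The archive is expected to unpack into a top-level 'libtorch' directory.
    """
    url = libtorch_url(variant)
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = os.path.join(tmp, "libtorch.zip")
        urllib.request.urlretrieve(url, archive_path)
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(dest_dir)
    return os.path.join(dest_dir, "libtorch")
```

Calling download_and_extract("cpu") from the repository root would be roughly equivalent to downloading the CPU archive and running unzip on it. One difference worth noting: Python's zipfile module does not restore Unix permission bits the way the unzip command does.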

Code Structure

The models directory stores PyTorch models.
The libtorch directory stores the C++ torch headers and shared libraries used to link the model against PyTorch.
The utils directory stores various utility functions for performing inference in C++.
The inference-cpp directory stores the code that performs inference.

Exporting PyTorch ScriptModule

To export a torch.jit.ScriptModule of ResNet18 for C++ inference, go to the models/resnet directory and run python3 resnet.py. The script downloads a ResNet18 model pretrained on ImageNet and creates models/resnet/resnet_model_cpu.pth and (optionally) models/resnet/resnet_model_gpu.pth, which the C++ inference examples load.
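
For readers who want to see what such an export can look like, here is a hedged sketch based on torch.jit.trace. It is not a copy of the repository's resnet.py, which may differ in detail; the function names, the 1x3x224x224 example input, and the choice of tracing over scripting are assumptions made for illustration.

```python
import os


def script_module_path(out_dir, device):
    """Return the output path for the 'cpu' or 'gpu' script module."""
    if device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")
    return os.path.join(out_dir, "resnet_model_%s.pth" % device)


def export_resnet18(out_dir=".", device="cpu"):
    """Trace a pretrained ResNet18 and save it as a TorchScript module."""
    # Imported here so the path helper above works without PyTorch installed.
    import torch
    from torchvision import models

    # pretrained=True matches torchvision releases contemporary with
    # PyTorch 1.3; newer releases use a `weights=` argument instead.
    model = models.resnet18(pretrained=True)
    model.eval()  # switch batch norm and dropout to inference behavior

    # Assumed input shape: a batch of one RGB image, 224x224 pixels.
    example = torch.rand(1, 3, 224, 224)
    if device == "gpu":
        model = model.cuda()
        example = example.cuda()

    # Tracing records the operations executed on the example input.
    traced = torch.jit.trace(model, example)
    path = script_module_path(out_dir, device)
    traced.save(path)
    return path
```

Calling export_resnet18(".", "cpu") inside models/resnet would write resnet_model_cpu.pth to that directory. Tracing is used here because ResNet18 has no data-dependent control flow; a model with such control flow would need torch.jit.script instead.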

Serving the C++ Model

We can serve the model either as a single executable or as a web server.

Single Executable

To build a single executable for inference:
1. Go to the inference-cpp/cnn-classification directory.
2. Run ./build.sh to build the executable, named predict.
3. Run the executable with ./predict <path-to-image> <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}>.
4. Example: ./predict image.jpeg ../../models/resnet/resnet_model_cpu.pth ../../models/resnet/labels.txt false
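
When the executable has to be run over many images, it can be convenient to drive it from a script. The sketch below assembles the four arguments in the order given in step 3 and runs the binary with subprocess. The helper names are hypothetical, and the format of what predict prints is not described in this README, so the output is returned as raw text.

```python
import subprocess


def build_predict_command(image, model, labels, use_gpu=False,
                          binary="./predict"):
    """Assemble the argument list in the order the executable expects."""
    gpu_flag = "true" if use_gpu else "false"
    return [binary, image, model, labels, gpu_flag]


def run_predict(image, model, labels, use_gpu=False, binary="./predict"):
    """Run the executable once and return whatever it writes to stdout."""
    command = build_predict_command(image, model, labels, use_gpu, binary)
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

With the paths from step 4, build_predict_command("image.jpeg", "../../models/resnet/resnet_model_cpu.pth", "../../models/resnet/labels.txt") yields the same command line as the example above.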

Web Server

To build a web server for production:
1. Go to the inference-cpp/cnn-classification/server directory.
2. Run ./build.sh to build the web server, named predict.
3. Run the binary with ./predict <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}>. It serves the model at http://localhost:8181/predict.
4. Example: ./predict ../../../models/resnet/resnet_model_cpu.pth ../../../models/resnet/labels.txt false
5. To make a request, open a new terminal tab and run python test_api.py, which sends a request to localhost:8181/predict.
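
A client for the endpoint might look like the sketch below, which uses only the Python standard library. The request format is an assumption: this README does not say what the server expects, so the JSON body with a base64-encoded image under the key "image" is hypothetical, as are the function names. Check test_api.py in the repository for the format the server actually accepts.

```python
import base64
import json
import urllib.request

# Endpoint stated in the Web Server section.
PREDICT_URL = "http://localhost:8181/predict"


def build_payload(image_bytes, field="image"):
    """Encode raw image bytes as a JSON body.

    The field name and the base64 encoding are assumptions, not a
    documented contract of the server.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({field: encoded}).encode("utf-8")


def predict(image_path, url=PREDICT_URL, timeout=10.0):
    """POST an image to the server and return the raw response text."""
    with open(image_path, "rb") as handle:
        body = build_payload(handle.read())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server from step 3 running, predict("image.jpeg") would send one request and return the response body as text. The payload construction is kept in a separate function so it can be adjusted to the real request format without touching the HTTP code.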

Owner

Onur Kaplan
