A PyTorch-based real-time segmentation model for autonomous driving

Last update: Dec 22, 2022

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet, described in the paper listed under Citation.

Real-time semantic segmentation is playing an increasingly important role in computer vision, driven by growing demand from mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size, and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB of memory. The inference speed can reach 30 FPS on a single RTX 2080Ti GPU (GPU usage 60%) with a 1024×2048-pixel image.
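The overview mentions dilated convolutions as the way features are extracted. As a rough illustration of why dilation enlarges the receptive field without adding parameters, the sketch below computes the span of a dilated kernel using the standard relation k + (k - 1)(d - 1). The dilation rates are hypothetical values chosen for the example, not the rates used in CFPNet.

```python
from typing import Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    field = 1
    for d in dilations:
        # Each stride-1 layer widens the field by (effective kernel size - 1).
        field += effective_kernel_size(kernel_size, d) - 1
    return field


# Hypothetical dilation rates, chosen only for illustration.
rates = [1, 2, 4, 8]
print([effective_kernel_size(3, d) for d in rates])  # [3, 5, 9, 17]
print(stacked_receptive_field(3, rates))             # 31
```

A 3×3 kernel keeps its nine weights at every dilation rate, so the wider span comes at no cost in parameter count.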

Installation

Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
Install the required packages:

pip install opencv-python pillow numpy matplotlib

Clone this repository

git clone https://github.com/AngeLouCN/CFPNet

One GPU with 11 GB of memory is needed.

Dataset

You need to download the two datasets, CamVid and Cityscapes, and put the files in the dataset folder with the following structure.

├── camvid
|    ├── train
|    ├── test
|    ├── val 
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt
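Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library. The entry names are copied from the tree above; the helper name `missing_entries` and the assumption that the root folder is called `dataset` relative to the working directory are illustrative and not part of the repository.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = {
    "camvid": [
        "train", "test", "val", "trainannot", "testannot", "valannot",
        "camvid_trainval_list.txt", "camvid_train_list.txt",
        "camvid_test_list.txt", "camvid_val_list.txt",
    ],
    "cityscapes": [
        "gtCoarse", "gtFine", "leftImg8bit",
        "cityscapes_trainval_list.txt", "cityscapes_train_list.txt",
        "cityscapes_test_list.txt", "cityscapes_val_list.txt",
    ],
}


def missing_entries(root, dataset):
    """Return the expected paths that do not exist under root/dataset."""
    base = Path(root) / dataset
    return [str(base / name) for name in EXPECTED[dataset]
            if not (base / name).exists()]


if __name__ == "__main__":
    # "dataset" as the root folder name is an assumption; adjust as needed.
    for name in EXPECTED:
        missing = missing_entries("dataset", name)
        print(name, "OK" if not missing else f"missing: {missing}")
```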

Training

You can run python train.py -h to check the details of the optional arguments. In train.py, you can set the dataset, train type, number of epochs, batch size, etc.
Training on the Cityscapes train set:

python train.py --dataset cityscapes

Training on the CamVid train and val sets:

python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16
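To make the command-line options above more concrete, here is a hedged sketch of how such flags are typically declared with argparse. Only the flag names come from the commands shown above. The defaults, the choices, and the function name `build_parser` are assumptions for illustration and may differ from what train.py actually defines, so python train.py -h remains the authoritative reference.

```python
import argparse


def build_parser():
    """Illustrative parser; flag names mirror the commands above."""
    parser = argparse.ArgumentParser(description="Illustrative training options")
    # Defaults and choices below are assumed placeholders, not the repository's values.
    parser.add_argument("--dataset", default="cityscapes",
                        choices=["cityscapes", "camvid"])
    parser.add_argument("--train_type", default="train",
                        choices=["train", "trainval"])
    parser.add_argument("--max_epochs", type=int, default=1000)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=16)
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--dataset", "camvid", "--train_type", "trainval", "--batch_size", "16"]
)
print(args.dataset, args.train_type, args.max_epochs, args.lr, args.batch_size)
```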

During training, every 50 epochs we record the mean IoU on the training and validation sets, along with the training loss, and plot them so you can check whether training is proceeding normally.
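For readers unfamiliar with the metric, the sketch below shows one common way to compute class-wise mean IoU from a confusion matrix. It is a plain-Python illustration, not the repository's implementation. The function names and the ignore label value of 255 are assumptions made for the example.

```python
def confusion_matrix(targets, predictions, num_classes, ignore_label=255):
    """Count (true class, predicted class) pairs over flat label sequences."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(targets, predictions):
        if t == ignore_label:
            continue  # skip unlabeled pixels (ignore value is an assumption)
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels: per-class IoU is 1/2, 2/3 and 1, so the mean is 13/18.
targets = [0, 0, 1, 1, 2, 255]
predictions = [0, 1, 1, 1, 2, 0]
print(mean_iou(confusion_matrix(targets, predictions, num_classes=3)))
```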

Figures: validation mIoU vs. epochs; training loss vs. epochs.

Testing

After training, the checkpoint is saved in the checkpoint folder. You can use test.py to predict the results.

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Evaluation

For datasets that do not provide labels for the test set (e.g. Cityscapes), you can use predict.py to save all the output images, then submit them to the official webpage for evaluation.

python predict.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Inference Speed

You can run eval_fps.py to test the model's inference speed, passing the input image size, such as 1024,2048.

python eval_fps.py 1024,2048
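The general idea behind a speed test like this can be sketched as follows: parse the size argument, run a few warm-up passes, then time a fixed number of iterations. This is an illustrative sketch and not the contents of eval_fps.py. The helper names, the height-then-width ordering of the size string, and the stand-in workload are all assumptions.

```python
import time


def parse_size(text):
    """Turn a string such as '1024,2048' into a (height, width) tuple."""
    height, width = (int(part) for part in text.split(","))
    return height, width


def measure_fps(run_once, warmup=10, iterations=100):
    """Time repeated calls to run_once and return calls per second."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


# Stand-in workload; replace with a model forward pass on an input of the parsed size.
height, width = parse_size("1024,2048")
fps = measure_fps(lambda: sum(range(10_000)), warmup=2, iterations=20)
print(f"{height}x{width}: {fps:.1f} FPS")
```

When timing a PyTorch model on a GPU, call torch.cuda.synchronize() before reading the clock at the start and end of the timed loop, since CUDA kernels run asynchronously and the timing would otherwise be too optimistic.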

Results

Results for CFPNet-V1, CFPNet-V2 and CFPNet-V3:

Dataset	Model	mIoU
Cityscapes	CFPNet-V1	60.4%
Cityscapes	CFPNet-V2	66.5%
Cityscapes	CFPNet-V3	70.1%

Sample results (from top to bottom: original image, CFPNet-V1, CFPNet-V2 and CFPNet-V3):

Figures: Category_acc vs size; Class_acc vs size; Class_acc vs parameter; Class_acc vs speed.

Comparison

Results of Cityscapes

Results of CamVid

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}
