A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Last update: Dec 30, 2022

Overview

Real-time Instance Segmentation and Lane Detection

This is a lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs), which is a simple, fully convolutional model developed by Daniel Bolya, Chong Zhou, Fanyi Xiao and Yong Jae Lee in 2019 (see repository https://github.com/dbolya/yolact). Here are the codes for their papers:

In order to use YOLACT++, make sure you compile the DCNv2 code. (See Installation)

Sample running

Installation

Clone this repository and enter it:

git clone https://github.com/jkd2021/YOLACT-with-lane-detection.git
cd YOLACT-with-lane-detection

Set up the environment using one of the following methods:
- Using Anaconda
  - Run conda env create -f environment.yml
- Manually with pip
  - Set up a Python3 environment (e.g., using virtenv).
  - Install Pytorch 1.0.1 (or higher) and TorchVision.
  - Install some other packages:
```
# Cython needs to be installed before pycocotools
pip install cython
pip install opencv-python pillow pycocotools matplotlib 
```
If you'd like to train YOLACT, download the COCO dataset and the 2014/2017 annotations. Note that this script will take a while and dump 21gb of files into ./data/coco.
```
sh data/scripts/COCO.sh
```
If you'd like to evaluate YOLACT on test-dev, download test-dev with this script.
```
sh data/scripts/COCO_test.sh
```
If you want to use YOLACT++, compile deformable convolutional layers (from DCNv2). Make sure you have the latest CUDA toolkit installed from NVidia's Website.
```
cd external/DCNv2
python setup.py build develop
```

Evaluation

See Evaluation in original YOLACT models https://github.com/dbolya/yolact#evaluation (released on April 5th, 2019).

To evalute the model, put the corresponding weights file in the ./weights directory and run one of the following commands with your own image and video. The name of each config is everything before the numbers in the file name (e.g., yolact_base for yolact_base_54_800000.pth).

Images

# Display qualitative results on the specified image.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=my_image.png

# Process an image and save it to another file.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --image=input_image.png:output_image.png

# Process a whole folder of images.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --images=path/to/input/folder:path/to/output/folder

Video

# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.
# If you want, use "--display_fps" to draw the FPS directly on the frame.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=my_video.mp4

# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=0

# Process a video and save it to another file. This uses the same pipeline as the ones above now, so it's fast!
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=input_video.mp4:output_video.mp4

# Process a video with higher frame rate and save it to another file.
python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.3 --top_k=20 --video_multiframe=16 --display_fps --video=input_video.mp4:output_video.mp4

# Process a video with higher frame rate and display it
python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.3 --top_k=20 --video_multiframe=16 --display_fps --video=input_video.mp4

As you can tell, eval.py can do a ton of stuff. Run the --help command to see everything it can do.

python eval.py --help

Training

see Training in original repository https://github.com/dbolya/yolact#training

Citation

If you use any code from here base in your work, please cite

@inproceedings{yolact-iccv2019,
  author    = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
  title     = {YOLACT: {Real-time} Instance Segmentation},
  booktitle = {ICCV},
  year      = {2019},
}

For YOLACT++, please cite

@article{yolact-plus-tpami2020,
  author  = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title   = {YOLACT++: Better Real-time Instance Segmentation}, 
  year    = {2020},
}

A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Related tags

Overview

Real-time Instance Segmentation and Lane Detection

Sample running

Installation

Evaluation

Images

Video

Training

Citation

Owner

Jin

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

Investigating automatic navigation towards standard US views integrating MARL with the virtual US environment developed in CT2US simulation

This repository is the offical Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

PyTorch implementation of GLOM

Analyses of the individual electric field magnitudes with Roast.

lightweight python wrapper for vowpal wabbit

Machine-in-the-Loop Rewriting for Creative Image Captioning

3D-Reconstruction 基于深度学习方法的单目多视图三维重建

Image Segmentation using U-Net, U-Net with skip connections and M-Net architectures

PyTorch implementation of 1712.06087 "Zero-Shot" Super-Resolution using Deep Internal Learning

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Heterogeneous Temporal Graph Neural Network

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch

(CVPR 2022) A minimalistic mapless end-to-end stack for joint perception, prediction, planning and control for self driving.

NeuralDiff: Segmenting 3D objects that move in egocentric videos

SAN for Product Attributes Prediction

The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

A Python library for common tasks on 3D point clouds