PSANet: Point-wise Spatial Attention Network for Scene Parsing, ECCV2018.

Related tags

Deep LearningPSANet
Overview

PSANet: Point-wise Spatial Attention Network for Scene Parsing (in construction)

by Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia, details are in project page.

Introduction

This repository is build for PSANet, which contains source code for PSA module and related evaluation code. For installation, please merge the related layers and follow the description in PSPNet repository (test with CUDA 7.0/7.5 + cuDNN v4).

PyTorch Version

Highly optimized PyTorch codebases available for semantic segmentation in repo: semseg, including full training and testing codes for PSPNet and PSANet.

Usage

  1. Clone the repository recursively:

    git clone --recursive https://github.com/hszhao/PSANet.git
  2. Merge the caffe layers into PSPNet repository:

    Point-wise spatial attention: pointwise_spatial_attention_layer.hpp/cpp/cu and caffe.proto.

  3. Build Caffe and matcaffe:

    cd $PSANET_ROOT/PSPNet
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
    cd ..
  4. Evaluation:

    • Evaluation code is in folder 'evaluation'.

    • Download trained models and put them in related dataset folder under 'evaluation/model', refer 'README.md'.

    • Modify the related paths in 'eval_all.m':

      Mainly variables 'data_root' and 'eval_list', and your image list for evaluation should be similarity to that in folder 'evaluation/samplelist' if you use this evaluation code structure.

    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  5. Results:

    Predictions will show in folder 'evaluation/mc_result' and the expected scores are listed as below:

    (mIoU/pAcc. stands for mean IoU and pixel accuracy, 'ss' and 'ms' denote single scale and multiple scale testing.)

    ADE20K:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 train val 41.92/80.17 42.97/80.92 a8e884
    PSANet101 train val 42.75/80.71 43.77/81.51 ab5e56

    VOC2012:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 train_aug val 77.24/94.88 78.14/95.12 d5fc37
    PSANet101 train_aug val 78.51/95.18 79.77/95.43 5d8c0f
    PSANet101 COCO + train_aug + val test -/- 85.7/- 3c6a69

    Cityscapes:

    network training data testing data mIoU/pAcc.(ss) mIoU/pAcc.(ms) md5sum
    PSANet50 fine_train fine_val 76.65/95.99 77.79/96.24 25c06a
    PSANet101 fine_train fine_val 77.94/96.10 79.05/96.30 3ac1bf
    PSANet101 fine_train fine_test -/- 78.6/- 3ac1bf
    PSANet101 fine_train + fine_val fine_test -/- 80.1/- 1dfc91
  6. Demo video:

    • Video processed by PSANet (with PSPNet) on BDD dataset for drivable area segmentation: Video.

Citation

If PSANet is useful for your research, please consider citing:

@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Questions

Please contact '[email protected]' or '[email protected]'

An self sufficient AI that crawls the web to learn how to generate art from keywords

Roxx-IO - The Smart Artist AI! TO DO / IDEAS Implement Web-Scraping Functionality Figure out a less annoying (and an off button for it) text to speech

Tatz 5 Mar 21, 2022
Collections for the lasted paper about multi-view clustering methods (papers, codes)

Multi-View Clustering Papers Collections for the lasted paper about multi-view clustering methods (papers, codes). There also exists some repositories

Andrew Guan 10 Sep 20, 2022
Stream images from a connected camera over MQTT, view using Streamlit, record to file and sqlite

mqtt-camera-streamer Summary: Publish frames from a connected camera or MJPEG/RTSP stream to an MQTT topic, and view the feed in a browser on another

Robin Cole 183 Dec 16, 2022
f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation [Paper] [PyTorch] [MXNet] [Video] This repository provides code for training

Visual Understanding Lab @ Samsung AI Center Moscow 516 Dec 21, 2022
GLNet for Memory-Efficient Segmentation of Ultra-High Resolution Images

GLNet for Memory-Efficient Segmentation of Ultra-High Resolution Images Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-

VITA 298 Dec 12, 2022
Learning to Identify Top Elo Ratings with A Dueling Bandits Approach

Learning to Identify Top Elo Ratings We propose two algorithms MaxIn-Elo and MaxIn-mElo to solve the top players identification on the transitive and

2 Jan 14, 2022
Pytorch implement of 'Unmixing based PAN guided fusion network for hyperspectral imagery'

Pgnet There's a improved version compared with the publication in Tgrs with the modification in the deduction of the PDIN block: https://arxiv.org/abs

5 Jul 01, 2022
tmm_fast is a lightweight package to speed up optical planar multilayer thin-film device computation.

tmm_fast tmm_fast or transfer-matrix-method_fast is a lightweight package to speed up optical planar multilayer thin-film device computation. It is es

26 Dec 11, 2022
Official pytorch implement for “Transformer-Based Source-Free Domain Adaptation”

Official implementation for TransDA Official pytorch implement for “Transformer-Based Source-Free Domain Adaptation”. Overview: Result: Prerequisites:

stanley 54 Dec 22, 2022
Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement (NeurIPS 2020)

MTTS-CAN: Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement Paper Xin Liu, Josh Fromm, Shwetak Patel, Daniel M

Xin Liu 106 Dec 30, 2022
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"

StyleAttack Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer" Prepare Pois

THUNLP 19 Nov 20, 2022
GluonMM is a library of transformer models for computer vision and multi-modality research

GluonMM is a library of transformer models for computer vision and multi-modality research. It contains reference implementations of widely adopted baseline models and also research work from Amazon

42 Dec 02, 2022
General Multi-label Image Classification with Transformers

General Multi-label Image Classification with Transformers Jack Lanchantin, Tianlu Wang, Vicente Ordóñez Román, Yanjun Qi Conference on Computer Visio

QData 154 Dec 21, 2022
https://arxiv.org/abs/2102.11005

LogME LogME: Practical Assessment of Pre-trained Models for Transfer Learning How to use Just feed the features f and labels y to the function, and yo

THUML: Machine Learning Group @ THSS 149 Dec 19, 2022
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos

Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This rep

Grant Sanderson 49k Jan 09, 2023
Implementation supporting the ICCV 2017 paper "GANs for Biological Image Synthesis"

GANs for Biological Image Synthesis This codes implements the ICCV-2017 paper "GANs for Biological Image Synthesis". The paper and its supplementary m

Anton Osokin 95 Nov 25, 2022
商品推荐系统

商品top50推荐系统 问题建模 本项目的数据集给出了15万左右的用户以及12万左右的商品, 以及对应的经过脱敏处理的用户特征和经过预处理的商品特征,旨在为用户推荐50个其可能购买的商品。 推荐系统架构方案 本项目采用传统的召回+排序的方案。

107 Dec 29, 2022
PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transfromers.

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
Project repo for Learning Category-Specific Mesh Reconstruction from Image Collections

Learning Category-Specific Mesh Reconstruction from Image Collections Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik University

438 Dec 22, 2022
Real-time Object Detection for Streaming Perception, CVPR 2022

StreamYOLO Real-time Object Detection for Streaming Perception Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li, Sun Jian Real-time Object Detection

Jinrong Yang 237 Dec 27, 2022