PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision.

Overview

PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

@misc{CV2018,
  author =       {Donny You ([email protected])},
  howpublished = {\url{https://github.com/donnyyou/PyTorchCV}},
  year =         {2018}
}

This repository provides source code for some deep learning based cv problems. We'll do our best to keep this repository up to date. If you do find a problem about this repository, please raise it as an issue. We will fix it immediately.

Implemented Papers

  • Image Classification

    • VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
    • ResNet: Deep Residual Learning for Image Recognition
    • DenseNet: Densely Connected Convolutional Networks
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
    • ShuffleNet V2: Practical Guidelines for Ecient CNN Architecture Design
  • Semantic Segmentation

    • DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
    • PSPNet: Pyramid Scene Parsing Network
    • DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
  • Object Detection

    • SSD: Single Shot MultiBox Detector
    • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
    • YOLOv3: An Incremental Improvement
    • FPN: Feature Pyramid Networks for Object Detection
  • Pose Estimation

    • CPM: Convolutional Pose Machines
    • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
  • Instance Segmentation

    • Mask R-CNN

Performances with PyTorchCV

Image Classification

  • ResNet: Deep Residual Learning for Image Recognition

Semantic Segmentation

  • PSPNet: Pyramid Scene Parsing Network
Model Backbone Training data Testing data mIOU Pixel Acc Setting
PSPNet Origin 3x3-ResNet101 ADE20K train ADE20K val 41.96 80.64 -
PSPNet Ours 7x7-ResNet101 ADE20K train ADE20K val 44.18 80.91 PSPNet

Object Detection

  • SSD: Single Shot MultiBox Detector
Model Backbone Training data Testing data mAP FPS Setting
SSD-300 Origin VGG16 VOC07+12 trainval VOC07 test 0.772 - -
SSD-300 Ours VGG16 VOC07+12 trainval VOC07 test 0.786 - SSD300
SSD-512 Origin VGG16 VOC07+12 trainval VOC07 test 0.798 - -
SSD-512 Ours VGG16 VOC07+12 trainval VOC07 test 0.808 - SSD512
  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Model Backbone Training data Testing data mAP FPS Setting
Faster R-CNN Origin VGG16 VOC07 trainval VOC07 test 0.699 - -
Faster R-CNN Ours VGG16 VOC07 trainval VOC07 test 0.706 - Faster R-CNN
  • YOLOv3: An Incremental Improvement

Pose Estimation

  • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation

  • Mask R-CNN

Commands with PyTorchCV

Take PSPNet as an example. ("tag" could be any string, include an empty one.)

  • Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Resume Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Validate
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag
  • Testing:
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag

Examples with PyTorchCV

Example output of VGG19-OpenPose

Example output of VGG19-OpenPose

Combinatorial model of ligand-receptor binding

Combinatorial model of ligand-receptor binding The binding of ligands to receptors is the starting point for many import signal pathways within a cell

Mobolaji Williams 0 Jan 09, 2022
Real Time Object Detection and Classification using Yolo Algorithm.

Real time Object detection & Classification using YOLO algorithm. Real Time Object Detection and Classification using Yolo Algorithm. What is Object D

Ketan Chawla 1 Apr 17, 2022
Zeyuan Chen, Yangchao Wang, Yang Yang and Dong Liu.

Principled S2R Dehazing This repository contains the official implementation for PSD Framework introduced in the following paper: PSD: Principled Synt

zychen 78 Dec 30, 2022
A modular, research-friendly framework for high-performance and inference of sequence models at many scales

T5X T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of

Google Research 1.1k Jan 08, 2023
Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.

Talk-to-Edit (ICCV2021) This repository contains the implementation of the following paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog Yumin

Yuming Jiang 221 Jan 07, 2023
A deep learning based semantic search platform that computes similarity scores between provided query and documents

semanticsearch This is a deep learning based semantic search platform that computes similarity scores between provided query and documents. Documents

1 Nov 30, 2021
tf2-keras implement yolov5

YOLOv5 in tesnorflow2.x-keras yolov5数据增强jupyter示例 Bilibili视频讲解地址: 《yolov5 解读,训练,复现》 Bilibili视频讲解PPT文件: yolov5_bilibili_talk_ppt.pdf Bilibili视频讲解PPT文件:

yangcheng 254 Jan 08, 2023
(ICONIP 2020) MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image

MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image This repo contains the source code for MobileHand, real-time estimation of 3D

90 Dec 12, 2022
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Depth-supervised NeRF: Fewer Views and Faster Training for Free Project | Paper | YouTube Pytorch implementation of our method for learning neural rad

524 Jan 08, 2023
Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimization"

Riggable 3D Face Reconstruction via In-Network Optimization Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimizati

130 Jan 02, 2023
A proof of concept ai-powered Recaptcha v2 solver

Recaptcha Fullauto I've decided to open source my old Recaptcha v2 solver. My latest version will be opened sourced this summer. I am hoping this proj

Nate 60 Dec 20, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
Directed Greybox Fuzzing with AFL

AFLGo: Directed Greybox Fuzzing AFLGo is an extension of American Fuzzy Lop (AFL). Given a set of target locations (e.g., folder/file.c:582), AFLGo ge

380 Nov 24, 2022
This is the dataset and code release of the OpenRooms Dataset.

This is the dataset and code release of the OpenRooms Dataset.

Visual Intelligence Lab of UCSD 95 Jan 08, 2023
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 03, 2023
Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization

Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization Code for reproducing our results in the Head2Toe paper. Paper: arxiv.or

Google Research 62 Dec 12, 2022
Official pytorch code for SSAT: A Symmetric Semantic-Aware Transformer Network for Makeup Transfer and Removal

SSAT: A Symmetric Semantic-Aware Transformer Network for Makeup Transfer and Removal This is the official pytorch code for SSAT: A Symmetric Semantic-

ForeverPupil 57 Dec 13, 2022
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Nonuniform-to-Uniform Quantization This repository contains the training code of N2UQ introduced in our CVPR 2022 paper: "Nonuniform-to-Uniform Quanti

Zechun Liu 60 Dec 28, 2022
Chinese Advertisement Board Identification(Pytorch)

Chinese-Advertisement-Board-Identification. We use YoloV5 to extract the ROI of the location of the chinese word. Next, we sort the bounding box and recognize every chinese words which we extracted.

Li-Wei Hsiao 12 Jul 21, 2022