RoIAlign & crop_and_resize for PyTorch

RoIAlign for PyTorch

This is a PyTorch version of RoIAlign. The implementation is based on crop_and_resize and supports both the forward and backward passes on CPU and GPU.

NOTE: Thanks to meikuam for updating this repo for PyTorch 1.0. You can find the original version for torch <= 0.4.1 in the pytorch_0.4 branch.

Introduction

The crop_and_resize function is ported from TensorFlow and has the same interface as the TensorFlow version, except that the input feature map must be in NCHW order in PyTorch. The two implementations also produce the same output values (error < 1e-5) for both the forward and backward passes, as expected; see the comparison in test.py.

Note: The documentation for crop_and_resize can be found in the TensorFlow documentation. RoIAlign is a wrapper around crop_and_resize that takes boxes as unnormalized (x1, y1, x2, y2) coordinates, whereas crop_and_resize takes normalized (y1, x1, y2, x2) coordinates. For more details about the difference between RoIAlign and crop_and_resize, see tensorpack.
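The difference in box formats can be illustrated with a small pure-Python sketch. The helper name `xyxy_to_normalized_yxyx` is hypothetical and not part of this package. It assumes the TensorFlow convention that a normalized coordinate v maps to the pixel coordinate v * (size - 1), and it only covers reordering and normalization, not any extra sub-pixel alignment that RoIAlign may apply on top.

```python
def xyxy_to_normalized_yxyx(box, image_height, image_width):
    """Convert an unnormalized (x1, y1, x2, y2) box to normalized (y1, x1, y2, x2).

    Assumption: a normalized coordinate v corresponds to the pixel
    coordinate v * (size - 1), so we divide by (size - 1).
    """
    x1, y1, x2, y2 = box
    return (
        y1 / (image_height - 1),
        x1 / (image_width - 1),
        y2 / (image_height - 1),
        x2 / (image_width - 1),
    )


# A box spanning x in [1, 5] and y in [0, 4] on a 7x7 feature map.
normalized = xyxy_to_normalized_yxyx((1, 0, 5, 4), image_height=7, image_width=7)
print(normalized)
```

Note that both the coordinate order (x first versus y first) and the scale (pixels versus the [0, 1] range) change between the two formats.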

Warning: Currently this only works on the default GPU (index 0).

Usage

  • Install and test

    python setup.py install
    ./test.sh
    
  • Use RoIAlign or crop_and_resize

    Since PyTorch 1.2.0, legacy autograd functions with a non-static forward method are deprecated. This implementation uses new-style autograd functions with a static forward method. Example:

    import torch
    from roi_align import RoIAlign      # RoIAlign module
    from roi_align import CropAndResize # crop_and_resize module
    
    # input feature maps (suppose that we have batch_size==2)
    image = torch.arange(0., 49).view(1, 1, 7, 7).repeat(2, 1, 1, 1)
    image[0] += 10
    print('image: ', image)
    
    
    # Two bboxes in unnormalized (x1, y1, x2, y2) format
    # (the first belongs to batch item 0, the second to batch item 1).
    boxes = torch.Tensor([[1, 0, 5, 4],
                         [0.5, 3.5, 4, 7]])
    
    box_index = torch.tensor([0, 1], dtype=torch.int) # batch index of each bbox
    
    # RoIAlign layer with crop sizes:
    crop_height = 4
    crop_width = 4
    roi_align = RoIAlign(crop_height, crop_width)
    
    # make crops:
    crops = roi_align(image, boxes, box_index)
    
    print('crops:', crops)

    Output:

    image:  tensor([[[[10., 11., 12., 13., 14., 15., 16.],
          [17., 18., 19., 20., 21., 22., 23.],
          [24., 25., 26., 27., 28., 29., 30.],
          [31., 32., 33., 34., 35., 36., 37.],
          [38., 39., 40., 41., 42., 43., 44.],
          [45., 46., 47., 48., 49., 50., 51.],
          [52., 53., 54., 55., 56., 57., 58.]]],
    
    
        [[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.],
          [ 7.,  8.,  9., 10., 11., 12., 13.],
          [14., 15., 16., 17., 18., 19., 20.],
          [21., 22., 23., 24., 25., 26., 27.],
          [28., 29., 30., 31., 32., 33., 34.],
          [35., 36., 37., 38., 39., 40., 41.],
          [42., 43., 44., 45., 46., 47., 48.]]]])
          
    crops: tensor([[[[11.0000, 12.0000, 13.0000, 14.0000],
              [18.0000, 19.0000, 20.0000, 21.0000],
              [25.0000, 26.0000, 27.0000, 28.0000],
              [32.0000, 33.0000, 34.0000, 35.0000]]],
    
    
            [[[24.5000, 25.3750, 26.2500, 27.1250],
              [30.6250, 31.5000, 32.3750, 33.2500],
              [36.7500, 37.6250, 38.5000, 39.3750],
              [ 0.0000,  0.0000,  0.0000,  0.0000]]]])
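The crop values above can be checked by hand with a pure-Python sketch. It assumes each box side is split into equal bins and that one bilinear sample is taken at the centre of each bin in pixel coordinates, with samples outside the feature map replaced by 0. This is an inference from the printed output, not a description of the compiled kernel, and the helper names (`sample_positions`, `bilinear`, `roi_align_2d`) are hypothetical.

```python
import math


def sample_positions(start, end, crop_size):
    """Pixel coordinates of one sample per bin, for crop_size equal bins of [start, end]."""
    spacing = (end - start) / crop_size
    return [start + spacing / 2 - 0.5 + i * spacing for i in range(crop_size)]


def bilinear(feature, y, x, extrapolation_value=0.0):
    """Bilinearly interpolate a 2-D list at (y, x); out-of-range points get the fill value."""
    height, width = len(feature), len(feature[0])
    if y < 0 or y > height - 1 or x < 0 or x > width - 1:
        return extrapolation_value
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0
    top = feature[y0][x0] * (1 - wx) + feature[y0][x1] * wx
    bottom = feature[y1][x0] * (1 - wx) + feature[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def roi_align_2d(feature, box, crop_height, crop_width):
    """Crop a single-channel feature map with an unnormalized (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    ys = sample_positions(y1, y2, crop_height)
    xs = sample_positions(x1, x2, crop_width)
    return [[bilinear(feature, y, x) for x in xs] for y in ys]


# Same 7x7 feature map as the second batch item in the example (values 0..48).
feature = [[float(7 * r + c) for c in range(7)] for r in range(7)]
crop = roi_align_2d(feature, (0.5, 3.5, 4, 7), crop_height=4, crop_width=4)
for row in crop:
    print(row)
```

Under these assumptions the first three rows match the second crop above, and the last row is all zeros because its sample row (y = 6.0625) lies just below the last feature-map row (y = 6).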
Owner
Long Chen