Pytorch cuda extension of grid_sample1d

Overview

Grid Sample 1d

pytorch cuda extension of grid sample 1d. Since pytorch only supports grid sample 2d/3d, I extend the 1d version for efficiency. The forward pass is 2~3x faster than pytorch grid sample.

setup

  • Pytorch == 1.7.1
  • CUDA == 10.1

Other versions of pytorch or cuda may work but I haven't test.

you can choose to manually build it or use JIT

Build

python setup.py install

JIT

comment import grid_sample1d_cuda as grid_sample1d in op.py

uncomment

grid_sample1d = load(
    'grid_sample1d_cuda', ['grid_sample1d_cuda.cpp', 'grid_sample1d_cuda_kernel.cu'], verbose=True)

in op.py

Usage

import torch
from grid_sample1d import GridSample1d

grid_sample1d = GridSample1d(padding_mode=True, align_corners=True)
N = 16
C = 256
L_in = 64
L_out = 128
input = torch.randn((N, C, L_in)).cuda()
grids = torch.randn((N, L_out)).cuda()
output = grid_sample1d(input, grids)

Options are

  • padding_mode: True for border padding, False for zero padding
  • align_corners: same with align_corners in torch.nn.functional.grid_sample

difference

In forward pass, calculation on the channel dim C is parallel, which is serial in torch.nn.functional.grid_sample. Parallel calculation on C may cause round off error in backward. But for now, I found it doesn't influence the forward pass.

Test

Accuracy Test

Since grid sample 1d is a special case of grid sample 2d in most cases (not true when padding_mode & align_corners are both False). I test the accuracy of the implemented grid sample based on torch.nn.functional.grid_sample.

import torch
import torch.nn.functional as F


def gridsample1d_by2d(input, grid, padding_mode, align_corners):
    shape = grid.shape
    input = input.unsqueeze(-1)  # batch_size * C * L_in * 1
    grid = grid.unsqueeze(1)  # batch_size * 1 * L_out
    grid = torch.stack([-torch.ones_like(grid), grid], dim=-1)
    z = F.grid_sample(input, grid, padding_mode=padding_mode, align_corners=align_corners)
    C = input.shape[1]
    out_shape = [shape[0], C, shape[1]]
    z = z.view(*out_shape)  # batch_size * C * L_out
    return z

It is recommended to test on your computer because I only test it on CUDA 10.1 GTX 1080Ti

python test/acc_benchmark.py

Both the forward and the backward results are identical except for align_corners=True, padding_mode=False. It may be caused by round off error when we sum series float numbers in different orders.

Deterministic Test

It is very important to do deterministic test since the associative law is no more applied for the calculation of float numbers on computers.

python test/check_deterministic.py

Note

When padding_mode & align_corners are both False, we cannot regard grid sample 1d as a special case of grid sample 2d in pytorch. I have checked the cuda kernel of grid_sample in Pytorch. When padding_mode & align_corners are both False, the output of torch.nn.functional.grid_sample will be half of the expected. Hope it can be fixed one day.

CPU support

Too lazy to support

speed & memory cost

Here are the speed test results on different size of input

references

Owner
lyricpoem
lyricpoem
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023
Repository for training material for the 2022 SDSC HPC/CI User Training Course

hpc-training-2022 Repository for training material for the 2022 SDSC HPC/CI Training Series HPC/CI Training Series home https://www.sdsc.edu/event_ite

sdsc-hpc-training-org 21 Jul 27, 2022
PyTorch implementation of PP-LCNet

PP-LCNet-Pytorch Pre-Trained Models Google Drive p018 Accuracy Models Top1 Top5 PPLCNet_x0_25 0.5186 0.7565 PPLCNet_x0_35 0.5809 0.8083 PPLCNet_x0_5 0

24 Dec 12, 2022
Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples"

KSTER Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples" [paper]. Usage Download the processed datas

jiangqn 23 Nov 24, 2022
Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model

Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model. Designed sample dashboard with insights and recommendation for

Yash 2 Apr 07, 2022
PyTorch-centric library for evaluating and enhancing the robustness of AI technologies

Responsible AI Toolbox A library that provides high-quality, PyTorch-centric tools for evaluating and enhancing both the robustness and the explainabi

24 Dec 22, 2022
Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods."

pv_predict_unet-lstm Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods." IEEE Transactions

FolkScientistInDL 8 Oct 08, 2022
FastyAPI is a Stack boilerplate optimised for heavy loads.

FastyAPI A FastAPI based Stack boilerplate for heavy loads. Explore the docs » View Demo · Report Bug · Request Feature Table of Contents About The Pr

Ali Chaayb 47 Dec 27, 2022
The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

sianchen 22 May 28, 2022
ViSD4SA, a Vietnamese Span Detection for Aspect-based sentiment analysis dataset

UIT-ViSD4SA PACLIC 35 General Introduction This repository contains the data of the paper: Span Detection for Vietnamese Aspect-Based Sentiment Analys

Nguyễn Thị Thanh Kim 5 Nov 13, 2022
Repository to run object detection on a model trained on an autonomous driving dataset.

Autonomous Driving Object Detection on the Raspberry Pi 4 Description of Repository This repository contains code and instructions to configure the ne

Ethan 51 Nov 17, 2022
Amazing-Python-Scripts - 🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.

📑 Introduction A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts. This is your Personal space to fin

Avinash Ranjan 1.1k Dec 29, 2022
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation

OoD_Gen-Chest_Xray Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation Requirements (Installations) Install the following libra

Enoch Tetteh 2 Oct 01, 2022
Robustness between the worst and average case

Robustness between the worst and average case A repository that implements intermediate robustness training and evaluation from the NeurIPS 2021 paper

CMU Locus Lab 16 Dec 02, 2022
Publication describing 3 ML examples at NSLS-II and interfacing into Bluesky

Machine learning enabling high-throughput and remote operations at large-scale user facilities. Overview This repository contains the source code and

BNL 4 Sep 24, 2022
Boosted CVaR Classification (NeurIPS 2021)

Boosted CVaR Classification Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar NeurIPS 2021 Table of Contents Quick Start Train

Runtian Zhai 4 Feb 15, 2022
This repository provides an efficient PyTorch-based library for training deep models.

s3sec Test AWS S3 buckets for read/write/delete access This tool was developed to quickly test a list of s3 buckets for public read, write and delete

Bytedance Inc. 123 Jan 05, 2023
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022
Open source annotation tool for machine learning practitioners.

doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ

7.1k Jan 01, 2023