Decorators for maximizing memory utilization with PyTorch & CUDA

Overview

torch-max-mem

Tests Cookiecutter template from @cthoyt PyPI PyPI - Python Version PyPI - License Documentation Status Code style: black

This package provides decorators for memory utilization maximization with PyTorch and CUDA by starting with a maximum parameter size and applying successive halving until no more out-of-memory exception occurs.

๐Ÿ’ช Getting Started

Assume you have a function for batched computation of nearest neighbors using brute-force distance calculation.

import torch

def knn(x, y, batch_size, k: int = 3):
    return torch.cat(
        [
            torch.cdist(x[start : start + batch_size], y).topk(k=k, dim=1, largest=False).indices
            for start in range(0, x.shape[0], batch_size)
        ],
        dim=0,
    )

With torch_max_mem you can decorate this function to reduce the batch size until no more out-of-memory error occurs.

import torch
from torch_max_mem import maximize_memory_utilization


@maximize_memory_utilization(parameter_name="batch_size")
def knn(x, y, batch_size, k: int = 3):
    return torch.cat(
        [
            torch.cdist(x[start : start + batch_size], y).topk(k=k, dim=0, largest=False).indices
            for start in range(0, x.shape[0], batch_size)
        ],
        dim=0,
    )

In the code, you can now always pass the largest sensible batch size, e.g.,

x = torch.rand(100, 100, device="cuda")
y = torch.rand(200, 100, device="cuda")
knn(x, y, batch_size=x.shape[0])

๐Ÿš€ Installation

The most recent release can be installed from PyPI with:

$ pip install torch_max_mem

The most recent code and data can be installed directly from GitHub with:

$ pip install git+https://github.com/mberr/torch-max-mem.git

To install in development mode, use the following:

$ git clone git+https://github.com/mberr/torch-max-mem.git
$ cd torch-max-mem
$ pip install -e .

๐Ÿ‘ Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

๐Ÿ‘‹ Attribution

Parts of the logic have been developed with Laurent Vermue for PyKEEN.

โš–๏ธ License

The code in this package is licensed under the MIT License.

๐Ÿช Cookiecutter

This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.

๐Ÿ› ๏ธ For Developers

See developer instrutions

The final section of the README is for if you want to get involved by making a code contribution.

๐Ÿฅผ Testing

After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:

$ tox

Additionally, these tests are automatically re-run with each commit in a GitHub Action.

๐Ÿ“– Building the Documentation

$ tox -e docs

๐Ÿ“ฆ Making a Release

After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:

$ tox -e finish

This script does the following:

  1. Uses Bump2Version to switch the version number in the setup.cfg and src/torch_max_mem/version.py to not have the -dev suffix
  2. Packages the code in both a tar archive and a wheel
  3. Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step
  4. Push to GitHub. You'll need to make a release going with the commit where the version was bumped.
  5. Bump the version to the next patch. If you made big changes and want to bump the version by minor, you can use tox -e bumpversion minor after.
You might also like...
Picasso: A CUDA-based Library for Deep Learning over 3D Meshes

The Picasso Library is intended for complex real-world applications with large-scale surfaces, while it also performs impressively on the small-scale applications over synthetic shape manifolds. We have upgraded the point cloud modules of SPH3D-GCN from homogeneous to heterogeneous representations, and included the upgraded modules into this latest work as well. We are happy to announce that the work is accepted to IEEE CVPR2021.

This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures

Introduction This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures. @inproceedings{Wa

Example repository for custom C++/CUDA operators for TorchScript

Custom TorchScript Operators Example This repository contains examples for writing, compiling and using custom TorchScript operators. See here for the

Convert Python 3 code to CUDA code.

Py2CUDA Convert python code to CUDA. Usage To convert a python file say named py_file.py to CUDA, run python generate_cuda.py --file py_file.py --arch

This demo showcase the use of onnxruntime-rs with a GPU on CUDA 11 to run Bert in a data pipeline with Rust.

Demo BERT ONNX pipeline written in rust This demo showcase the use of onnxruntime-rs with a GPU on CUDA 11 to run Bert in a data pipeline with Rust. R

LightSeq is a high performance training and inference library for sequence processing and generation implemented in CUDA
CUDA Python Low-level Bindings

CUDA Python Low-level Bindings

A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

What Dead simple python wrapper for Yolo V3 using AlexyAB's darknet fork. Works with CUDA 10.1 and OpenCV 4.1 or later (I use OpenCV master as of Jun

An addernet CUDA version

Training addernet accelerated by CUDA Usage cd adder_cuda python setup.py install cd .. python main.py Environment pytorch 1.10.0 CUDA 11.3 benchmark

Comments
  • Import error

    Import error

    When trying to run the example from the README, I currently get the following error

    Traceback (most recent call last):
      File ".../torch_max_mem/tmp.py", line 2, in <module>
        from torch_max_mem import maximize_memory_utilization
    ModuleNotFoundError: No module named 'torch_max_mem'
    

    When I check pip list, the package name appears to be the stylized name

    $ pip list | grep max
    torch-max-mem     0.0.1.dev0 .../torch_max_mem/src
    
    opened by mberr 2
  • Add simplified key hasher

    Add simplified key hasher

    This PR adds a simplification for creating hashers based on the values associated to a subse of keys without having to define a lambda or named function.

    opened by mberr 1
  • Code fails for KEYWORD_ONLY params

    Code fails for KEYWORD_ONLY params

    The following snippet

    from torch_max_mem import maximize_memory_utilization
    
    
    @maximize_memory_utilization()
    def func(a, *bs, batch_size: int):
        pass
    

    raises an error

    Traceback (most recent call last):
      File ".../tmp.py", line 5, in <module>
        def func(a, *bs, batch_size: int):
      File ".../venv/venv-cpu/lib/python3.8/site-packages/torch_max_mem/api.py", line 274, in __call__
        wrapped = maximize_memory_utilization_decorator(
      File ".../venv/venv-cpu/lib/python3.8/site-packages/torch_max_mem/api.py", line 150, in decorator_maximize_memory_utilization
        raise ValueError(f"{parameter_name} must be a keyword based parameter, but is {_parameter.kind}.")
    ValueError: batch_size must be a keyword based parameter, but is KEYWORD_ONLY.
    

    since _parameter.kind is KEYWORD_ONLY.

    This is overly restrictive, since we only need keyword-based parameters.

    opened by mberr 0
  • stateful decorator

    stateful decorator

    Add a decorator which remembers to maximum parameter value for next time. Since this is handled internally, we do not need to expose the found parameter value to the outside, leaving the method signature unchanged.

    opened by mberr 0
Releases(v0.0.4)
  • v0.0.4(Aug 18, 2022)

    What's Changed

    • Fix ad hoc key hashing by @mberr in https://github.com/mberr/torch-max-mem/pull/7
    • Fix default value handling by @mberr in https://github.com/mberr/torch-max-mem/pull/8

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.3...v0.0.4

    Source code(tar.gz)
    Source code(zip)
  • v0.0.3(Aug 18, 2022)

    What's Changed

    • Fix keyword only params by @mberr in https://github.com/mberr/torch-max-mem/pull/6

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.2...v0.0.3

    Source code(tar.gz)
    Source code(zip)
  • v0.0.2(May 6, 2022)

    What's Changed

    • Add simplified key hasher by @mberr in https://github.com/mberr/torch-max-mem/pull/3
    • Update README & doc by @mberr in https://github.com/mberr/torch-max-mem/pull/4

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.1...v0.0.2

    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Feb 1, 2022)

Data, notebooks, and articles associated with the RSNA AI Deep Learning Lab at RSNA 2021

RSNA AI Deep Learning Lab 2021 Intro Welcome Deep Learners! This document provides all the information you need to participate in the RSNA AI Deep Lea

RSNA 65 Dec 16, 2022
[ICLR2021oral] Rethinking Architecture Selection in Differentiable NAS

DARTS-PT Code accompanying the paper ICLR'2021: Rethinking Architecture Selection in Differentiable NAS Ruochen Wang, Minhao Cheng, Xiangning Chen, Xi

Ruochen Wang 86 Dec 27, 2022
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

Vision Research Lab @ UCSB 26 Nov 29, 2022
The code release of paper Low-Light Image Enhancement with Normalizing Flow

[AAAI 2022] Low-Light Image Enhancement with Normalizing Flow Paper | Project Page Low-Light Image Enhancement with Normalizing Flow Yufei Wang, Renji

Yufei Wang 176 Jan 06, 2023
๐Ÿ† The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval ๐Ÿ† The 1st Place Submission to AICity Challenge 2021 Natural

82 Dec 29, 2022
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22) Paper Link | Project Page Abstract : Manual an

Mohamed Afham 152 Dec 23, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

MI-AOD Language: ็ฎ€ไฝ“ไธญๆ–‡ | English Introduction This is the code for Multiple Instance Active Learning for Object Detection (The PDF is not available tem

Tianning Yuan 269 Dec 21, 2022
Demonstrates iterative FGSM on Apple's NeuralHash model.

apple-neuralhash-attack Demonstrates iterative FGSM on Apple's NeuralHash model. TL;DR: It is possible to apply noise to CSAM images and make them loo

Lim Swee Kiat 11 Jun 23, 2022
Offcial repository for the IEEE ICRA 2021 paper Auto-Tuned Sim-to-Real Transfer.

Offcial repository for the IEEE ICRA 2021 paper Auto-Tuned Sim-to-Real Transfer.

47 Jun 30, 2022
Neural Fixed-Point Acceleration for Convex Optimization

Licensing The majority of neural-scs is licensed under the CC BY-NC 4.0 License, however, portions of the project are available under separate license

Facebook Research 27 Oct 06, 2022
PyTorch implementation of EigenGAN

PyTorch Implementation of EigenGAN Train python train.py [image_folder_path] --name [experiment name] Test python test.py [ckpt path] --traverse FFH

62 Nov 12, 2022
Rafael Project- Classifying rockets to different types using data science algorithms.

Rocket-Classify Rafael Project- Classifying rockets to different types using data science algorithms. In this project we received data base with data

Hadassah Engel 5 Sep 18, 2021
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

WebDataset WebDataset is a PyTorch Dataset (IterableDataset) implementation providing efficient access to datasets stored in POSIX tar archives and us

1.1k Jan 08, 2023
BERT model training impelmentation using 1024 A100 GPUs for MLPerf Training v1.1

Pre-trained checkpoint and bert config json file Location of checkpoint and bert config json file This MLCommons members Google Drive location contain

SAIT (Samsung Advanced Institute of Technology) 12 Apr 27, 2022
Official implementation of Deep Burst Super-Resolution

Deep-Burst-SR Official implementation of Deep Burst Super-Resolution Publication: Deep Burst Super-Resolution. Goutam Bhat, Martin Danelljan, Luc Van

Goutam Bhat 113 Dec 19, 2022
ExCon: Explanation-driven Supervised Contrastive Learning

ExCon: Explanation-driven Supervised Contrastive Learning Contributors of this repo: Zhibo Zhang ( Zhibo (Darren) Zhang 18 Nov 01, 2022

The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

NTIRE 2022 - Image Inpainting Challenge Important dates 2022.02.01: Release of train data (input and output images) and validation data (only input) 2

Andrรฉs Romero 37 Nov 27, 2022
This library is a location of the LegacyLogger for PyTorch Lightning.

neptune-contrib Documentation See neptune-contrib documentation site Installation Get prerequisites python versions 3.5.6/3.6 are supported Install li

neptune.ai 26 Oct 07, 2021
Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction".

TGIN Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction". Files in the folder dataset/ electr

Alibaba 21 Dec 21, 2022
An atmospheric growth and evolution model based on the EVo degassing model and FastChem 2.0

EVolve Linking planetary mantles to atmospheric chemistry through volcanism using EVo and FastChem. Overview EVolve is a linked mantle degassing and a

Pip Liggins 2 Jan 17, 2022