profile tools for pytorch nn models

Related tags

Text Data & NLPnnprof
Overview

nnprof

Introduction

nnprof is a profile tool for pytorch neural networks.

Features

  • multi profile mode: nnprof support 4 profile mode: Layer level, Operation level, Mixed level, Layer Tree level. Please check below for detail usage.
  • time and memory profile: nnprof support both time and memory profile now. But since memory profile is first supported in pytorch 1.6, please use torch version >= 1.6 for memory profile.
  • support sorted by given key and show profile percent: user could print table with percentage and sorted profile info using a given key, which is really helpful for optimiziing neural network.

Requirements

  • Python >= 3.6
  • PyTorch
  • Numpy

Get Started

install nnprof

  • pip install:
pip install nnprof
  • from source:
python -m pip install 'git+https://github.com/FateScript/nnprof.git'

# or install after clone this repo
git clone https://github.com/FateScript/nnprof.git
pip install -e nnprof

use nnprf

from nnprof import profile, ProfileMode
import torch
import torchvision

model = torchvision.models.alexnet(pretrained=False)
x = torch.rand([1, 3, 224, 224])

# mode could be anyone in LAYER, OP, MIXED, LAYER_TREE
mode = ProfileMode.LAYER

with profile(model, mode=mode) as prof:
    y = model(x)

print(prof.table(average=False, sorted_by="cpu_time"))
# table could be sorted by presented header.

Part of presented table looks like table below, Note that they are sorted by cpu_time.

╒══════════════════════╤═══════════════════╤═══════════════════╤════════╕
│ name                 │ self_cpu_time     │ cpu_time          │   hits │
╞══════════════════════╪═══════════════════╪═══════════════════╪════════╡
│ AlexNet.features.0   │ 19.114ms (34.77%) │ 76.383ms (45.65%) │      1 │
├──────────────────────┼───────────────────┼───────────────────┼────────┤
│ AlexNet.features.3   │ 5.148ms (9.37%)   │ 20.576ms (12.30%) │      1 │
├──────────────────────┼───────────────────┼───────────────────┼────────┤
│ AlexNet.features.8   │ 4.839ms (8.80%)   │ 19.336ms (11.56%) │      1 │
├──────────────────────┼───────────────────┼───────────────────┼────────┤
│ AlexNet.features.6   │ 4.162ms (7.57%)   │ 16.632ms (9.94%)  │      1 │
├──────────────────────┼───────────────────┼───────────────────┼────────┤
│ AlexNet.features.10  │ 2.705ms (4.92%)   │ 10.713ms (6.40%)  │      1 │
├──────────────────────┼───────────────────┼───────────────────┼────────┤

You are welcomed to try diffierent profile mode and more table format.

Contribution

Any issues and pull requests are welcomed.

Acknowledgement

Some thoughts of nnprof are inspired by torchprof and torch.autograd.profile . Many thanks to the authors.

Owner
Feng Wang
Cleaner @ Megvii
Feng Wang
Stuff related to Ben Eater's 8bit breadboard computer

8bit breadboard computer simulator This is an assembler + simulator/emulator of Ben Eater's 8bit breadboard computer. For a version with its RAM upgra

Marijn van Vliet 29 Dec 29, 2022
Pipeline for training LSA models using Scikit-Learn.

Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j

Dani El-Ayyass 23 Sep 05, 2022
Blazing fast language detection using fastText model

Luga A blazing fast language detection using fastText's language models Luga is a Swahili word for language. fastText provides a blazing fast language

Prayson Wilfred Daniel 18 Dec 20, 2022
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au

pyannote 2.2k Jan 09, 2023
Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Introduction This is a PyTorch implementation of the following research papers: (1) Hierarchical Text Generation and Planning for Strategic Dialogue (

Facebook Research 1.4k Dec 29, 2022
A repository to run gpt-j-6b on low vram machines (4.2 gb minimum vram for 2000 token context, 3.5 gb for 1000 token context). Model loading takes 12gb free ram.

Basic-UI-for-GPT-J-6B-with-low-vram A repository to run GPT-J-6B on low vram systems by using both ram, vram and pinned memory. There seem to be some

90 Dec 25, 2022
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents

BROS (BERT Relying On Spatiality) is a pre-trained language model focusing on text and layout for better key information extraction from documents. Given the OCR results of the document image, which

Clova AI Research 94 Dec 30, 2022
Pytorch NLP library based on FastAI

Quick NLP Quick NLP is a deep learning nlp library inspired by the fast.ai library It follows the same api as fastai and extends it allowing for quick

Agis pof 283 Nov 21, 2022
Bnagla hand written document digiiztion

Bnagla hand written document digiiztion This repo addresses the problem of digiizing hand written documents in Bangla. Documents have definite fields

Mushfiqur Rahman 1 Dec 10, 2021
100+ Chinese Word Vectors 上百种预训练中文词向量

Chinese Word Vectors 中文词向量 中文 This project provides 100+ Chinese Word Vectors (embeddings) trained with different representations (dense and sparse),

embedding 10.4k Jan 09, 2023
null

CP-Cluster Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection, Instance Segme

Yichun Shen 41 Dec 08, 2022
SDL: Synthetic Document Layout dataset

SDL is the project that synthesizes document images. It facilitates multiple-level labeling on document images and can generate in multiple languages.

Sơn Nguyễn 0 Oct 07, 2021
A programming language with logic of Python, and syntax of all languages.

Pytov The idea was to take all well known syntaxes, and combine them into one programming language with many posabilities. Installation Install using

Yuval Rosen 14 Dec 07, 2022
A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

RUCAIBox 124 Jan 02, 2023
History Aware Multimodal Transformer for Vision-and-Language Navigation

History Aware Multimodal Transformer for Vision-and-Language Navigation This repository is the official implementation of History Aware Multimodal Tra

Shizhe Chen 46 Nov 23, 2022
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.

State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0 🤗 Transformers provides thousands of pretrained models to perform tasks o

Hugging Face 77.3k Jan 03, 2023
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok

Hugging Face 6.2k Dec 31, 2022
Quantifiers and Negations in RE Documents

Quantifiers-and-Negations-in-RE-Documents This project was part of my work for a

Nicolas Ruscher 1 Feb 01, 2022
Türkçe küfürlü içerikleri bulan bir yapay zeka kütüphanesi / An ML library for profanity detection in Turkish sentences

"Kötü söz sahibine aittir." -Anonim Nedir? sinkaf uygunsuz yorumların bulunmasını sağlayan bir python kütüphanesidir. Farkı nedir? Diğer algoritmalard

KaraGoz 4 Feb 18, 2022
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

fluz 11 Nov 16, 2022