PyTorch extensions for fast R&D prototyping and Kaggle farming

Overview

Pytorch-toolbelt

pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch, built for fast R&D prototyping and Kaggle farming:

What's inside

  • Easy model building using flexible encoder-decoder architecture.
  • Modules: CoordConv, SCSE, Hypercolumn, Depthwise separable convolution and more.
  • GPU-friendly test-time augmentation (TTA) for segmentation and classification
  • GPU-friendly inference on huge (5000x5000) images
  • Every-day common routines (fix/restore random seed, filesystem utils, metrics)
  • Losses: BinaryFocalLoss, Focal, ReducedFocal, Lovasz, Jaccard and Dice losses, Wing Loss and more.
  • Extras for Catalyst library (Visualization of batch predictions, additional metrics)

Showcase: Catalyst, Albumentations, Pytorch Toolbelt example: Semantic Segmentation @ CamVid

Why

The honest answer is "I needed a convenient way to re-use code for my Kaggle career". During 2018 I earned a Kaggle Master badge, and it was a long path. Very often I found myself re-using the same old pipelines over and over again. At some point it crystallized into this repository.

This lib is not meant to replace catalyst / ignite / fast.ai high-level frameworks. Instead it's designed to complement them.

Installation

pip install pytorch_toolbelt

How do I ...

Model creation

Create Encoder-Decoder U-Net model

Below is a code snippet that creates a vanilla U-Net model for binary segmentation. By design, both encoder and decoder produce a list of tensors, from fine (high-resolution, indexed 0) to coarse (low-resolution) feature maps. Access to all intermediate feature maps is beneficial if you want to apply deep supervision losses on them, or in encoder-decoder architectures for object detection, where access to intermediate feature maps is necessary.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])
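
A minimal usage sketch of the class above (the 1x3x256x256 input is an arbitrary example; exact feature-map shapes depend on the encoder configuration):

import torch

model = UNet(input_channels=3, num_classes=1).eval()
x = torch.rand(1, 3, 256, 256)

with torch.no_grad():
    mask_logits = model(x)  # logits at the resolution of the finest decoder feature map

# The encoder and decoder can also be called directly to inspect the
# intermediate feature maps (fine to coarse), e.g. for deep supervision:
with torch.no_grad():
    feature_maps = model.decoder(model.encoder(x))
print([fm.size() for fm in feature_maps])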

Create Encoder-Decoder FPN model with pretrained encoder

Similarly to the previous example, you can change the decoder to an FPN with concatenation.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class SEResNeXt50FPN(nn.Module):
    def __init__(self, num_classes, fpn_channels):
        super().__init__()
        self.encoder = E.SEResNeXt50Encoder()
        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

Change number of input channels for the Encoder

All encoders from pytorch_toolbelt support changing the number of input channels. Simply call encoder.change_input_channels(num_channels) and the first convolution layer will be changed. Whenever possible, existing weights of the convolutional layer will be re-used (if the new number of channels is greater than the default, the new weight tensor will be padded with randomly-initialized weights). The method returns self, so the call can be chained, as shown after the snippet below.

from pytorch_toolbelt.modules import encoders as E

encoder = E.SEResnet101Encoder()
encoder = encoder.change_input_channels(6)
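
Since change_input_channels returns self, construction and modification can also be written as a single chained expression (same example as above):

from pytorch_toolbelt.modules import encoders as E

# change_input_channels returns self, so it can be chained with the constructor call
encoder = E.SEResnet101Encoder().change_input_channels(6)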

Misc

Count number of parameters in encoder/decoder and other modules

When designing a model and tuning the number of features in a neural network, I find it quite useful to print the number of parameters in high-level blocks (like the encoder and decoder). Here is how to do it with pytorch_toolbelt:

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D
from pytorch_toolbelt.utils import count_parameters

class SEResNeXt50FPN(nn.Module):
    def __init__(self, num_classes, fpn_channels):
        super().__init__()
        self.encoder = E.SEResNeXt50Encoder()
        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

net = SEResNeXt50FPN(1, 128)
print(count_parameters(net))
# Prints {'total': 34232561, 'trainable': 34232561, 'encoder': 25510896, 'decoder': 8721536, 'logits': 129}
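
According to the changelog below, newer releases (0.5.0+) also accept a human_friendly argument to print counts like 21.1M instead of raw integers; a small sketch assuming such a version is installed:

# human_friendly formatting of parameter counts (requires pytorch-toolbelt >= 0.5.0)
print(count_parameters(net, human_friendly=True))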

Compose multiple losses

There are multiple ways to combine several losses, and high-level DL frameworks like Catalyst offer more flexible ways to achieve this, but here is my 100%-pure PyTorch implementation:

from pytorch_toolbelt import losses as L

# Creates a loss function that is a weighted sum of focal loss
# and lovasz loss with weights 1.0 and 0.5 respectively.
loss = L.JointLoss(L.FocalLoss(), L.LovaszLoss(), 1.0, 0.5)
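
The resulting JointLoss is an ordinary nn.Module and is called like any other criterion. A minimal sketch, assuming the multiclass variants of both losses, which take (N, C, H, W) logits and (N, H, W) integer targets; adjust the shapes accordingly for the binary losses:

import torch

# Dummy 3-class segmentation batch (shapes chosen only for illustration)
logits = torch.randn(4, 3, 64, 64, requires_grad=True)
targets = torch.randint(0, 3, (4, 64, 64))

total_loss = loss(logits, targets)  # weighted sum: 1.0 * focal + 0.5 * lovasz
total_loss.backward()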

TTA / Inferencing

Apply Test-time augmentation (TTA) for the model

Test-time augmentation (TTA) can be used in both training and testing phases.

from pytorch_toolbelt.inference import tta

model = UNet(input_channels=3, num_classes=1)

# Truly functional TTA for image classification using horizontal flips:
logits = tta.fliplr_image2label(model, input)

# Truly functional TTA for image segmentation using D4 augmentation:
logits = tta.d4_image2mask(model, input)

Inference on huge images:

Quite often, there is a need to perform image segmentation on enormously big images (5000px and more). There are a few problems with such big pixel arrays:

  1. There are size limitations on the maximum size of CUDA tensors (concrete numbers depend on the driver and GPU version)
  2. Heavy CNN architectures may eat up all available GPU memory with ease when inferencing relatively small 1024x1024 images, leaving no room for bigger image resolutions.

One of the solutions is to slice the input image into (optionally overlapping) tiles, feed each tile through the model and merge the results back. This way you can guarantee an upper limit on GPU RAM usage, while keeping the ability to process arbitrary-sized images on the GPU.

import numpy as np
from torch.utils.data import DataLoader
import cv2

from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy


image = cv2.imread('really_huge_image.jpg')
model = get_model(...)

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(512, 512), tile_step=(256, 256))

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in DataLoader(list(zip(tiles, tiler.crops)), batch_size=8, pin_memory=True):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = model(tiles_batch)

    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)
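
For heavy encoders it also helps to run the tiling loop with gradients disabled and the model in eval mode; this is plain PyTorch rather than anything specific to pytorch_toolbelt (a sketch of the same loop as above):

import torch

model = model.eval().cuda()

# Same tiling loop as above, wrapped in no_grad so activations are not kept for backprop
with torch.no_grad():
    for tiles_batch, coords_batch in DataLoader(list(zip(tiles, tiler.crops)), batch_size=8, pin_memory=True):
        tiles_batch = tiles_batch.float().cuda()
        pred_batch = model(tiles_batch)
        merger.integrate_batch(pred_batch, coords_batch)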

Advanced examples

  1. Inria Satellite Segmentation
  2. CamVid Semantic Segmentation

Citation

@misc{Khvedchenya_Eugene_2019_PyTorch_Toolbelt,
  author = {Khvedchenya, Eugene},
  title = {PyTorch Toolbelt},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/BloodAxe/pytorch-toolbelt}},
  commit = {cc5e9973cdb0dcbf1c6b6e1401bf44b9c69e13f3}
}
Comments
  • Is compute_pyramid_patch_weight_loss correctly implemented?

    https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L33 can be deleted.

    https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L28 https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L29

    are never updated and stay zero?

    P.S. Numpy is very slow. Replacing sqrt and square speeds things up a lot.

    opened by ternaus 7
  • Dice Loss/Score question

    Hey Eugene,

    First of all, thank you for this very useful package. I'm transferring my environment from TF to PyTorch now, and having your advanced losses is very helpful. However, when I trained the same model on the same data using the same loss functions in both frameworks, I noticed that I get very different loss numbers (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice Loss you always compute the per-sample AND per-channel loss and then average it. I don't understand why you are doing the per-channel calculation and averaging, and not computing the Dice loss for all classes together. I can show what I mean with a dummy example below:

    Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each: 0 Red, 1 Green and 2 Blue:

    d_gt = np.zeros(shape=(20, 20, 3))
    d_gt[5:10, 5:10, 0] = 1
    d_gt[10:15, 10:15, 1] = 1
    d_gt[:, :, 2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_gt)

    (image: plt.imshow output for d_gt)

    d_pr = np.zeros(shape=(20, 20, 3))
    d_pr[4:9, 4:9, 0] = 1
    d_pr[11:14, 11:14, 1] = 1
    d_pr[:, :, 2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_pr)

    (image: plt.imshow output for d_pr)

    One can see that (using Dice Loss = 1- Dice Score):

    • Dice Loss for Red is 1 - ((16+16) / (25+25)) = 0.36
    • Dice Loss for Green is 1 - ((9+9) / (9+25)) = 0.4706
    • Dice Loss for Blue is 1 - ((341+341) / (350+366)) = 0.0474

    However, the total Dice Loss for the whole picture is 1 - (2*(16+9+341) / (2*400)) = 0.085

    After wrapping them into tensors

    d_gt_tensor = torch.from_numpy(np.transpose(d_gt, (2, 0, 1))).unsqueeze(0)
    d_pr_tensor = torch.from_numpy(np.transpose(d_pr, (2, 0, 1))).unsqueeze(0)

    what your Dice Loss (with from_logits=False) returns is 0.2927, which is the averaged loss of the individual channels instead of the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function; I think that dims=(1,2) should be passed instead to get individual scores for each item in the batch? Unless this behaviour is intended, but then I'd need some more explanation why.

    A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?

    Thanks in advance!

    opened by JanSobus 5
  • Is dependency on `opencv-python` necessary?

    Depending on opencv-python makes it difficult to use the library in a Docker environment, since there is typically no GUI. Would it be possible to depend on opencv-python-headless instead?

    Thanks.

    opened by MikiGrit 4
  • integrate_batch throws error: RuntimeError: The size of tensor a (6) must match the size of tensor b (928) ...

    Hi, I'm trying to use your tiling tools with my yolov5 model, but in the following line I get the following error:

    https://github.com/BloodAxe/pytorch-toolbelt/blob/cab4fc4e209d9c9e5db18cf1e01bb979c65cf08b/pytorch_toolbelt/inference/tiles.py#L341

    RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2

    The debugger shows a tile tensor size of (52983,6) and a weight tensor size of (1, 928,928). What could be the reason for the difference in the tensor size?

    Some more info: the model size is 928x928 and the image size is 3840x2160. I am loading the model using DetectMultiBackend from yolov5.

    opened by jokober 4
  • TypeError: object of type 'int' has no len()

    I am unable to create a basic UNet model from the library as given on the readme. Here's the code for the same:

    from torch import nn
    from pytorch_toolbelt.modules import encoders as E
    from pytorch_toolbelt.modules import decoders as D
    
    class UNet(nn.Module):
        def __init__(self, input_channels, num_classes):
            super().__init__()
            self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
            self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
            self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
    
        def forward(self, x):
            x = self.encoder(x)
            x = self.decoder(x)
            return self.logits(x[0])
        
    model= UNet(input_channels= 3, num_classes= 1)
    

    Error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-1-4e8064bebb83> in <module>
         15         return self.logits(x[0])
         16 
    ---> 17 model= UNet(input_channels= 3, num_classes= 1)
    
    <ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
          7         super().__init__()
          8         self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
    ----> 9         self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
         10         self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
         11 
    
    ~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
         38             decoder_features = [None] * num_blocks
         39         else:
    ---> 40             if len(decoder_features) != num_blocks:
         41                 raise ValueError(f"decoder_features must have length of {num_blocks}")
         42         in_channels_for_upsample_block = feature_maps[-1]
    
    TypeError: object of type 'int' has no len()
    
    opened by sainatarajan 4
  • Getting out of memory by using inference on huge images

    I have tried pretty small slices but get CUDA out of memory on ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :,:]. As I can see, it proceeded a few steps but then failed. I have a GPU with 8 GB; the model is a U-Net, but with heavy encoders. Image shape is (6300, 6304, 3).

    import numpy as np
    import torch
    import cv2
    from torch.utils.data import DataLoader
    from tqdm import tqdm_notebook
    from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
    from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy
    
    
    image = img_to_predict
    
    # Cut large image into overlapping tiles
    tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')
    
    # HCW -> CHW. Optionally, do normalization here
    tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]
    
    # Allocate a CUDA buffer for holding entire mask
    merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)
    
    # Run predictions for tiles and accumulate them
    for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
        tiles_batch = tiles_batch.float().cuda()
        pred_batch = best_model(tiles_batch)[:, 0:1, :,:] # taking only first channel
    
        merger.integrate_batch(pred_batch, coords_batch)
    
    # Normalize accumulated mask and convert back to numpy
    merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
    merged_mask = tiler.crop_to_orignal_size(merged_mask)
    
    opened by Diyago 3
  • UnetSegmentationModel dimension won't match

    I want to try hrnet34_unet64 for image segmentation using:

    encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
    UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)
    

    And got an error: RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)

    Could you please let me know what is wrong? Thanks!

    opened by xdtl 2
  • SoftCrossEntropyLoss error

    When I use the SoftCrossEntropyLoss, I got the error:

    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?

    opened by somebodyus 2
  • performance of ImageSlicer weight=pyramid

    ImageSlicer with weight=pyramid is/was super slow to initialize. It is the weight used in README.md example "Inference on huge images". (in https://github.com/BloodAxe/pytorch-toolbelt/issues/23 performance was mentioned and I guess it was the reason people look at this code)

    opened by ksenobojca 2
  • FocalLoss

    🐛 Bug

    There are two types of focal loss here (BinaryFocalLoss and FocalLoss): https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/losses/focal.py

    Both of these functions are calling the focal_loss_with_logits function, while the second one should use softmax_focal_loss_with_logits.

    opened by mehran66 1
  • Focal loss error

    Multiclass Focal loss returns error.

        loss = criterion(preds, target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
        return self.first(*input) + self.second(*input)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
        return self.loss(*input) * self.weight
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
        loss += self.focal_loss_fn(cls_label_input, cls_label_target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
        logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
    ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
    Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
    Traceback (most recent call last):
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
    TypeError: cannot unpack non-iterable NoneType object
    

    I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed from cls_label_input = label_input[:, cls, ...] to cls_label_input = label_input[:, cls, ...].unsqueeze(1)

    opened by vbakhteev 1
  • Detailed documentation is recommended

    Thank you very much for making such a good library. It would be nice to have more detailed documentation, for example, https://smp.readthedocs.io/en/latest/

    enhancement Looking for contributors 
    opened by Hengwei-Zhao96 1
Releases(0.6.2)
  • 0.6.1(Oct 25, 2022)

  • 0.6.0(Oct 20, 2022)

  • 0.5.3(Oct 20, 2022)

    Bugfixes

    • Fix https://github.com/BloodAxe/pytorch-toolbelt/issues/78 thanks https://github.com/mehran66 for pointing this out

    New Stuff

    • InriaAerialImageDataset for working with Inria Aerial Dataset
    • get_collate_for_dataset function to get the collate fn from a dataset instance if it exposes a get_collate_fn method. Also works for ConcatDataset.

    Improvements

    • DatasetMeanStdCalculator supports dtype to specify accumulator type (float64 by default)
    Source code(tar.gz)
    Source code(zip)
  • 0.5.2(Aug 26, 2022)

    BugFixes

    • Fixed bug in ApplySoftmaxTo and ApplySigmoidTo modules that could lead to activations not applied to input when it was a string

    New API

    • Added fs.find_images_in_dir_recursive
    • Added utils.describe_outputs to return a human-friendly representation of complex (dict, nested list, etc) outputs to see shape, mean/std of each tensor.

    Other

    More MyPy fixes & type annotations

    Source code(tar.gz)
    Source code(zip)
  • 0.5.1(Jun 27, 2022)

    New API

    • Added fs.find_subdirectories_in_dir to retrieve list of subdirectories (non-recursive) in the given directory.
    • Added logodd averaging of TTA predictions and counterpart logodd_mean function.

    Improvements

    • In plot_confusion_matrix one can disable plotting scores in each cell using show_scores argument (True by default).
    • freeze_model method now returns input module argument.
    Source code(tar.gz)
    Source code(zip)
  • 0.5.0(Mar 10, 2022)

    Version 0.5.0

    This is the major release update of Pytorch Toolbelt. It's been a long time since the last update and there are many improvements & updates since 0.4.4:

    New features

    • Added class pytorch_toolbelt.datasets.DatasetMeanStdCalculator to compute mean & std of the dataset that does not fit entirely in memory.
    • New decoder module: BiFPNDecoder
    • New encoders: SwinTransformer, SwinB, SwinL, SwinT, SwinS
    • Added broadcast_from_master function to distributed utils. This method allows scattering a tensor from the master node to all nodes.
    • Added reduce_dict_sum to gather & concatenate dictionary of lists from all nodes in DDP.
    • Added master_print as a drop-in replacement to print that prints to stdout only on the zero-rank node.

    Bug Fixes

    • Fix bug in lovasz loss by @seefun in https://github.com/BloodAxe/pytorch-toolbelt/pull/62

    Breaking changes

    • Bounding boxes matching method has been divided into two: match_bboxes and match_bboxes_hungarian. The first method uses scores of predicted bboxes and matches most confident predictions first, while the match_bboxes_hungarian matches bboxes to maximize overall IoU.
    • set_manual_seed now sets random seed for Numpy.
    • to_numpy now correctly works for None and all iterables (Not only tuple & list)

    Fixes & Improvements (NO BC)

    • Added dim argument to ApplySoftmaxTo to specify channel for softmax operator (default value is 1, which was hardcoded previously)
    • ApplySigmoidTo now applies in-place sigmoid (Purely performance optimization)
    • TileMerger now supports specifying a device (Torch semantics) for storing intermediate tensors of accumulated tiles.
    • All TTA functions supports PyTorch Tracing
    • MultiscaleTTA now supports a model that returns a single Tensor (Key-Value outputs still works as before)
    • balanced_binary_cross_entropy_with_logits and BalancedBCEWithLogitsLoss now supports ignore_index argument.
    • BiTemperedLogisticLoss & BinaryBiTemperedLogisticLoss also got support of ignore_index argument.
    • focal_loss_with_logits now also supports ignore_index. Computation of ignored values has been moved from BinaryFocalLoss to this function.
    • Reduced number of boilerplates & hardcoded code for encoders from timm. Now GenericTimmEncoder queries output strides & feature maps directly from the timm's encoder instance.
    • HRNet-based encoders now have a use_incre_features argument to specify whether output feature maps should have an increased number of features.
    • change_extension, read_rgb_image, read_image_as_is functions now supports Path as input argument. Return type (str) remains unchanged.
    • count_parameters now accepts human_friendly argument to print parameters count in human-friendly form 21.1M instead 21123123.
    • plot_confusion_matrix now has format_string argument (None by default) to specify custom format string for values in confusion matrix.
    • RocAucMetricCallback for Catalyst got fix_nans argument to fix NaN outputs, which caused roc_auc to raise an exception and break the training.
    • BestWorstMinerCallback now additionally logs batches with a NaN value in the monitored metric
    Source code(tar.gz)
    Source code(zip)
  • 0.4.4(Aug 12, 2021)

    New features

    • New tiled processing classes for 3D data - VolumeSlicer and VolumeMerger. Designed similarly to ImageSlicer. Now you can run 3D segmentation on huge volumes without risk of OOM.
    • Support of labels (scalar or 1D vector) augmentation/deaugmentation in D2, D4 and flip-style TTA.
    • Balanced BCE loss (BalancedBCEWithLogitsLoss)
    • Bi-Tempered loss 'BiTemperedLogisticLoss'
    • SelectByIndex helper module to pick named output of the model (For use in nn.Sequential)
    • New encoders MobileNetV3Large, MobileNetV3Small from torchvision.
    • New encoders from timm package (HRNets, ResNetD, EfficientNetV2 and others).
    • DeepLabV3 & DeepLabV3+ Decoders
    • Pure PyTorch-based implementation for bbox matching (match_bboxes) that supports both CPU/GPU matching using hungarian algorithm.

    Bugfixes

    • Fix bug in Lovasz Loss (#62), thanks @seefun

    Breaking Changes

    • Parameter ignore renamed to ignore_index in BinaryLovaszLoss class.
    • Renamed fpn_channels argument in constructor of FPNSumDecoder and FPNCatDecoder to channels.
    • Renamed output_channels argument in constructor of HRNetSegmentationDecoder to channels.
    • conv1x1 not set bias to zero by default
    • Bumped up minimal pytorch version to 1.8.1

    Other Improvements

    • Ensembler class now works correctly with torch.jit.tracing
    • Numerous docstrings & type annotations enhancements
    Source code(tar.gz)
    Source code(zip)
  • 0.4.3(Apr 2, 2021)

    PyTorch Toolbelt 0.4.3

    Modules

    • Added missing sigmoid activation support to get_activation_block
    • Make Encoders support JIT & Tracing
    • Better support for encoders from timm (They named with prefix Timm)

    Utils

    • rgb_image_from_tensor now clips values

    TTA & Ensembling

    • Ensembler now supports arithmetic, geometric & harmonic averaging via reduction parameter.
    • Bring geometric & harmonic averaging to all TTA functions as well

    Datasets

    • read_binary_mask
    • Refactor SegmentationDataset to support strided masks for deep supervision
    • Added RandomSubsetDataset and RandomSubsetWithMaskDataset to sample dataset based on some condition (E.g. sample only samples of particular class)

    Other

    As usual, more tests, better type annotations & comments

    Source code(tar.gz)
    Source code(zip)
  • 0.4.2(Mar 3, 2021)

    Breaking Changes

    • Bump up minimal PyTorch version to 1.7.1

    New features

    • New dataset classes ClassificationDataset, SegmentationDataset for easy every-day use in Kaggle
    • New losses: FocalCosineLoss, BiTemperedLogisticLoss, SoftF1Loss
    • Support of new activations for get_activation_block (Silu, Softplus, Gelu)
    • More encoders from timm package: NFNets, NFRegNet, HRNet, DPN
    • RocAucMetricCallback for Catalyst
    • MultilabelAccuracyCallback and AccuracyCallback with DDP support

    Bugfixes

    • Fix invalid prefix in Catalyst registry (from tbt to tbt.).
    Source code(tar.gz)
    Source code(zip)
  • 0.4.1(Jan 14, 2021)

    New features

    • Added Soft-F1 loss for direct optimization of F1 score (Binary case only)
    • Fully reworked the TTA module for inference (backward compatibility kept where possible).
    • Added support of ignore_index to Dice & Jaccard losses.
    • Improved Lovasz loss to work in fp16 mode.
    • Added option to override selected params in make_n_channel_input.
    • More Encoders, from timm package.
    • FPNFuse module now works on 2D, 3D and N-D inputs.
    • Added Global K-Max 2D pooling block.
    • Added Generalized mean pooling 2D block.
    • Added softmax_over_dim_X, argmax_over_dim_X shorthand functions for use in metrics to get soft/hard labels without using lambda functions.
    • Added helper visualization functions to add fancy header to image, stack images of different sizes.
    • Improved rendering of confusion matrix.

    Catalyst goodies

    • Encoders & Losses are available in Catalyst registry
    • StopIfNanCallback
    • Added OutputDistributionCallback to log distribution of predictions to TensorBoard.
    • Added UMAPCallback to visualize embedding space using UMAP in TensorBoard.

    Breaking Changes

    • Renamed CudaTileMerger to TileMerger. TileMerger allows to specify target device explicitly.
    • tensor_from_rgb_image removed in favor of image_to_tensor.

    Bug fixes & Improvements

    • Improve numeric stability of focal_loss_with_logits when reduction="sum"
    • Prevent NaN in FocalLoss when all elements are equal to ignore_index value.
    • A LOT of type hints.
    Source code(tar.gz)
    Source code(zip)
  • 0.4.0(Aug 19, 2020)

    New features

    • Memory-efficient Swish and Mish activation functions (Credits goes to http://github.com/rwightman/pytorch-image-models)
    • Refactor EfficientNet encoders (no pretrained weights yet)

    Fixes

    • Fixed incorrect default value for ignore_index in SoftCrossEntropyLoss

    Breaking changes

    • All catalyst-related utils updated to be compatible with Catalyst 20.8.2
    • Remove PIL package dependency

    Improvements

    • More comments, more type hints
    Source code(tar.gz)
    Source code(zip)
  • 0.3.2(Apr 28, 2020)

    New features

    • Many helpful callbacks for Catalyst library: HyperParameterCallback, LossAdapter to name a few.
    • New losses for deep model supervision (Helpful, when size of target and output mask are different)
    • Stacked Hourglass encoder
    • Context Aggregation Network decoder

    Breaking Changes

    • ABN module will now resolve as nn.Sequential(BatchNorm2d, Activation) instead of a hand-crafted module. This enables easier conversion of batch normalization modules to the nn.SyncBatchNorm.

    • Almost every Encoder/Decoder implementation has been refactored for better clarity and flexibility. Please double-check your pipelines.

    Important bugfixes

    • Improved numerical stability of Dice / Jaccard losses (Using log_sigmoid() + exp() instead of plain sigmoid() )

    Other

    • A lot of comments for functions and modules
    • Code cleanup, thanks to DeepSource
    • Type annotations for modules and functions
    • Update of README
    Source code(tar.gz)
    Source code(zip)
  • 0.3.1(Feb 25, 2020)

    Fixes

    • Fixed bug in computation of IoU metric in binary_dice_iou_score function
    • Fixed incorrect default value in SoftCrossEntropyLoss #38

    Improvements

    • Function draw_binary_segmentation_predictions now has parameter image_format (rgb|bgr|gray) to specify the format of the image, so images are visualized correctly in TensorBoard
    • More type annotations across the codebase

    New features

    • New visualization function draw_multilabel_segmentation_predictions
    Source code(tar.gz)
    Source code(zip)
  • 0.3.0(Jan 17, 2020)

    Pytorch Toolbelt 0.3.0

    This release has a huge set of new features, bugfixes and breaking changes, so be careful when upgrading. pip install pytorch-toolbelt==0.3.0

    New features

    Encoders

    • HRNetV2
    • DenseNets
    • EfficientNet
    • Encoder class has change_input_channels method to change number of channels in input image

    New losses

    • BCELoss with support of ignore_index
    • SoftBCELoss (Label smoothing loss for binary case with support of ignore_index)
    • SoftCrossEntropyLoss (Label smoothing loss for multiclass case with support of ignore_index)

    Catalyst goodies

    • Online pseudolabeling callback
    • Training signal annealing callback

    Other

    • New activation functions support in ABN block: Swish, Mish, HardSigmoid
    • New decoders (Unet, FPN, DeeplabV3, PPM) to simplify creation of segmentation models
    • CREDITS.md to include all the references to code/articles. Existing list is definitely not complete, so feel free to make PR's
    • Object context block from OCNet

    API changes

    • Focal loss now supports normalized focal loss and reduced focal loss extensions.
    • Optimize computation of pyramid weight matrix #34
    • Default value align_corners=False in F.interpolate when doing bilinear upsampling.

    Bugfixes

    • Fix missing call to batch normalization block in FPNBottleneckBN
    • Fix numerical stability for DiceLoss and JaccardLoss when log_loss=True
    • Fix numerical stability when computing normalized focal loss
    Source code(tar.gz)
    Source code(zip)
  • 0.2.1(Oct 7, 2019)

  • 0.2.0(Oct 4, 2019)

    PyTorch Toolbelt 0.2.0

    This release is dedicated to housekeeping work. Dice/IoU metrics and losses have been redesigned to reduce the amount of duplicated code and bring more clarity. Code is now auto-formatted using Black.

    pip install pytorch_toolbelt==0.2.0

    Catalyst contrib

    • Refactor Dice/IoU loss into single metric IoUMetricsCallback with a few cool features: metric="dice|jaccard" to choose what metric should be used; mode=binary|multiclass|multilabel to specify problem type (binary, multiclass or multi-label segmentation)'; classes_of_interest=[1,2,4] to select for which set of classes metric should be computed and nan_score_on_empty=False to compute Dice Accuracy (Counts as a 1.0 if both y_true and y_pred are empty; 0.0 if y_pred is not empty).
    • Added L-p regularization callback to apply L1 and L2 regularization to model with support of regularization strength scheduling.

    Losses

    • Refactor DiceLoss/JaccardLoss losses in a same fashion as metrics.

    Models

    • Add Densenet encoders
    • Bugfix: Fix missing BN+Relu in UNetDecoder
    • Global pooling modules can squeeze spatial channel dimensions if flatten=True.

    Misc

    • Add more unit tests
    • Code-style is now managed with Black
    • to_numpy now supports int, float scalar types
    Source code(tar.gz)
    Source code(zip)
  • 0.1.4(Sep 12, 2019)

  • 0.1.3(Jul 24, 2019)

    PyTorch Toolbelt 0.1.3

    1. Added ignore_index for focal loss
    2. Added ignore_index to some metrics for Catalyst
    3. Added tif extension for find_images_in_dir
    Source code(tar.gz)
    Source code(zip)
  • 0.1.1(Jun 29, 2019)

    New functionality / breaking changes

    • Added visualization functions to render best/worst batches for binary and semantic segmentation.
    • JaccardScoreCallback now is a single callback for computing IoU for binary/multiclass/multilabel segmentation.
    • Added HFF module (Hierarchical feature fusion).
    • Added set_trainable function to enable/disable training and batch-norm on a module and its children.
    • RLE encoding/decoding (Hi, Kaggle)

    API changes

    • rgb_image_from_tensor now accepts dtype parameters for returned image

    Bugfixes

    • Fixed wrong implementation of UpsampleAddConv (There was extra residual connection)
    Source code(tar.gz)
    Source code(zip)
  • 0.1.0(Jun 12, 2019)

    New stuff:

    1. EfficientNet
    2. Multiscale TTA module
    3. New activations: Swish, HardSwish, HardSigmoid
    4. AGN module (Activated Group Norm), mimics ABN

    Changes:

    1. SpatialGate2d now accepts squeeze_channels for explicit number of squeeze channels.

    Misc

    1. Code formatting
    Source code(tar.gz)
    Source code(zip)
  • 0.0.9(Jun 3, 2019)

  • 0.0.8(May 19, 2019)

    • Global pooling, SCSE module and MobileNetV3 encoders are now ONNX- and CoreML-friendly.
    • Refactored FPN module for more flexible interpolate_add tuning (can use any module with two inputs)
    Source code(tar.gz)
    Source code(zip)
  • 0.0.7(May 8, 2019)

  • 0.0.6(May 6, 2019)

    New features

    1. Added WiderResNet & WiderResNetA2 encoders (https://github.com/mapillary/inplace_abn)
    2. Added implementation of reduced focal loss (https://arxiv.org/abs/1903.01347)
    Source code(tar.gz)
    Source code(zip)
  • 0.0.5(Apr 26, 2019)

    Changes

    • Added 10-Crop TTA (https://github.com/BloodAxe/pytorch-toolbelt/issues/4)
    • Added unit tests for TTA functions
    • Added freeze_bn function to freeze all BN layers in a model
    • Rename unpad_tensor to unpad_image_tensor to mimic pad_image_tensor

    Bugfixes

    • Fixed bug in d4_image2mask
    Source code(tar.gz)
    Source code(zip)
  • 0.0.4(May 6, 2019)

  • 0.0.3(May 6, 2019)

Owner
Eugene Khvedchenya
AI/ML Advisor, Entrepreneur, Kaggle Master. Author of pytorch-toolbelt. Core maintainer of albumentations. Catalyst contributor.