Accelerated deep learning R&D


PyTorch framework for Deep Learning research and development. It focuses on reproducibility, rapid experimentation, and codebase reuse so you can create something new rather than write another regular train loop.
Break the cycle - use the Catalyst!

Project manifest. Part of PyTorch Ecosystem. Part of Catalyst Ecosystem:

  • Alchemy - experiments logging & visualization
  • Catalyst - accelerated deep learning R&D
  • Reaction - convenient deep learning models serving

Catalyst at AI Landscape.


Getting started

pip install -U catalyst

# minimal MNIST example: train, predict with, and trace a linear classifier
import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

model = torch.nn.Linear(28 * 28, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device).view(batch[0].size(0), -1))

    def _handle_batch(self, batch):
        # model train/valid step
        x, y = batch
        y_hat = self.model(x.view(x.size(0), -1))

        loss = F.cross_entropy(y_hat, y)
        accuracy01, accuracy03 = metrics.accuracy(y_hat, y, topk=(1, 3))
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )

        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    load_best_on_end=True,
)
# model inference
for prediction in runner.predict_loader(loader=loaders["valid"]):
    assert prediction.detach().cpu().numpy().shape[-1] == 10
# model tracing
traced_model = runner.trace(loader=loaders["valid"])

Step by step guide

  1. Start with Catalyst 101 — Accelerated PyTorch introduction.
  2. Check minimal examples.
  3. Try notebook tutorials with Google Colab.
  4. Read blogposts with use-cases and guides.
  5. Learn machine learning with our "Deep Learning with Catalyst" course.
  6. If you would like to contribute to the project, follow our contribution guidelines.
  7. If you want to support the project, feel free to donate on our Patreon page or write to us with your proposals.
  8. And do not forget to join our Slack for collaboration.

Overview

Catalyst helps you write compact but full-featured Deep Learning pipelines in a few lines of code. You get a training loop with metrics, early-stopping, model checkpointing and other features without the boilerplate.
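
To make the "boilerplate" concrete, here is a minimal sketch of the bookkeeping a hand-written loop needs for best-checkpoint tracking and early stopping. It is plain Python and not Catalyst code; the function name `run_with_early_stopping` and the `patience` parameter are hypothetical, chosen only for illustration.

```python
def run_with_early_stopping(valid_losses, patience=2):
    """Replay per-epoch validation losses; return (best_epoch, stopped_epoch).

    Hypothetical helper: a real loop would compute each loss by running
    validation, and would save a checkpoint whenever the loss improves.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best_loss:
            # New best: this is where a checkpoint would be written.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # No improvement for `patience` epochs in a row: stop early.
                return best_epoch, epoch
    return best_epoch, len(valid_losses) - 1


best_epoch, stopped_epoch = run_with_early_stopping(
    [0.9, 0.7, 0.65, 0.66, 0.68, 0.5]
)
```

With these losses the run stops at epoch 4 and the best checkpoint comes from epoch 2; the final value 0.5 is never reached, which is the usual trade-off of a small patience.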

Installation

Common installation:

pip install -U catalyst

Specific versions with additional requirements:

pip install catalyst[ml]         # installs ML-based Catalyst
pip install catalyst[cv]         # installs CV-based Catalyst
pip install catalyst[nlp]        # installs NLP-based Catalyst
pip install catalyst[tune]       # installs Catalyst+Optuna
pip install catalyst[ecosystem]  # installs Catalyst.Ecosystem
# master version installation
pip install git+https://github.com/catalyst-team/catalyst@master --upgrade

Catalyst is compatible with Python 3.6+ and PyTorch 1.1+.
Tested on Ubuntu 16.04/18.04/20.04, macOS 10.15, Windows 10 and Windows Subsystem for Linux.

Minimal Examples

ML - linear regression

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst.dl import SupervisedRunner

# data
num_samples, num_features = int(1e4), int(1e1)
X, y = torch.rand(num_samples, num_features), torch.rand(num_samples)
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [3, 6])

# model training
runner = SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=8,
    verbose=True,
)
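
For readers unfamiliar with `MultiStepLR`, the sketch below reproduces its closed-form schedule in plain Python for the settings above: milestones `[3, 6]`, PyTorch's default `gamma=0.1`, and Adam's default learning rate of `1e-3`. It assumes the scheduler is stepped once per epoch; `multistep_lr` is a hypothetical helper, not a PyTorch or Catalyst function.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, epoch, gamma=0.1):
    """Learning rate at `epoch`: decay by `gamma` once per milestone passed."""
    return base_lr * gamma ** bisect_right(milestones, epoch)


# One value per epoch for the 8-epoch run in the example above.
schedule = [multistep_lr(1e-3, [3, 6], epoch) for epoch in range(8)]
```

Epochs 0-2 train at the base rate, epochs 3-5 at one tenth of it, and epochs 6-7 at one hundredth.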

ML - multiclass classification

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, ) * num_classes).to(torch.int64)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    callbacks=[dl.AccuracyCallback(num_classes=num_classes)]
)

ML - multilabel classification

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, num_classes) > 0.5).to(torch.float32)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    callbacks=[dl.MultiLabelAccuracyCallback(threshold=0.5)]
)

CV - MNIST classification

import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

model = torch.nn.Linear(28 * 28, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        x, y = batch
        y_hat = self.model(x.view(x.size(0), -1))

        loss = F.cross_entropy(y_hat, y)
        accuracy01, accuracy03, accuracy05 = metrics.accuracy(y_hat, y, topk=(1, 3, 5))
        self.batch_metrics = {
            "loss": loss,
            "accuracy01": accuracy01,
            "accuracy03": accuracy03,
            "accuracy05": accuracy05,
        }
        
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

runner = CustomRunner()
runner.train(
    model=model, 
    optimizer=optimizer, 
    loaders=loaders, 
    verbose=True,
)
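
The `accuracy01`, `accuracy03`, and `accuracy05` values above are top-k accuracies: a prediction counts as correct if the true class is among the k highest-scoring classes. The following is a plain-Python sketch of that definition on hand-made scores; it illustrates the metric and is not the implementation behind `metrics.accuracy`.

```python
def topk_accuracy(scores, targets, topk=(1, 3)):
    """scores: per-sample lists of class scores; targets: true class ids."""
    results = []
    for k in topk:
        hits = 0
        for row, target in zip(scores, targets):
            # Class ids sorted from highest to lowest score.
            ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
            hits += target in ranked[:k]
        results.append(hits / len(targets))
    return results


scores = [
    [0.10, 0.70, 0.20, 0.00],  # ranking: 1, 2, 0, 3
    [0.40, 0.30, 0.20, 0.10],  # ranking: 0, 1, 2, 3
    [0.05, 0.15, 0.30, 0.50],  # ranking: 3, 2, 1, 0
]
targets = [1, 2, 0]
acc01, acc03 = topk_accuracy(scores, targets, topk=(1, 3))
```

Here only the first sample is right at k=1, while the first two are right at k=3, so top-3 accuracy is twice the top-1 accuracy.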

CV - classification with AutoEncoder

import os
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

class ClassifyAE(nn.Module):

    def __init__(self, in_features, hid_features, out_features):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_features, hid_features), nn.Tanh())
        self.decoder = nn.Sequential(nn.Linear(hid_features, in_features), nn.Sigmoid())
        self.clf = nn.Linear(hid_features, out_features)

    def forward(self, x):
        z = self.encoder(x)
        y_hat = self.clf(z)
        x_ = self.decoder(z)
        return y_hat, x_

model = ClassifyAE(28 * 28, 128, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        x, y = batch
        x = x.view(x.size(0), -1)
        y_hat, x_ = self.model(x)

        loss_clf = F.cross_entropy(y_hat, y)
        loss_ae = F.mse_loss(x_, x)
        loss = loss_clf + loss_ae
        accuracy01, accuracy03, accuracy05 = metrics.accuracy(y_hat, y, topk=(1, 3, 5))
        self.batch_metrics = {
            "loss_clf": loss_clf,
            "loss_ae": loss_ae,
            "loss": loss,
            "accuracy01": accuracy01,
            "accuracy03": accuracy03,
            "accuracy05": accuracy05,
        }

        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

runner = CustomRunner()
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    verbose=True,
)

CV - classification with Variational AutoEncoder

import os
import numpy as np
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

LOG_SCALE_MAX = 2
LOG_SCALE_MIN = -10

def normal_sample(loc, log_scale):
    scale = torch.exp(0.5 * log_scale)
    return loc + scale * torch.randn_like(scale)

class ClassifyVAE(torch.nn.Module):

    def __init__(self, in_features, hid_features, out_features):
        super().__init__()
        self.encoder = nn.Linear(in_features, hid_features * 2)
        self.decoder = nn.Sequential(nn.Linear(hid_features, in_features), nn.Sigmoid())
        self.clf = nn.Linear(hid_features, out_features)

    def forward(self, x, deterministic=False):
        z = self.encoder(x)
        bs, z_dim = z.shape

        loc, log_scale = z[:, :z_dim // 2], z[:, z_dim // 2:]
        log_scale = torch.clamp(log_scale, LOG_SCALE_MIN, LOG_SCALE_MAX)

        z_ = loc if deterministic else normal_sample(loc, log_scale)
        z_ = z_.view(bs, -1)
        x_ = self.decoder(z_)

        y_hat = self.clf(z_)

        return y_hat, x_, loc, log_scale

model = ClassifyVAE(28 * 28, 64, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        x, y = batch
        x = x.view(x.size(0), -1)
        y_hat, x_, loc, log_scale = self.model(x, deterministic=not self.is_train_loader)

        loss_clf = F.cross_entropy(y_hat, y)
        loss_ae = F.mse_loss(x_, x)
        loss_kld = (-0.5 * torch.sum(1 + log_scale - loc.pow(2) - log_scale.exp(), dim=1)).mean()
        loss = loss_clf + loss_ae + loss_kld
        accuracy01, accuracy03, accuracy05 = metrics.accuracy(y_hat, y, topk=(1, 3, 5))
        self.batch_metrics = {
            "loss_clf": loss_clf,
            "loss_ae": loss_ae,
            "loss_kld": loss_kld,
            "loss": loss,
            "accuracy01": accuracy01,
            "accuracy03": accuracy03,
            "accuracy05": accuracy05,
        }

        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

runner = CustomRunner()
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    verbose=True,
)
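
The `loss_kld` term in the runner above appears to be the standard closed-form KL divergence between the encoder's diagonal Gaussian and a standard normal prior. Reading `loc` as the mean $\mu$ and `log_scale` as the log-variance $\log\sigma^2$ (an interpretation consistent with `scale = torch.exp(0.5 * log_scale)` in `normal_sample`), the per-sample term is:

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}\!\left(\mu, \operatorname{diag}(\sigma^2)\right) \,\middle\|\, \mathcal{N}(0, I)\right)
  = -\frac{1}{2} \sum_{j=1}^{d} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)
```

The sum runs over the $d$ latent dimensions (`dim=1` in the code), and `.mean()` then averages the result over the batch.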

CV - segmentation with classification auxiliary task

import os
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

class ClassifyUnet(nn.Module):

    def __init__(self, in_channels, in_hw, out_features):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(in_channels, in_channels, 3, 1, 1), nn.Tanh())
        self.decoder = nn.Conv2d(in_channels, in_channels, 3, 1, 1)
        self.clf = nn.Linear(in_channels * in_hw * in_hw, out_features)

    def forward(self, x):
        z = self.encoder(x)
        z_ = z.view(z.size(0), -1)
        y_hat = self.clf(z_)
        x_ = self.decoder(z)
        return y_hat, x_

model = ClassifyUnet(1, 28, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        x, y = batch
        x_noise = (x + torch.rand_like(x)).clamp_(0, 1)
        y_hat, x_ = self.model(x_noise)

        loss_clf = F.cross_entropy(y_hat, y)
        iou = metrics.iou(x_, x).mean()
        loss_iou = 1 - iou
        loss = loss_clf + loss_iou
        accuracy01, accuracy03, accuracy05 = metrics.accuracy(y_hat, y, topk=(1, 3, 5))
        self.batch_metrics = {
            "loss_clf": loss_clf,
            "loss_iou": loss_iou,
            "loss": loss,
            "iou": iou,
            "accuracy01": accuracy01,
            "accuracy03": accuracy03,
            "accuracy05": accuracy05,
        }
        
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

runner = CustomRunner()
runner.train(
    model=model, 
    optimizer=optimizer, 
    loaders=loaders, 
    verbose=True,
)

CV - MNIST with Metric Learning

from torch.optim import Adam
from torch.utils.data import DataLoader

from catalyst import data, dl, utils
from catalyst.contrib import datasets, models, nn
import catalyst.contrib.data.cv.transforms.torch as t


# 1. train and valid datasets
dataset_root = "."
transforms = t.Compose([t.ToTensor(), t.Normalize((0.1307,), (0.3081,))])

dataset_train = datasets.MnistMLDataset(root=dataset_root, download=True, transform=transforms)
sampler = data.BalanceBatchSampler(labels=dataset_train.get_labels(), p=5, k=10)
train_loader = DataLoader(dataset=dataset_train, sampler=sampler, batch_size=sampler.batch_size)

dataset_val = datasets.MnistQGDataset(root=dataset_root, transform=transforms, gallery_fraq=0.2)
val_loader = DataLoader(dataset=dataset_val, batch_size=1024)

# 2. model and optimizer
model = models.SimpleConv(features_dim=16)
optimizer = Adam(model.parameters(), lr=0.001)

# 3. criterion with triplets sampling
sampler_inbatch = data.HardTripletsSampler(norm_required=False)
criterion = nn.TripletMarginLossWithSampler(margin=0.5, sampler_inbatch=sampler_inbatch)

# 4. training with catalyst Runner
callbacks = [
    dl.ControlFlowCallback(dl.CriterionCallback(), loaders="train"),
    dl.ControlFlowCallback(dl.CMCScoreCallback(topk_args=[1]), loaders="valid"),
    dl.PeriodicLoaderCallback(valid=100),
]

runner = dl.SupervisedRunner(device=utils.get_device())
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    callbacks=callbacks,
    loaders={"train": train_loader, "valid": val_loader},
    minimize_metric=False,
    verbose=True,
    valid_loader="valid",
    num_epochs=200,
    main_metric="cmc01",
)   
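
The `cmc01` metric tracked above is a cumulative matching characteristic score at rank 1. As a rough sketch of the usual definition (and not of `CMCScoreCallback` itself): for each query embedding, rank the gallery by distance and count a hit if any of the k nearest gallery items shares the query's label. The embeddings and labels below are made up for illustration.

```python
import math


def cmc_at_k(query, gallery, k=1):
    """query/gallery: lists of (embedding, label) pairs; returns the hit rate."""
    hits = 0
    for q_emb, q_label in query:
        # Gallery items ordered from nearest to farthest (Euclidean distance).
        ranked = sorted(gallery, key=lambda item: math.dist(q_emb, item[0]))
        hits += any(label == q_label for _, label in ranked[:k])
    return hits / len(query)


gallery = [((0.0, 0.0), "zero"), ((1.0, 1.0), "one"), ((0.0, 1.0), "seven")]
query = [((0.1, 0.0), "zero"), ((0.9, 1.0), "one"), ((0.2, 0.9), "one")]

cmc01 = cmc_at_k(query, gallery, k=1)
cmc02 = cmc_at_k(query, gallery, k=2)
```

The third query lies closest to the "seven" gallery item, so it misses at rank 1 but is recovered at rank 2.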

GAN - MNIST, flatten version

import os
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST
from catalyst.contrib.nn.modules import Flatten, GlobalMaxPool2d, Lambda

latent_dim = 128
generator = nn.Sequential(
    # We want to generate 128 coefficients to reshape into a 7x7x128 map
    nn.Linear(128, 128 * 7 * 7),
    nn.LeakyReLU(0.2, inplace=True),
    Lambda(lambda x: x.view(x.size(0), 128, 7, 7)),
    nn.ConvTranspose2d(128, 128, (4, 4), stride=(2, 2), padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(128, 128, (4, 4), stride=(2, 2), padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 1, (7, 7), padding=3),
    nn.Sigmoid(),
)
discriminator = nn.Sequential(
    nn.Conv2d(1, 64, (3, 3), stride=(2, 2), padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, (3, 3), stride=(2, 2), padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    GlobalMaxPool2d(),
    Flatten(),
    nn.Linear(128, 1)
)

model = {"generator": generator, "discriminator": discriminator}
optimizer = {
    "generator": torch.optim.Adam(generator.parameters(), lr=0.0003, betas=(0.5, 0.999)),
    "discriminator": torch.optim.Adam(discriminator.parameters(), lr=0.0003, betas=(0.5, 0.999)),
}
loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
}

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        real_images, _ = batch
        batch_metrics = {}
        
        # Sample random points in the latent space
        batch_size = real_images.shape[0]
        random_latent_vectors = torch.randn(batch_size, latent_dim).to(self.device)
        
        # Decode them to fake images
        generated_images = self.model["generator"](random_latent_vectors).detach()
        # Combine them with real images
        combined_images = torch.cat([generated_images, real_images])
        
        # Assemble labels discriminating real from fake images
        labels = torch.cat([
            torch.ones((batch_size, 1)), torch.zeros((batch_size, 1))
        ]).to(self.device)
        # Add random noise to the labels - important trick!
        labels += 0.05 * torch.rand(labels.shape).to(self.device)
        
        # Train the discriminator
        predictions = self.model["discriminator"](combined_images)
        batch_metrics["loss_discriminator"] = \
          F.binary_cross_entropy_with_logits(predictions, labels)
        
        # Sample random points in the latent space
        random_latent_vectors = torch.randn(batch_size, latent_dim).to(self.device)
        # Assemble labels that say "all real images"
        misleading_labels = torch.zeros((batch_size, 1)).to(self.device)
        
        # Train the generator
        generated_images = self.model["generator"](random_latent_vectors)
        predictions = self.model["discriminator"](generated_images)
        batch_metrics["loss_generator"] = \
          F.binary_cross_entropy_with_logits(predictions, misleading_labels)
        
        self.batch_metrics.update(**batch_metrics)

runner = CustomRunner()
runner.train(
    model=model, 
    optimizer=optimizer,
    loaders=loaders,
    callbacks=[
        dl.OptimizerCallback(
            optimizer_key="generator", 
            metric_key="loss_generator"
        ),
        dl.OptimizerCallback(
            optimizer_key="discriminator", 
            metric_key="loss_discriminator"
        ),
    ],
    main_metric="loss_generator",
    num_epochs=20,
    verbose=True,
    logdir="./logs_gan",
)

ML - multiclass classification (fp16 training version)

# pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" git+https://github.com/NVIDIA/apex
import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, ) * num_classes).to(torch.int64)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    callbacks=[dl.AccuracyCallback(num_classes=num_classes)],
    fp16=True,
)

ML - multiclass classification (advanced fp16 training version)

# pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" git+https://github.com/NVIDIA/apex
import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, ) * num_classes).to(torch.int64)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    callbacks=[dl.AccuracyCallback(num_classes=num_classes)],
    fp16=dict(apex=True, opt_level="O1"),
)

ML - Linear Regression (distributed training version)

#!/usr/bin/env python
import torch
from torch.utils.data import TensorDataset
from catalyst.dl import SupervisedRunner, utils

def datasets_fn(num_features: int):
    X = torch.rand(int(1e4), num_features)
    y = torch.rand(X.shape[0])
    dataset = TensorDataset(X, y)
    return {"train": dataset, "valid": dataset}

def train():
    num_features = int(1e1)
    # model, criterion, optimizer, scheduler
    model = torch.nn.Linear(num_features, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters())
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [3, 6])

    runner = SupervisedRunner()
    runner.train(
        model=model,
        datasets={
            "batch_size": 32,
            "num_workers": 1,
            "get_datasets_fn": datasets_fn,
            "num_features": num_features,  # will be passed to datasets_fn
        },
        criterion=criterion,
        optimizer=optimizer,
        scheduler=scheduler,
        logdir="./logs/example_distributed_ml",
        num_epochs=8,
        verbose=True,
        distributed=False,
    )

utils.distributed_cmd_run(train)

CV - classification with AutoEncoder (distributed training version)

#!/usr/bin/env python
import os
import torch
from torch import nn
from torch.nn import functional as F
from catalyst import dl, metrics, utils
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST

class ClassifyAE(nn.Module):

    def __init__(self, in_features, hid_features, out_features):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_features, hid_features), nn.Tanh())
        self.decoder = nn.Linear(hid_features, in_features)
        self.clf = nn.Linear(hid_features, out_features)

    def forward(self, x):
        z = self.encoder(x)
        y_hat = self.clf(z)
        x_ = self.decoder(z)
        return y_hat, x_

class CustomRunner(dl.Runner):

    def _handle_batch(self, batch):
        x, y = batch
        x = x.view(x.size(0), -1)
        y_hat, x_ = self.model(x)

        loss_clf = F.cross_entropy(y_hat, y)
        loss_ae = F.mse_loss(x_, x)
        loss = loss_clf + loss_ae
        accuracy01, accuracy03, accuracy05 = metrics.accuracy(y_hat, y, topk=(1, 3, 5))
        self.batch_metrics = {
            "loss_clf": loss_clf,
            "loss_ae": loss_ae,
            "loss": loss,
            "accuracy01": accuracy01,
            "accuracy03": accuracy03,
            "accuracy05": accuracy05,
        }

        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

def datasets_fn():
    dataset = MNIST(os.getcwd(), train=False, download=True, transform=ToTensor())
    return {"train": dataset, "valid": dataset}

def train():
    model = ClassifyAE(28 * 28, 128, 10)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.02)

    runner = CustomRunner()
    runner.train(
        model=model,
        optimizer=optimizer,
        datasets={
            "batch_size": 32,
            "num_workers": 1,
            "get_datasets_fn": datasets_fn,
        },
        logdir="./logs/distributed_ae",
        num_epochs=8,
        verbose=True,
    )

utils.distributed_cmd_run(train)

ML - multiclass classification (TPU version)

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl, utils

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, ) * num_classes).to(torch.int64)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# device (TPU > GPU > CPU)
device = utils.get_device()  # <--------- TPU device

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes).to(device)
criterion = torch.nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.Adam(model.parameters())

# model training
runner = dl.SupervisedRunner(device=device)
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    callbacks=[dl.AccuracyCallback(num_classes=num_classes)]
)

AutoML - hyperparameters optimization with Optuna

import os
import optuna
import torch
from torch import nn
from torch.utils.data import DataLoader
from catalyst import dl
from catalyst.contrib.data.cv import ToTensor
from catalyst.contrib.datasets import MNIST
from catalyst.contrib.nn import Flatten
    

def objective(trial):
    lr = trial.suggest_loguniform("lr", 1e-3, 1e-1)
    num_hidden = int(trial.suggest_loguniform("num_hidden", 32, 128))

    loaders = {
        "train": DataLoader(MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32),
        "valid": DataLoader(MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32),
    }
    model = nn.Sequential(
        Flatten(), nn.Linear(784, num_hidden), nn.ReLU(), nn.Linear(num_hidden, 10)
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    runner = dl.SupervisedRunner()
    runner.train(
        model=model,
        loaders=loaders,
        criterion=criterion,
        optimizer=optimizer,
        callbacks=[
            dl.OptunaCallback(trial),
            dl.AccuracyCallback(num_classes=10),
        ],
        num_epochs=10,
        main_metric="accuracy01",
        minimize_metric=False,
    )
    return runner.best_valid_metrics[runner.main_metric]

study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.MedianPruner(
        n_startup_trials=1, n_warmup_steps=0, interval_steps=1
    ),
)
study.optimize(objective, n_trials=10, timeout=300)
print(study.best_value, study.best_params)

Features

  • Universal train/inference loop.
  • Configuration files for model/data hyperparameters.
  • Reproducibility – all source code and environment variables will be saved.
  • Callbacks – reusable train/inference pipeline parts with easy customization.
  • Training stages support.
  • Deep Learning best practices - SWA, AdamW, Ranger optimizer, OneCycle, and more.
  • Development best practices - fp16 support, distributed training, slurm support.
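
To illustrate the callback idea from the list above, here is a small plain-Python sketch of a loop that delegates work to reusable callback objects. All class, method, and key names (`Callback`, `LossAverager`, `HistoryLogger`, `run`, the `state` dict) are hypothetical and simplified; Catalyst's real callback interface is richer and is documented separately.

```python
class Callback:
    """Hypothetical base class: hooks default to no-ops."""

    def on_epoch_start(self, state): pass
    def on_batch_end(self, state): pass
    def on_epoch_end(self, state): pass


class LossAverager(Callback):
    """Collects batch losses and writes their mean at the end of each epoch."""

    def on_epoch_start(self, state):
        state["batch_losses"] = []

    def on_batch_end(self, state):
        state["batch_losses"].append(state["loss"])

    def on_epoch_end(self, state):
        losses = state["batch_losses"]
        state["epoch_loss"] = sum(losses) / len(losses)


class HistoryLogger(Callback):
    """Records the epoch loss computed by an earlier callback."""

    def __init__(self):
        self.history = []

    def on_epoch_end(self, state):
        self.history.append(state["epoch_loss"])


def run(num_epochs, batches, step_fn, callbacks):
    """Generic loop: the loop itself knows nothing about metrics or logging."""
    state = {}
    for epoch in range(num_epochs):
        state["epoch"] = epoch
        for cb in callbacks:
            cb.on_epoch_start(state)
        for batch in batches:
            state["loss"] = step_fn(batch, epoch)
            for cb in callbacks:
                cb.on_batch_end(state)
        for cb in callbacks:
            cb.on_epoch_end(state)
    return state


logger = HistoryLogger()
# Toy "training step": the loss shrinks as the epoch index grows.
run(2, [1.0, 3.0], lambda batch, epoch: batch / (epoch + 1), [LossAverager(), logger])
```

Callback order matters in this sketch: `HistoryLogger` reads the value that `LossAverager` writes, so it must come later in the list.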

Structure

  • callbacks - a variety of callbacks for your train-loop customization.
  • contrib - additional modules contributed by Catalyst users.
  • core - framework core with main abstractions - Experiment, Runner and Callback.
  • data - useful tools and scripts for data processing.
  • dl - entrypoint for your deep learning experiments.
  • experiments - a number of useful experiment extensions for Notebook and Config API.
  • metrics - classic ML and CV/NLP/RecSys metrics.
  • registry - Catalyst global registry for Config API.
  • runners - runner extensions for different deep learning tasks.
  • tools - extra tools for Deep Learning research, class-based helpers.
  • utils - typical utils for Deep Learning research, function-based helpers.

Tests

All Catalyst code, features and pipelines are fully tested with our own catalyst-codestyle.

In fact, we train a number of different models for a variety of tasks - image classification, image segmentation, text classification, GAN training and much more. During the tests, we compare their convergence metrics to verify the correctness of the training procedure and its reproducibility.

As a result, Catalyst provides fully tested and reproducible best practices for your deep learning research.
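
As a toy illustration of what a convergence-based test can look like, the sketch below fits a one-parameter linear model with plain gradient descent, then checks that the loss converged and that a second run reproduces the first exactly. It is an assumption about the general shape of such tests, not a copy of Catalyst's test suite; `train_linear` and `check_convergence` are hypothetical names.

```python
def train_linear(xs, ys, lr=0.1, num_epochs=50):
    """Fit y = w * x by full-batch gradient descent on mean squared error."""
    w = 0.0
    losses = []
    for _ in range(num_epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        losses.append(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
    return w, losses


def check_convergence(losses, threshold=1e-3):
    """Convergence check: final loss is small and lower than the first one."""
    return losses[-1] < threshold and losses[-1] < losses[0]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x for x in xs]  # ground truth: w = 2

w, losses = train_linear(xs, ys)
converged = check_convergence(losses)
# Reproducibility: a second run with the same inputs gives identical losses.
reproducible = train_linear(xs, ys)[1] == losses
```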

Catalyst

Tutorials

Blogposts

Docs

Projects

Examples, notebooks and starter kits

Competitions

Paper implementations

Tools and pipelines

Talks

Community

Contribution guide

We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion. If you plan to contribute new features, utility functions or extensions, please first open an issue and discuss the feature with us.

User feedback

We have created [email protected] for "user feedback".

  • If you like the project and want to say thanks, this is the right place.
  • If you would like to start a collaboration between your team and the Catalyst team to do better Deep Learning R&D - you are always welcome.
  • If you just don't like GitHub issues and this way suits you better - feel free to email us.
  • Finally, if you do not like something, please share it with us and we can see how to improve it.

We appreciate any type of feedback. Thank you!

Acknowledgments

Since the beginning of Catalyst's development, many people have influenced it in many different ways.

Catalyst.Team

Catalyst - Metric Learning team

Catalyst.Contributors

Catalyst.Friends

Trusted by

Supported by

Citation

Please use this bibtex if you want to cite this repository in your publications:

@misc{catalyst,
    author = {Kolesnikov, Sergey},
    title = {Accelerated deep learning R&D},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/catalyst-team/catalyst}},
}
Comments
  • Version/19.03

    catalyst-dl 19.02 proposal

    main goal:

    from catalyst import Runner
    from expdir import MyExperiment
    
    Runner.run(MyExperiment) (train/infer)
    

    typical run:

    mode (train/infer)
    	stage
    		epoch
    			loader
    				batch
    

    during stage - model/etc are the same, between stages - can be easily replaced
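    The nesting above can be sketched as plain loops. This is only an illustration of the stage/epoch/loader/batch hierarchy; the `run` function, its arguments, and the toy "models" are hypothetical and are not part of the Catalyst API.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical types for illustration only; not Catalyst's actual API.
Batch = List[float]
Loaders = Dict[str, Iterable[Batch]]


def run(
    stages: Dict[str, Callable[[], Callable[[Batch], float]]],
    loaders: Loaders,
    num_epochs: int,
) -> List[tuple]:
    """Walk the stage -> epoch -> loader -> batch hierarchy and record each step."""
    trace = []
    for stage_name, make_model in stages.items():
        # The model is created once per stage and reused for every epoch in it;
        # a new stage is free to build a different one.
        model = make_model()
        for epoch in range(num_epochs):
            for loader_name, loader in loaders.items():
                for batch in loader:
                    trace.append((stage_name, epoch, loader_name, model(batch)))
    return trace


stages = {
    "pretrain": lambda: (lambda batch: sum(batch)),
    "finetune": lambda: (lambda batch: max(batch)),
}
loaders = {"train": [[1.0, 2.0], [3.0, 4.0]], "valid": [[5.0, 6.0]]}
trace = run(stages, loaders, num_epochs=2)
```

    Each entry of `trace` records which stage, epoch, and loader produced a given batch result, which makes the replacement of the model between stages visible.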

    main entities:

    • Registry - factory for registering user extensions
    • Experiment - keeper of the config, knows how to create model / etc, but does not keep them
    • State - all information about what is in the experiment now, in the current stage
    • Runner - runner, responsible for main logic
    • Callbacks - additional user extensions for changing the runner's work a bit
    WIP 
    opened by Scitator 43
  • WandB batch metrics logging error

    WandB batch metrics logging error

    🐛 Bug Report

    In wandb all batch metrics are logged as single value per epoch.

    Expected behavior

    Batch metrics must be logged once per step.

    Catalyst version: 21.7
    

    Additional context

    The problem is here:

    https://github.com/catalyst-team/catalyst/blob/master/catalyst/loggers/wandb.py#L115

    Step must be equal to global_sample_step, not global_epoch_step.
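    A minimal sketch of the intended behavior, assuming the logger receives a scope string and has access to both counters. The `Counters` class, the `step_for_scope` helper, and the scope names are hypothetical; only `global_sample_step` and `global_epoch_step` come from the report.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Hypothetical stand-in for the runner's step counters."""

    global_sample_step: int = 0
    global_epoch_step: int = 0


def step_for_scope(scope: str, counters: Counters) -> int:
    """Pick the x-axis value for a metric logged at the given scope."""
    if scope == "batch":
        # Batch metrics need a counter that advances inside the epoch.
        return counters.global_sample_step
    # Loader- and epoch-level metrics are logged once per epoch.
    return counters.global_epoch_step


counters = Counters()
logged = []
for epoch in range(2):
    counters.global_epoch_step += 1
    for batch_size in (32, 32, 16):
        counters.global_sample_step += batch_size
        logged.append(step_for_scope("batch", counters))
```

    With the epoch counter used for every scope, all batches of an epoch would share one step value, which matches the symptom described above.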

    bug help wanted 
    opened by ivan-chai 20
  • Evaluate for Runner

    Evaluate for Runner

    🚀 Feature Request

    The evaluate_loader method for Python API. Similar to .train and .predict_loader

    Motivation

    Proposal

    Possible use case

    import os
    from torch import nn, optim
    from torch.utils.data import DataLoader
    from catalyst import dl, utils
    from catalyst.data.transforms import ToTensor
    from catalyst.contrib.datasets import MNIST
    
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.02)
    
    loaders = {
        "train": DataLoader(
            MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32
        ),
        "valid": DataLoader(
            MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32
        ),
    }
    
    runner = dl.SupervisedRunner(
        input_key="features", output_key="logits", target_key="targets", loss_key="loss"
    )
    # model training
    runner.train(
        model=model,
        criterion=criterion,
        optimizer=optimizer,
        loaders=loaders,
        num_epochs=1,
        callbacks=[
            dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3, 5)),
            dl.PrecisionRecallF1SupportCallback(
                input_key="logits", target_key="targets", num_classes=10
            ),
            dl.AUCCallback(input_key="logits", target_key="targets"),
            # catalyst[ml] required ``pip install catalyst[ml]``
            # dl.ConfusionMatrixCallback(input_key="logits", target_key="targets", num_classes=10),
        ],
        logdir="./logs",
        valid_loader="valid",
        valid_metric="loss",
        minimize_valid_metric=True,
        verbose=True,
        load_best_on_end=True,
    )
    
    loader_metrics = runner.evaluate_loader(
        loader=loaders["valid"],
        callbacks=[
            dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3, 5)),
            dl.PrecisionRecallF1SupportCallback(
                input_key="logits", target_key="targets", num_classes=10
            ),
        ],
    )
    

    Alternatives

    The whole method could easily be done with the .train approach, but for a more user-friendly API, why shouldn't we add a simplified alias?

    Additional context

    Checklist

    • [x] feature proposal description
    • [x] motivation
    • [x] extra proposal context / proposal alternatives review

    FAQ

    Please review the FAQ before submitting an issue:

    enhancement help wanted good first issue 
    opened by Scitator 17
  • Naming inconsistency

    Naming inconsistency

    Describe the bug: I found that some argument names in the framework aren't consistent. For example:

    class SupervisedRunner(Runner):
        """Runner for experiments with supervised model."""
    
        _experiment_fn: Callable = SupervisedExperiment
    
        def __init__(
            self,
            model: Model = None,
            device: Device = None,
            input_key: Any = "features", 
            output_key: Any = "logits",
            input_target_key: str = "targets", # This argument corresponds to input_key argument in CriterionCallback
        ):
    
    class CriterionCallback(_MetricCallback):
        """Callback for that measures loss with specified criterion."""
    
        def __init__(
            self,
            input_key: Union[str, List[str], Dict[str, str]] = "targets", # This argument corresponds to input_target_key argument in SupervisedRunner
            output_key: Union[str, List[str], Dict[str, str]] = "logits",
            prefix: str = "loss",
            criterion_key: str = None,
            multiplier: float = 1.0,
            **metric_kwargs,
        ):
    

    To Reproduce Steps to reproduce the behavior:

    1. Check files: catalyst.core.callback.metric.py and catalyst.dl.runner.supervised.py

    Expected behavior: I expect argument names to be consistent across the framework and to mean the same thing everywhere.

    enhancement help wanted good first issue question wontfix 
    opened by ogvalt 17
  • Update ce.py

    Update ce.py

    Description

    Implementation of Symmetric Cross Entropy
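    For orientation, symmetric cross entropy combines the usual cross entropy with a reverse term in which prediction and target swap roles. The sketch below is a framework-free, single-sample version and is not the code from this PR; the default `alpha`, `beta`, and the `eps` clamp for the one-hot target are assumptions.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_cross_entropy(
    logits: Sequence[float],
    target: int,
    alpha: float = 1.0,  # weight of the standard term (assumed default)
    beta: float = 1.0,  # weight of the reverse term (assumed default)
    eps: float = 1e-4,  # clamp for the zeros of the one-hot target (assumed)
) -> float:
    """alpha * CE + beta * RCE for a single sample."""
    probs = softmax(logits)
    # Standard cross entropy: -log p(target), guarded against log(0).
    ce = -math.log(max(probs[target], 1e-7))
    # Reverse cross entropy: -sum_k p(k) * log q(k), where q is the one-hot
    # target with its zeros clamped to eps so the logarithm stays finite.
    rce = -sum(
        p * math.log(1.0 if k == target else eps) for k, p in enumerate(probs)
    )
    return alpha * ce + beta * rce
```

    The reverse term penalizes probability mass placed on non-target classes with a bounded cost, which is what makes the combined loss less sensitive to noisy labels than plain cross entropy.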

    Related Issue

    https://github.com/catalyst-team/catalyst/issues/479

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] Improvement (non-breaking change which improves an existing feature)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Checklist

    • [x] I have read the Code of Conduct document.
    • [x] I have read the Contributing guide.
    • [ ] I have checked the code-style using make check-style.
    • [x] I have written the docstring in Google format for all the methods and classes that I used.
    • [ ] I have checked the docs using make check-docs.
    enhancement good first issue WIP 
    opened by KyloRen1 17
  • Triplet loss epic

    Triplet loss epic

    best triplet loss ever

    https://github.com/adambielski/siamese-triplet https://github.com/andreasveit/triplet-network-pytorch https://github.com/CoinCheung/triplet-reid-pytorch https://discuss.pytorch.org/t/triplet-loss-in-pytorch/30634
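    As a reference point for the discussion, the basic margin-based triplet loss for one (anchor, positive, negative) triplet can be written without any framework. The function names and the default margin are illustrative assumptions and do not correspond to any of the linked repositories.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed default
) -> float:
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)
```

    The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only "hard" triplets contribute gradient.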

    enhancement good first issue 
    opened by ermakovpetr 16
  • Add support for WandbLogger

    Add support for WandbLogger

    Before submitting (checklist)

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contribution guide?
    • [x] Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle).
    • [x] Did you make sure to update the docs? We use Google format for all the methods and classes.
    • [x] Did you check the docs with make check-docs?
    • [x] Did you write any new necessary tests?
    • [x] Did you check that your code passes the unit tests pytest . ?
    • [x] Did you add your new functionality to the docs?
    • [x] Did you update the CHANGELOG?
    • [ ] Did you run colab minimal CI/CD with latest and minimal requirements?

    Description

    This PR adds support for WandbLogger that enables logging metrics and media to W&B dashboard

    Related Issue

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] Improvement (non-breaking change which improves an existing feature)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    PR review

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Additional Details:

    • The minimum tests colab seemed stuck while running tests, but the tests passed on my machine. I'll update this thread with test results
    • I've made this draft PR as I still want to confirm the hyperparameter logging behavior. On running the test_finetune2.py train method, no hyperparameters are being logged.

    Test logs

    Code style --> catalyst-make-codestyle && catalyst-check-codestyle

    python3.8/site-packages/isort/settings.py:619: UserWarning: Failed to pull configuration information from /home/saksham/Desktop/catalyst/setup.cfg
      warn(f"Failed to pull configuration information from {potential_config_file}")
    Skipped 55 files
    python3.8/site-packages/isort/main.py:1000: UserWarning: W0501: The following deprecated CLI flags were used and ignored: --apply!
      warn(python3.8/site-packages/isort/main.py:1004: UserWarning: W0500: Please see the 5.0.0 Upgrade guide: https://pycqa.github.io/isort/docs/upgrade_guides/5.0.0/
      warn(
    All done! ✨ 🍰 ✨
    350 files left unchanged.
    python3.8/site-packages/isort/settings.py:619: UserWarning: Failed to pull configuration information from /home/catalyst/setup.cfg
      warn(f"Failed to pull configuration information from {potential_config_file}")
    Skipped 55 files
    All done! ✨ 🍰 ✨
    350 files would be left unchanged.
    Failed to pull configuration information from home/catalyst/setup.cfg
    0
    

    Docs check -->rm -rf ./builds; REMOVE_BUILDS=0 make check-docs

    reading sources... [100%] tutorials/ddp                                                                                                                                                              
    looking for now-outdated files... none found
    pickling environment... done
    checking consistency... done
    preparing documents... done
    writing output... [100%] tutorials/ddp                                                                                                                                                               
    generating indices...  genindex py-modindexdone
    highlighting module code... [100%] torch.utils.data.sampler                                                                                                                                          
    writing additional pages...  search/home/saksham/anaconda3/envs/catalyst_dev/lib/python3.8/site-packages/catalyst_sphinx_theme/search.html:21: RemovedInSphinx30Warning: To modify script_files in the theme is deprecated. Please insert a <script> tag directly in your theme instead.
      {% trans %}Please activate JavaScript to enable the search
    done
    copying static files... ... done
    copying extra files... done
    dumping search index in English (code: en)... done
    dumping object inventory... done
    build succeeded.
    
    The HTML pages are in builds.
    #### CODE: 0 ####
    

    Tests --> pytest .

    337 passed, 134 skipped, 2 xfailed, 93 warnings in 439.09s (0:07:19)
    

    @Scitator Let me know if I missed any steps here

    FAQ

    Please review the FAQ before submitting an issue:

    opened by AyushExel 15
  • updated dl_cpu(workflows)- For passing CI-Tests

    updated dl_cpu(workflows)- For passing CI-Tests

    Before submitting (checklist)

    • [ ] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [ ] Did you read the contribution guide?
    • [ ] Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle).
    • [ ] Did you make sure to update the docs? We use Google format for all the methods and classes.
    • [ ] Did you check the docs with make check-docs?
    • [ ] Did you write any new necessary tests?
    • [ ] Did you check that your code passes the unit tests pytest . ?
    • [ ] Did you add your new functionality to the docs?
    • [ ] Did you update the CHANGELOG?

    Description

    Related Issue

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] Improvement (non-breaking change which improves an existing feature)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    PR review

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    PS

    • [x] I know, that I could join slack for pull request discussion.

    Note

    Mentioned in comment of #1131. so that CI test will run properly

    opened by Atharva-Phatak 15
  • Fixed  OneCycleLRWithWarmup

    Fixed OneCycleLRWithWarmup

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contribution guide?
    • [x] Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle). I was not able to check: it shows "'catalyst-make-codestyle' is not recognized as an internal or external command". Please suggest how to do this. I am in a Windows 10 environment.
    • [ ] Did you make sure to update the docs? We use Google format for all the methods and classes.
    • [x] Did you check the docs with make check-docs?
    • [x] Did you write any new necessary tests?
    • [ ] Did you add your new functionality to the docs?
    • [x] Did you update the CHANGELOG?
    • [x] You can use 'Login as guest' to see Teamcity build logs.

    Description

    OneCycleLRWithWarmup starts ahead of initial LR (does not start with init_lr)
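    One way such a symptom can arise is an off-by-one in the step index of the warmup phase. The sketch below is a generic linear warmup, not the Catalyst scheduler; the function names and the choice of a linear ramp are assumptions made only to show the difference.

```python
def warmup_lr(step: int, warmup_steps: int, init_lr: float, max_lr: float) -> float:
    """Linear warmup from init_lr (at step 0) to max_lr (at step warmup_steps)."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return max_lr
    return init_lr + (max_lr - init_lr) * step / warmup_steps


def warmup_lr_off_by_one(
    step: int, warmup_steps: int, init_lr: float, max_lr: float
) -> float:
    """Same ramp, but indexed from step + 1, so init_lr is never produced."""
    return warmup_lr(step + 1, warmup_steps, init_lr, max_lr)


# Learning rates for steps 0..4 with a 4-step warmup.
schedule = [warmup_lr(s, 4, 0.001, 0.005) for s in range(5)]
```

    With the correct indexing the first optimizer step uses exactly `init_lr`; with the shifted indexing the first value is already one increment above it.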

    Related Issue

    https://github.com/catalyst-team/catalyst/issues/851

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] Improvement (non-breaking change which improves an existing feature)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    PR review

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    opened by lokeshkvn 15
  • Fixed gradient tracking

    Fixed gradient tracking

    Description

    Fixed storing gradients in OptimizerCallback

    Related Issue

    Type of Change

    • [ ] Examples / docs / tutorials / contributors update
    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] Improvement (non-breaking change which improves an existing feature)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to change)

    Checklist

    • [x] I have read the Code of Conduct document.
    • [x] I have read the Contributing guide.
    • [x] I have checked the code-style using make check-codestyle.
    • [x] I have written tests for all new methods and classes that I created.
    • [x] I have written the docstring in Google format for all the methods and classes that I used.
    • [x] I have checked the docs using make check-docs.
    • [x] I have read I need to click 'Login as guest' to see Teamcity build logs.
    enhancement WIP 
    opened by pdanilov 15
  • Accumulate gradient

    Accumulate gradient

    I was trying to use the gradient accumulation feature but ran into an error. Training works without OptimizerCallback(accumulation_steps=2).

    runner.train(
        model=model,
        criterion=criterion,
        optimizer=optimizer,
        scheduler=scheduler,
        loaders=loaders,
        callbacks=[DiceCallback(), EarlyStoppingCallback(patience=5, min_delta=0.001), 
                                OptimizerCallback(accumulation_steps=2)],
        logdir=logdir,
        num_epochs=num_epochs,
        verbose=True
    )
    

    FYI, the error message:

    0/60 * Epoch (train): 0% 0/624 [00:00<?, ?it/s]

    TypeError Traceback (most recent call last) in 9 logdir=logdir, 10 num_epochs=num_epochs, ---> 11 verbose=True 12 )

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/runner/supervised.py in train(self, model, criterion, optimizer, loaders, logdir, callbacks, scheduler, resume, num_epochs, valid_loader, main_metric, minimize_metric, verbose, state_kwargs, checkpoint_data, fp16, monitoring_params, check) 195 monitoring_params=monitoring_params 196 ) --> 197 self.run_experiment(experiment, check=check) 198 199 def infer(

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in run_experiment(self, experiment, check) 229 except (Exception, KeyboardInterrupt) as ex: 230 self.state.exception = ex --> 231 self._run_event("exception") 232 233 return self

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in run_event(self, event) 100 101 if self.state is not None and hasattr(self.state, f"on{event}post"): --> 102 getattr(self.state, f"on{event}_post")() 103 104 @abstractmethod

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/state.py in on_exception_post(self) 183 def on_exception_post(self): 184 for logger in self.loggers.values(): --> 185 logger.on_exception(self) 186 187

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/callbacks/logging.py in on_exception(self, state) 194 195 if state.need_reraise_exception: --> 196 raise exception 197 198

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in run_experiment(self, experiment, check) 226 try: 227 for stage in self.experiment.stages: --> 228 self._run_stage(stage) 229 except (Exception, KeyboardInterrupt) as ex: 230 self.state.exception = ex

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in _run_stage(self, stage) 199 200 self._run_event("epoch_start") --> 201 self._run_epoch(loaders) 202 self._run_event("epoch_end") 203

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in _run_epoch(self, loaders) 186 self._run_event("loader_start") 187 with torch.set_grad_enabled(self.state.need_backward): --> 188 self._run_loader(loader) 189 self._run_event("loader_end") 190

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in _run_loader(self, loader) 148 149 for i, batch in enumerate(loader): --> 150 self._run_batch(batch) 151 152 self.state.timer.reset()

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in _run_batch(self, batch) 130 self.state.timer.stop("_timers/model_time") 131 self.state.timer.stop("_timers/batch_time") --> 132 self._run_event("batch_end") 133 134 def _run_loader(self, loader):

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/runner.py in run_event(self, event) 97 if self.callbacks is not None: 98 for callback in self.callbacks.values(): ---> 99 getattr(callback, f"on{event}")(self.state) 100 101 if self.state is not None and hasattr(self.state, f"on_{event}_post"):

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/callbacks/optimizer.py in on_batch_end(self, state) 117 return 118 --> 119 loss = self._get_loss(state) 120 121 self._accumulation_counter += 1

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/callbacks/optimizer.py in _get_loss(self, state) 91 92 def _get_loss(self, state) -> torch.Tensor: ---> 93 loss = state.get_key(key="loss", inner_key=self.loss_key) 94 95 if isinstance(loss, list):

    ~/.conda/envs/mmdet_cloud/lib/python3.6/site-packages/catalyst/dl/core/state.py in get_key(self, key, inner_key) 114 return getattr(self, key) 115 else: --> 116 return getattr(self, key)[inner_key] 117 118 def set_key(self, value, key, inner_key=None):

    TypeError: 'NoneType' object is not subscriptable
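    For context on what `accumulation_steps=2` is meant to achieve, here is a framework-free sketch of gradient accumulation on a one-parameter model. It does not diagnose the error above; the helper names and the toy data are assumptions for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def mse_grad(w: float, batch: List[Sample]) -> float:
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_with_accumulation(
    w: float, batches: List[List[Sample]], lr: float, accumulation_steps: int
) -> float:
    """Apply an update only after every `accumulation_steps` batches."""
    accumulated = 0.0
    for i, batch in enumerate(batches, start=1):
        # Scale each contribution so the sum matches one large-batch gradient.
        accumulated += mse_grad(w, batch) / accumulation_steps
        if i % accumulation_steps == 0:
            w -= lr * accumulated  # plays the role of optimizer.step()
            accumulated = 0.0  # plays the role of optimizer.zero_grad()
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_accum = train_with_accumulation(
    0.0, [data[:2], data[2:]], lr=0.01, accumulation_steps=2
)
w_full = train_with_accumulation(0.0, [data], lr=0.01, accumulation_steps=1)
```

    Two accumulated half-batches produce the same update as one full batch, which is the property that lets accumulation emulate a larger batch size under memory limits.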

    bug 
    opened by wmmxk 14
  • No utils.initialization file

    No utils.initialization file

    🐛 Bug Report

    The initialization file under the utils folder does not exist in this repo, nor is it present after installation, so importing utils.initialization fails with: AttributeError: module 'catalyst.utils' has no attribute 'initialization'

    Expected behavior

    bug help wanted 
    opened by Klins101 2
  • Multi Criterion Training

    Multi Criterion Training

    Error in Multi Criterion Training

    weights = [0.2,0.3]
    class_weights = torch.FloatTensor(weights).to(device) #.cuda()
    criterion = {"CE_Loss1": nn.CrossEntropyLoss(weight=class_weights),"CE_Loss2": nn.CrossEntropyLoss()} 
    ....
    ....
    loss1 = self.criterion["CE_Loss1"](self.batch["logits1"], self.batch["targets1"])
    loss2 = self.criterion["CE_Loss2"](self.batch["logits2"], self.batch["targets2"])
    loss_ce1ce2 = loss1 + loss2
    self.batch_metrics.update({"loss_ce1": loss1, 
                               "loss_ce2": loss2, 
                               "loss_ce1ce2": loss_ce1ce2})
    
    for key in ["loss_ce1", "loss_ce2", "loss_ce1ce2"]:
            self.meters[key].update(self.batch_metrics[key].item(), self.batch_size)
    
    if self.is_train_loader:
        self.engine.backward(loss_ce1ce2) #causing problem
        self.optimizer.step()
        self.optimizer.zero_grad()
    

    Hi, I am trying to train a model using multi-criterion. Part of code for computing the loss is shown above. Doing so I am getting the following error.

    RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

    Can anyone please check if I am doing the correct way?

    help wanted question 
    opened by GirinChutia 3
  •  utils.process_model_params

    utils.process_model_params

    I got this error when running my program: [ AttributeError: module 'catalyst.utils' has no attribute 'process_model_params' ]. How can I use Catalyst to process model parameters, as with catalyst.utils.process_model_params?

    help wanted question wontfix 
    opened by Chris-Ran 2
  • Crashes on 2xT4 GPUs

    Crashes on 2xT4 GPUs

    🐛 Bug Report

    Catalyst fails on 2xT4 GPUs.

    We install Catalyst in the Kaggle base image. This week we wanted to release a new image with upgraded packages. It doesn't look like Catalyst was upgraded, but Accelerate was (from 0.12 to 0.13.1).

    How To Reproduce

    Steps to reproduce the behavior: Run this unit test on a 2xT4 GPU.

    Code sample

    https://github.com/Kaggle/docker-python/blob/main/tests/test_catalyst.py

    Screenshots

    Screen Shot 2022-10-18 at 9 55 46 AM

    Expected behavior

    The test passes on a P100 GPU.

    Environment

    https://gist.github.com/Philmod/0349a2cf16d76e8d20e960d750962241

    Checklist

    • [x] bug description
    • [x] steps to reproduce
    • [x] expected behavior
    • [x] environment
    • [x] code sample / screenshots

    FAQ

    Please review the FAQ before submitting an issue:

    bug help wanted wontfix 
    opened by Philmod 3
  • Custom loader stages

    Custom loader stages

    🚀 Feature Request

    In addition to loader ∈ [train, valid, infer], a user should be able to define a custom loader stage.

    Motivation

    The purpose of loader is to switch between datasets that will be fed into the pipeline. Therefore, the natural use cases are:

    1. Running inference on multiple datasets (e.g. infer_coco, infer_kodak, infer_vimeo90k, ...) and tracking their metrics in the same way as infer.
    2. Other custom stages that involve analysis using subsets of data and data loaders.

    Proposal

    self.{train/valid/infer} variables need to be converted to functions/dictionaries. For example:

    def handle_batch(batch):
        # Previously:
        if self.is_infer_coco_loader:
            ...
        # Proposed:
        if self.loader == "infer_coco":
            ...
        if self.loader == "infer_vimeo90k":
            ...
        # Or, perhaps less Pythonically:
        if self.is_loader("infer_coco"):
            ...
    

    self.is_infer_loader and similar can be kept, and perhaps even later deprecated.

    Type-hints should be naturally converted via the type transformation T -> Mapping[str, T].
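    A minimal sketch of the proposal, assuming the runner stores the active loader's key and derives everything else from it. `LoaderAwareRunner`, `loader_key`, and the toy batch handling are hypothetical and only illustrate the `is_loader(...)` check and the `Mapping[str, T]` shape.

```python
from typing import Dict, List, Mapping


class LoaderAwareRunner:
    """Hypothetical runner that tracks the active loader by key, not by flags."""

    def __init__(self, loaders: Mapping[str, List[float]]):
        self.loaders = loaders
        self.loader_key = ""
        # Per-loader state follows the T -> Mapping[str, T] transformation.
        self.loader_metrics: Dict[str, float] = {}

    def is_loader(self, key: str) -> bool:
        return self.loader_key == key

    @property
    def is_infer_loader(self) -> bool:
        # The old-style flag can be kept as a thin wrapper over the key.
        return self.loader_key.startswith("infer")

    def handle_batch(self, batch: float) -> float:
        if self.is_loader("infer_coco"):
            return batch * 2  # dataset-specific branch
        return batch

    def run(self) -> Dict[str, float]:
        for key, loader in self.loaders.items():
            self.loader_key = key
            self.loader_metrics[key] = sum(self.handle_batch(b) for b in loader)
        return self.loader_metrics


runner = LoaderAwareRunner({"infer_coco": [1.0, 2.0], "infer_kodak": [1.0, 2.0]})
metrics_by_loader = runner.run()
```

    Any number of custom loader names can be added this way without introducing a new boolean attribute per loader.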

    Alternatives

    Implementing entire "infer" loop from scratch in on_epoch_end for each dataset/loader. Or chaining dataloaders (kind of weird, too). These wouldn't really be clean, and would not generalize to other non-standard use cases.

    Additional context

    N/A

    Checklist

    • [x] feature proposal description
    • [x] motivation
    • [x] extra proposal context / proposal alternatives review

    FAQ

    Please review the FAQ before submitting an issue:

    enhancement help wanted 
    opened by YodaEmbedding 1
Releases(v22.04)
  • v22.04(Apr 29, 2022)

  • v22.02.1(Feb 27, 2022)

  • v22.02(Feb 13, 2022)

    [22.02] - 2022-02-13

    Tl;dr

    Added

    • Additional tests for different hardware accelerators setups. Please check out the tests/pipelines folder for more information.
    • BackwardCallback and BackwardCallbackOrder as an abstraction on top of loss.backward. Now you could easily log model gradients or transform them before OptimizerCallback.
    • CheckpointCallbackOrder for ICheckpointCallback.
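    The idea behind a dedicated backward order can be sketched with a plain enum: callbacks are sorted by order, so anything placed between the backward pass and the optimizer sees gradients before the update. The enum values, the `GRADIENT_HOOK` slot, and the `Callback` class below are hypothetical and do not reproduce Catalyst's actual order constants.

```python
from enum import IntEnum
from typing import List


class Order(IntEnum):
    """Hypothetical order values; only their relative ordering matters here."""

    BACKWARD = 1
    GRADIENT_HOOK = 2
    OPTIMIZER = 3


class Callback:
    def __init__(self, name: str, order: Order):
        self.name = name
        self.order = order


def sort_callbacks(callbacks: List[Callback]) -> List[Callback]:
    """Run lower orders first, so gradients exist before the optimizer step."""
    return sorted(callbacks, key=lambda cb: cb.order)


callbacks = [
    Callback("optimizer", Order.OPTIMIZER),
    Callback("clip_gradients", Order.GRADIENT_HOOK),
    Callback("backward", Order.BACKWARD),
]
execution = [cb.name for cb in sort_callbacks(callbacks)]
```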

    Changed

    • Minimal python version moved to 3.7, minimal PyTorch version moved to 1.4.0.
    • Engines were rewritten on top of Accelerate. First, we found these two abstractions very close to each other. Second, Accelerate provides additional user-friendly API and more stable API for "Nvidia APEX" and "Facebook Fairscale" - it does not support them.
    • SelfSupervisedRunner moved to the examples folder from the Catalyst API. The only Runners API, that will be supported in the future: IRunner, Runner, ISupervisedRunner, SupervisedRunner due to their consistency. If you are interested in any other Runner API - feel free to write your own CustomRunner and use SelfSupervisedRunner as an example.
    • Runner.{global/stage}_{batch/loader/epoch}_metrics renamed to Runner.{batch/loader/epoch}_metrics
    • CheckpointCallback rewritten from scratch.
    • Catalyst registry moved to full-imports-paths only.
    • Logger API changed to receive IRunner for all log_* methods.
    • Metric API: topk_args renamed to topk.
    • Contrib API: init imports from catalyst.contrib - removed, use from catalyst.contrib.{smth} import {smth}. Could be changed to full-imports-only in future versions for stability.
    • All quickstarts, minimal examples, notebooks, and pipelines moved to the new version.
    • Codestyle moved to 89 right margin. Honestly speaking, it's much easier to maintain Catalyst with 89 right margin on MBP'16.

    Removed

    • ITrial removed.
    • Stages support removed. While we embrace stages in deep learning experiments, current hardware accelerators are not prepared well for such setups. Additionally, ~95% of dl pipelines are single-stage. Multi-stage runner support is under review. For multi-stage support, please define a CustomRunner with rewritten API.
    • Config/Hydra API support removed. Config API is under review. For now, you could write your own Config API with hydra-slayer if needed.
    • catalyst-dl scripts removed. Without Config API we don't need them anymore.
    • Nvidia Apex, Fairscale, Albumentations, Nifti, Hydra requirements removed.
    • OnnxCallback, PruningCallback, QuantizationCallback, TracingCallback removed from callbacks API. These callbacks are under review now.

    If you have any questions on the Catalyst 22 edition updates, please join Catalyst slack for discussion.

    Source code(tar.gz)
    Source code(zip)
  • v22.02rc0(Feb 7, 2022)

    [22.02rc0] - 2022-02-07

    Tl;dr

    Beta version of Catalyst 22 edition.

    • core architecture moved to Animus-like (stages were removed)
    • engines moved to Accelerate
    • config/hydra APIs deprecated in favor of hydra-slayer-custom config runners
    • dl-based scripts removed from the API
    • self-supervised runner moved to examples - it's better to have custom still
    • contrib and utils - truncated
    • requirements - simplified
    • codestyle moved to -l 89 (better view on 16'' screen ;) )
    Source code(tar.gz)
    Source code(zip)
  • v21.12(Dec 28, 2021)

    [21.12] - 2021-12-28

    Tl;dr

    Distributed engines update (multi-node support) and many other improvements.

    Added

    • MNIST dataset for SSL benchmark (#1368)
    • MovieLens 20M dataset (#1336)
    • logger property for logging customization (#1372)
    • MacridVAE example (#1363)
    • SSL benchmark results (#1374)
    • Neptune example (#1377)
    • multi-node support for engines (#1364)

    Changed

    • RL examples update to last version (#1370)
    • DDPLoaderWrapper updated to new version (#1385)
    • num_classes for classification metrics became optional (#1379)
    • colab ci/cd update to new version

    Removed

    Fixed

    • requests requirements for catalyst[cv] added (#1371)
    • loader step counter (#1374)
    • detection example data preprocessing (#1369)
    • gradient clipping with fp16 runs (#1378)
    • config API fix for DDP runs (#1383)
    • checkpoint creation for fp16 engines (#1382)

    Contributors ❤️

    @bagxi @ditwoo @MrNightSky @Nimrais @y-ksenia @sergunya17 @Thiefwerty @zkid18

    Source code(tar.gz)
    Source code(zip)
  • v21.11(Nov 30, 2021)

    [21.11] - 2021-11-30

    Tl;dr

    Framework architecture simplification and speedup + SSL & RecSys extensions.

    Added

    • MultiVAE RecSys example (#1340)
    • Returned resume support - resolved #1193 (#1349)
    • Smoothing dice loss to contrib (#1344)
    • profile flag for runner.train (#1348)
    • MultiDAE RecSys example (#1356)
    • SETTINGS.log_batch_metrics, SETTINGS.log_epoch_metrics, SETTINGS.compute_per_class_metrics for framework-wise Metric & Logger APIs specification (#1357)
    • log_batch_metrics and log_epoch_metrics options for all available Loggers (#1357)
    • compute_per_class_metrics option for all available multiclass/label metrics (#1357)
    • pytorch benchmark script and simplified MNIST (#1360)

    Changed

    • A few framework simplifications were made (#1346):
      • catalyst-contrib scripts reduced to collect-env and project-embeddings only
      • catalyst-dl scripts reduced to run and tune only
      • transforms. prefix deprecated for Catalyst-based transforms
      • catalyst.tools moved to catalyst.extras
      • task-dependent extensions from catalyst.data moved to catalyst.contrib.data
      • catalyst.data.transforms moved to catalyst.contrib.data.transforms
      • Normalize, ToTensor transforms renamed to NormalizeImage, ImageToTensor
      • metric learning extensions moved to catalyst.contrib.data
      • catalyst.contrib moved to code-as-a-documentation development
      • catalyst[cv] and catalyst[ml] extensions moved to flatten architecture design; examples: catalyst.contrib.data.dataset_cv, catalyst.contrib.data.dataset_ml
      • catalyst.contrib moved to flatten architecture design; examples: catalyst.contrib.data, catalyst.contrib.datasets, catalyst.contrib.layers, catalyst.contrib.models, catalyst.contrib.optimizers, catalyst.contrib.schedulers
      • internal functionality moved to ***._misc modules
      • catalyst.utils.mixup moved to catalyst.utils.torch
      • catalyst.utils.numpy moved to catalyst.contrib.utils.numpy
    • default logging logic moved from "batch & epoch" to "epoch"-only to save computation time during logging; to respecify, please use:
      • SETTINGS.log_batch_metrics=True/False or os.environ["CATALYST_LOG_BATCH_METRICS"]
      • SETTINGS.log_epoch_metrics=True/False or os.environ["CATALYST_LOG_EPOCH_METRICS"]
    • default metrics computation moved from "per-class & aggregations" to "aggregations"-only to save computation time during logging; to respecify, please use:
      • SETTINGS.compute_per_class_metrics=True/False or os.environ["CATALYST_COMPUTE_PER_CLASS_METRICS"]
    • no transformations required for MNIST contrib dataset (#1360)
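
    The environment-variable route listed above can be sketched as follows. This is a minimal illustration, not Catalyst's own code: the helper name export_catalyst_flags is hypothetical, and the "1"/"0" encoding and the need to set the variables before catalyst is first imported are assumptions to verify against your installed version.

```python
import os

# Variable names come from the release notes above; the values are this sketch's choice.
CATALYST_FLAGS = {
    "CATALYST_LOG_BATCH_METRICS": True,
    "CATALYST_LOG_EPOCH_METRICS": True,
    "CATALYST_COMPUTE_PER_CLASS_METRICS": False,
}


def export_catalyst_flags(flags, environ=os.environ):
    """Write boolean flags into the environment (hypothetical helper).

    Assumption: "1" means enabled and "0" means disabled.
    """
    for name, enabled in flags.items():
        environ[name] = "1" if enabled else "0"
    # Return what was written so the caller can log or inspect it.
    return {name: environ[name] for name in flags}


# Assumption: call this before the first `import catalyst`.
exported = export_catalyst_flags(CATALYST_FLAGS)
```

    Setting SETTINGS.log_batch_metrics and the related attributes directly, as the notes describe, is the in-process alternative to this approach.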

    Removed

    • A few framework simplifications were made (#1346):
      • catalyst.contrib.pandas
      • catalyst.contrib.parallel
      • catalyst.contrib.models.cv
      • a few catalyst.utils.misc functions
      • catalyst.extras removed from the public documentation

    Fixed

    • documentation search error (21.10 only) (#1346)
    • docs examples (#1362)
    • Self-Supervised benchmark: (#1365), (#1361)

    Contributors ❤️

    @asteyo @Dokholyan @Nimrais @y-ksenia @sergunya17

  • v21.10(Oct 30, 2021)

    [21.10] - 2021-10-30

    Tl;dr

    Readmes and tutorials with a few ddp fixes.

    Added

    • RSquareLoss (#1313)
    • Self-Supervised example updates: (#1305), (#1322), (#1325), (#1335)
    • Albert training example (#1326)
    • YOLO-X (new) detection example and refactoring (#1324)
    • TopKMetric abstraction (#1330)

    Changed

    • simplified readme (#1312)
    • improved DDP tutorial (#1327)
    • CMCMetric renamed from <prefix>cmc<suffix><k> to <prefix>cmc<k><suffix> (#1330)
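
    If downstream code looks up CMC values by key, the rename above changes where the suffix sits. A minimal migration sketch follows, assuming <k> is rendered as a run of digits; migrate_cmc_key is a hypothetical helper and the example prefix and suffix are made up for illustration.

```python
import re


def migrate_cmc_key(key, prefix="", suffix=""):
    """Rewrite <prefix>cmc<suffix><k> as <prefix>cmc<k><suffix> (hypothetical helper).

    Keys that do not match the old pattern are returned unchanged.
    """
    old_pattern = re.compile(
        rf"^{re.escape(prefix)}cmc{re.escape(suffix)}(\d+)$"
    )
    match = old_pattern.match(key)
    if match is None:
        return key
    return f"{prefix}cmc{match.group(1)}{suffix}"


# Illustrative metrics dict; the key names are assumptions, not Catalyst output.
old_metrics = {"valid_cmc_score05": 0.91, "loss": 0.12}
new_metrics = {
    migrate_cmc_key(name, prefix="valid_", suffix="_score"): value
    for name, value in old_metrics.items()
}
```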

    Removed

    Fixed

    • Zero seed error (#1329)
    • updated codestyle issues (#1331)
    • TopK metrics: (#1330), (#1334), (#1339)
    • --expdir param for catalyst-dl run (#1338)
    • ControlFlowCallback for distributed setup (#1341)
  • v21.09(Sep 30, 2021)

    [21.09] - 2021-09-30

    Added

    • CometLogger support (#1283)
    • CometLogger examples (#1287)
    • XLA docs (#1288)
    • Contrastive loss functions: NTXentLoss (#1278), SupervisedContrastiveLoss (#1293)
    • Self supervised learning: ISelfSupervisedRunner, SelfSupervisedConfigRunner, SelfSupervisedRunner, SelfSupervisedDatasetWrapper (#1278)
    • SimCLR example (#1278)
    • Supervised Contrastive example (#1293)
    • extra warnings for runner-callbacks interaction (#1295)
    • CategoricalRegressionLoss and QuantileRegressionLoss to the contrib (#1295)
    • R2 score metric (#1274)

    Changed

    • Improved WandbLogger to support artifacts and fix logging steps (#1309)
    • full Runner cleanup, with callbacks and loaders destruction, moved to PipelineParallelFairScaleEngine only (#1295)
    • HuberLoss renamed to HuberLossV0 for the PyTorch compatibility (#1295)
    • codestyle update (#1298)
    • BalanceBatchSampler - deprecated (#1303)

    Removed

    Fixed

    Contributors ❤️

    @asteyo @AyushExel @bagxi @DN6 @gr33n-made @Nimrais @Podidiving @y-ksenia

  • v21.09rc1(Sep 27, 2021)

  • v21.09rc0(Sep 27, 2021)

  • v21.08(Aug 31, 2021)

    [21.08] - 2021-08-31

    Added

    • RecSys loss functions: AdaptiveHingeLoss, BPRLoss, HingeLoss, LogisticLoss, RocStarLoss, WARPLoss (#1269, #1282)
    • object detection examples (#1271)
    • SklearnModelCallback (#1261)
    • Barlow Twins example (#1261)
    • TPU/XLA support (#1275)
    • native sync_bn support for all available engines (#1275)
      • Torch, AMP, Apex, FairScale

    Changed

    • Registry moved to hydra-slayer (#1264)
    • (#1275)
      • batch metrics sync removed from ddp-runs to speedup training process
      • AccumulationMetric renamed to AccumulativeMetric
        • moved from catalyst.metrics._metric to catalyst.metrics._accumulative
        • accumulative_fields renamed to keys

    Removed

    Fixed

    • PeriodicLoaderCallback docstring (#1279)
    • matplotlib issue (#1272)
    • sample counter for the loader (#1285)

    Contributors ❤️

    @bagxi @Casyfill @ditwoo @Nimrais @penguinflys @sergunya17 @zkid18

  • v21.07(Jul 29, 2021)

    [21.07] - 2021-07-29

    Added

    • added pre-commit hook to run codestyle checker on commit (#1257)
    • on publish github action for docker and docs added (#1260)
    • MixupCallback and utils.mixup_batch (#1241)
    • Barlow twins loss (#1259)
    • BatchBalanceClassSampler (#1262)

    Changed

    Removed

    Fixed

    • make expdir in catalyst-dl run optional (#1249)
    • Bump neptune-client from 0.9.5 to 0.9.8 in requirements-neptune.txt (#1251)
    • automatic merge for master (with Mergify) fixed (#1250)
    • Evaluate loader custom model bug was fixed (#1254)
    • BatchPrefetchLoaderWrapper issue with batch-based PyTorch samplers (#1262)
    • Adapted MlflowLogger for new config hierarchy (#1263)

    Contributors ❤️

    @AlekseySh @bagxi @Casyfill @Dokholyan @leoromanovich @Nimrais @y-ksenia

  • v21.06(Jun 29, 2021)

    [21.06] - 2021-06-29

    Added

    • (#1230)
      • FairScale support
      • DeepSpeed support
      • utils.ddp_sync_run function for synchronous ddp run
      • CIFAR10 and CIFAR100 datasets from torchvision (no cv-based requirements)
      • Catalyst Engines demo
    • dataset_from_params support in config API (#1231)
    • transform from params support for config API added (#1236)
    • samplers from params support for config API added (#1240)
    • recursive registry.get_from_params added (#1241)
    • albumentations integration (#1238)
    • Profiler callback (#1226)

    Changed

    • (#1230)
      • loaders creation is now wrapped with utils.ddp_sync_run for synchronous data preparation
      • runner support stage cleanup: loaders and callbacks will be deleted on the stage end
      • Apex-based engines now support both APEXEngine and ApexEngine registry names

    Fixed

    • multiprocessing in minimal tests hotfix (#1232)
    • Tracing callback hotfix (#1234)
    • Engine hotfix for predict_loader (#1235)
    • (#1230)
      • Hydra hotfix due to 1.1.0 version changes
    • HuberLoss name conflict for pytorch 1.9 hotfix (#1239)

    Contributors ❤️

    @bagxi @y-ksenia @ditwoo @BorNick @Inkln

  • v21.05(May 31, 2021)

    [21.05] - 2021-05-31

    Added

    • Reinforcement learning tutorials (#1205)
    • customization demo (#1207)
    • FAQ docs: multiple input and output keys, engine tutorial (#1202)
    • minimal Config API example (#1215)
    • Distributed RL example (Catalyst.RL 2.0 concepts) (#1224)
    • SklearnCallback as integration of sklearn metrics (#1198)

    Changed

    • tests moved to tests folder (#1208)
    • pipeline tests moved to tests/pipelines (#1215)
    • updated NeptuneLogger docstrings (#1223)

    Removed

    Fixed

    • customizing what happens in train() notebook (#1203)
    • transforms imports under catalyst.data (#1211)
    • change layerwise to layerwise_params (#1210)
    • add torch metrics support (#1195)
    • add Config API support for BatchTransformCallback (#1209)

    BONUS: Catalyst workshop videos!

  • v21.04.2(Apr 30, 2021)

    [21.04.2] - 2021-04-30

    Added

    • Weights and Biases Logger (WandbLogger) (#1176)
    • Neptune Logger (NeptuneLogger) (#1196)
    • log_artifact method for logging arbitrary files like audio, video, or model weights to ILogger and IRunner (#1196)
  • v21.04.1(Apr 19, 2021)

  • v21.04(Apr 17, 2021)

    [21.04] - 2021-04-17

    Added

    • Nifti Reader (NiftiReader) (#1151)
    • CMC score and callback for ReID task (ReidCMCMetric and ReidCMCScoreCallback) (#1170)
    • Market1501 metric learning datasets (Market1501MLDataset and Market1501QGDataset) (#1170)
    • extra kwargs support for Engines (#1156)
    • engines exception for unknown model type (#1174)
    • a few docs to the supported loggers (#1174)

    Changed

    • TensorboardLogger switched from global_batch_step counter to global_sample_step one (#1174)
    • TensorboardLogger logs loader metric on_loader_end rather than on_epoch_end (#1174)
    • prefix renamed to metric_key for MetricAggregationCallback (#1174)
    • micro, macro and weighted aggregations renamed to _micro, _macro and _weighted (#1174)
    • BatchTransformCallback updated (#1153)

    Removed

    • auto torch.sigmoid usage for metrics.AUCMetric and metrics.auc (#1174)

    Fixed

    • hitrate calculation issue (#1155)
    • ILoader wrapper usage issue with Runner (#1174)
    • counters for ddp case (#1174)
  • v21.03.2(Mar 29, 2021)

  • v21.03.1(Mar 28, 2021)

    [21.03.1] - 2021-03-28

    Added

    • Additive Margin SoftMax (AMSoftmax) (#1125)
    • Generalized Mean Pooling (GeM) (#1084)
    • Key-value support for CriterionCallback (#1130)
    • Engine configuration through cmd (#1134)
    • Extra utils for thresholds (#1134)
    • Added gradient clipping function to optimizer callback (#1124)
    • FactorizedLinear to contrib (#1142)
    • Extra init params for ConsoleLogger (#1142)
    • Tracing, Quantization, Onnx, Pruning Callbacks (#1127)
    • _key_value for schedulers in case of multiple optimizers fixed (#1146)

    Changed

    • CriterionCallback now inherits from BatchMetricCallback (#1130)
      • united metrics computation logic

    Removed

    • Config API deprecated parsing logic (#1142), (#1138)

    Fixed

    • Data-Model device sync and Engine logic during runner.predict_loader (#1134)
    • BatchLimitLoaderWrapper logic for loaders with shuffle flag (#1136)
    • config description in the examples (#1142)
    • Config API deprecated parsing logic (#1142), (#1138)
    • RecSys metrics Top_k calculations (#1140)
  • v21.03(Mar 13, 2021)

    The v20 is dead, long live the v21!

    [21.03] - 2021-03-13 (#1095)

    Added

    • Engine abstraction to support various hardware backends and accelerators: CPU, GPU, multi GPU, distributed GPU, TPU, Apex, and AMP half-precision training.
    • Logger abstraction to support various monitoring tools: console, tensorboard, MLflow, etc.
    • Trial abstraction to support various hyperoptimization tools: Optuna, Ray, etc.
    • Metric abstraction to support various machine learning metrics: classification, segmentation, RecSys and NLP.
    • Full support for Hydra API.
    • Full DDP support for Python API.
    • MLflow support for metrics logging.
    • United API for model post-processing: tracing, quantization, pruning, onnx-exporting.
    • United API for metrics: classification, segmentation, RecSys, and NLP with full DDP and micro/macro/weighted/etc aggregations support.

    Changed

    • Experiment abstraction merged into Runner one.
    • Runner, SupervisedRunner, ConfigRunner, HydraRunner architectures and dependencies redesigned.
    • Internal settings and registry mechanisms refactored to be simpler, user-friendly and more extendable.
    • A bunch of Config API tests replaced with Python API and pytest ones.
    • Codestyle now supports up to 99 symbols per line :)
    • All callbacks/runners moved from contrib to the library core where possible.
    • Runner abstraction simplified to store only the current state of the experiment run: all validation logic was moved to the callbacks (this way, you can easily select the best model on various metrics simultaneously).
    • Runner.input and Runner.output merged into united Runner.batch storage for simplicity.
    • All metrics moved from catalyst.utils.metrics to catalyst.metrics.
    • All metrics now work on scores/metric-defined-input rather than logits (!).
    • Logging logic moved from Callbacks to appropriate Loggers.
    • KorniaCallbacks refactored to BatchTransformCallback.
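
    The switch from logits to scores matters for any metric that applies a threshold. The sketch below is a plain-Python illustration of that point, not Catalyst's implementation: multilabel_accuracy is a hypothetical helper, and with tensors the conversion would normally be torch.sigmoid or torch.softmax.

```python
import math


def sigmoid(x):
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_accuracy(scores, targets, threshold=0.5):
    """Share of labels where (score > threshold) agrees with the target.

    Hypothetical helper: it expects scores in [0, 1], not raw logits.
    """
    hits = 0
    total = 0
    for row_scores, row_targets in zip(scores, targets):
        for score, target in zip(row_scores, row_targets):
            hits += int((score > threshold) == bool(target))
            total += 1
    return hits / total


logits = [[0.2, -1.5], [3.0, 0.1]]
targets = [[1, 0], [1, 1]]

# Convert logits to scores explicitly before calling the metric.
scores = [[sigmoid(value) for value in row] for row in logits]

acc_on_scores = multilabel_accuracy(scores, targets)  # intended usage
acc_on_logits = multilabel_accuracy(logits, targets)  # silently wrong input
```

    With these made-up values the two calls disagree (1.0 on scores, 0.5 on raw logits), because small positive logits such as 0.2 fall below the 0.5 threshold until they are passed through the sigmoid.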

    Removed

    • Lots of unnecessary contrib extensions.
    • Transforms configuration support through Config API (could be returned in next releases).
    • Integrated Python cmd command for model pruning, swa, etc (should be returned in next releases).
    • CallbackOrder.Validation and CallbackOrder.Logging
    • All 2020 year backward compatibility fixes and legacy support.

    Fixed

    • Docs rendering simplified.
    • LrFinderCallback.

    Release docs, Python API minimal examples, Config/Hydra API example.

  • v21.01rc0(Jan 30, 2021)

  • v20.12(Dec 20, 2020)

    [20.12] - 2020-12-20

    Added

    • CSV Logger (#1005)
    • DrawMasksCallback (#999)
    • (#1002)
      • a few docs
    • (#998)
      • reciprocal_rank metric
      • unified recsys metrics preprocessing
    • (#1018)
      • readme examples for all supported metrics under catalyst.metrics
      • wrap_metric_fn_with_activation for model outputs wrapping with activation
      • extra tests for metrics
    • (#1039)
      • per_class=False option for metrics callbacks
      • PrecisionCallback, RecallCallback for multiclass problems
      • extra docs

    Changed

    • docs update (#1000)
    • AMPOptimizerCallback and OptimizerCallback were merged (#1007)
    • (#1017)
      • fixed bug in SchedulerCallback
      • Log LRs and momentums for all param groups, not only for the first one
    • (#1002)
      • tensorboard, ipython, matplotlib, pandas, scikit-learn moved to optional requirements
      • PerplexityMetricCallback moved to catalyst.callbacks from catalyst.contrib.callbacks
      • PerplexityMetricCallback renamed to PerplexityCallback
      • catalyst.contrib.utils.confusion_matrix renamed to catalyst.contrib.utils.torch_extra
      • many parts of catalyst.data moved to catalyst.contrib.data
      • catalyst.data.scripts moved to catalyst.contrib.scripts
      • catalyst.utils, catalyst.data.utils and catalyst.contrib.utils restructured
      • ReaderSpec renamed to IReader
      • SupervisedExperiment renamed to AutoCallbackExperiment
    • gain functions renamed for dcg/ndcg metrics (#998)
    • (#1014)
      • requirements respecification: catalyst[cv], catalyst[dev], catalyst[log], catalyst[ml], catalyst[nlp], catalyst[tune]
      • settings respecification
      • extra tests for settings
      • contrib refactoring
    • iou and dice metrics moved to per-class computation (#1031)

    Removed

    • (#1002)
      • KNNMetricCallback
      • sklearn mode for ConfusionMatrixLogger
      • catalyst.data.utils
      • unnecessary catalyst.tools.meters
      • todos for unnecessary docs
    • (#1014)
      • transformers-based contrib (too unstable)
    • (#1018)
      • ClasswiseIouCallback/ClasswiseJaccardCallback as deprecated (should be refactored in future releases)

    Fixed

    • prevented modifying config during the experiment and runner initialization (#1004)
    • a few tests for RecSys MAP computation (#1018)
    • leave batch size the same for default distributed training (#1023)
    • (#1032)
      • Apex: now you can use apex for multiple models training
      • Apex: DataParallel is allowed for opt_level other than "O1"
  • v20.11(Dec 20, 2020)

    [20.11] - 2020-11-12

    Added

    • DCG, nDCG metrics (#881)
    • MAP calculations (#968)
    • hitrate calculations (#975)
    • extra functions for classification metrics (#966)
    • OneOf and OneOfV2 batch transforms (#951)
    • precision_recall_fbeta_support metric (#971)
    • Pruning tutorial (#987)
    • BatchPrefetchLoaderWrapper (#986)
    • DynamicBalanceClassSampler (#954)

    Changed

    • update Catalyst version to 20.10.1 for tutorials (#967)
    • added link to dl-course (#967)
    • IRunner -> simplified IRunner (#984)
    • docs were restructured (#985)
    • set_global_seed moved from utils.seed to utils.misc (#986)

    Removed

    • several deprecated tutorials (#967)
    • several deprecated func from utils.misc (#986)

    Fixed

    • BatchTransformCallback - add nn.Module transforms support (#951)
    • moved to contiguous view for accuracy computation (#982)
    • fixed torch warning on optimizer.py:140 (#979)
  • v20.10.1(Dec 20, 2020)

    [20.10.1] - 2020-10-15

    Added

    • MRR metrics calculation (#886)
    • docs for MetricCallbacks (#947)
    • SoftMax, CosFace, ArcFace layers to contrib (#939)
    • ArcMargin layer to contrib (#957)
    • AdaCos to contrib (#958)
    • Manual SWA to utils (#945)

    Changed

    • fixed path to CHANGELOG.md file and added information about unit tests to PULL_REQUEST_TEMPLATE.md (#955)
    • catalyst-dl tune config specification - now optuna params are grouped under study_params (#947)
    • IRunner._prepare_for_stage logic moved to IStageBasedRunner.prepare_for_stage (#947)
      • now we create components in the following order: datasets/loaders, model, criterion, optimizer, scheduler, callbacks
    • MnistMLDataset and MnistQGDataset data split logic - now targets of the datasets are disjoint (#949)
    • architecture redesign (#953)
      • experiments, runners, callbacks grouped by primitives under catalyst.experiments/catalyst.runners/catalyst.callbacks respectively
      • settings and typing moved from catalyst.tools.* to catalyst.*
      • utils moved from catalyst.*.utils to catalyst.utils
    • swa moved to catalyst.utils (#963)

    Removed

    Fixed

    • AMPOptimizerCallback - fix grad clip fn support (#948)
    • removed deprecated docs types (#947) (#952)
    • docs for a few files (#952)
    • extra backward compatibility fixes (#963)
  • v20.09.1(Dec 20, 2020)

    [20.09.1] - 2020-09-25

    Added

    • Runner registry support for Config API (#936)
    • catalyst-dl tune command - Optuna with Config API integration for AutoML hyperparameters optimization (#937)
    • OptunaPruningCallback alias for OptunaCallback (#937)
    • AdamP and SGDP to catalyst.contrib.nn.criterion (#942)

    Changed

    • Config API components preparation logic moved to utils.prepare_config_api_components (#936)

    Removed

    Fixed

    • Logging double logging :) (#936)
    • CMCCallback (#941)
  • v20.09(Dec 20, 2020)

    [20.09] - 2020-09-07

    Added

    • MovieLens dataset loader (#903)
    • force and bert-level keywords to catalyst-data text2embedding (#917)
    • OptunaCallback to catalyst.contrib (#915)
    • DynamicQuantizationCallback and catalyst-dl quantize script for fast quantization of your model (#890)
    • Multi-scheduler support for multi-optimizer case (#923)
    • Native mixed-precision training support (#740)
    • OptimizerCallback - flag use_fast_zero_grad for faster (and hacky) version of optimizer.zero_grad() (#927)
    • IOptimizerCallback, ISchedulerCallback, ICheckpointCallback, ILoggerCallback as core abstractions for Callbacks (#933)
    • flag USE_AMP for PyTorch AMP usage (#933)

    Changed

    • Pruning moved to catalyst.dl (#933)
    • default USE_APEX changed to 0 (#933)

    Removed

    Fixed

    • autoresume option for Config API (#907)
    • a few issues with TF projector (#917)
    • batch sampler speed issue (#921)
    • add apex key-value optimizer support (#924)
    • runtime warning for PyTorch 1.6 (#920)
    • Apex synbn usage (#920)
    • Catalyst dependency on system git (#922)
  • v20.08(Dec 20, 2020)

    [20.08] - 2020-08-09

    Added

    • CMCScoreCallback (#880)
    • kornia augmentations BatchTransformCallback (#862)
    • average_precision and mean_average_precision metrics (#883)
    • MultiLabelAccuracyCallback, AveragePrecisionCallback and MeanAveragePrecisionCallback callbacks (#883)
    • minimal examples for multiclass and multilabel classification (#883)
    • experimental TPU support (#893)
    • add Imagenette, Imagewoof, and Imagewang datasets (#902)
    • IMetricCallback, IBatchMetricCallback, ILoaderMetricCallback, BatchMetricCallback, LoaderMetricCallback abstractions (#897)
    • HardClusterSampler inbatch sampler (#888)

    Changed

    • all registries merged to one catalyst.registry (#883)
    • mean_average_precision logic merged with average_precision (#897)
    • all imports moved to absolute (#905)
    • catalyst.contrib.data merged to catalyst.data (#905)
    • {breaking} Catalyst transform ToTensor was renamed to ImageToTensor (#905)
    • TracerCallback moved to catalyst.dl (#905)
    • ControlFlowCallback, PeriodicLoaderCallback moved to catalyst.core (#905)

    Removed

    • average_accuracy and mean_average_accuracy metrics (#883)
    • MultiMetricCallback abstraction (#897)

    Fixed

    • utils.tokenize_text typo with punctuation (#880)
    • ControlFlowCallback logic (#892)
    • docs (#897)
  • v20.07(Dec 20, 2020)

    [20.07] - 2020-07-06

    Added

    • log parameter to WandbLogger (#836)
    • hparams experiment property (#839)
    • add docs build on push to master branch (#844)
    • WrapperCallback and ControlFlowCallback (#842)
    • BatchOverfitCallback (#869)
    • overfit flag for Config API (#869)
    • InBatchSamplers: AllTripletsSampler and HardTripletsSampler (#825)

    Changed

    • Renaming (#837)
      • SqueezeAndExcitation -> cSE
      • ChannelSqueezeAndSpatialExcitation -> sSE
      • ConcurrentSpatialAndChannelSqueezeAndChannelExcitation -> scSE
      • _MetricCallback -> IMetricCallback
      • dl.Experiment.process_loaders -> dl.Experiment._get_loaders
    • LRUpdater became an abstract class (#837)
    • calculate_confusion_matrix_from_arrays changed params order (#837)
    • dl.Runner.predict_loader uses _prepare_inner_state and cleans experiment (#863)
    • toml to the dependencies (#872)

    Removed

    • crc32c dependency (#872)

    Fixed

    • workflows/deploy_push.yml failed to push some refs (#864)
    • .dependabot/config.yml contained invalid details (#781)
    • LanguageModelingDataset (#841)
    • global_* counters in Runner (#858)
    • EarlyStoppingCallback considers first epoch as bad (#854)
    • annoying numpy warning (#860)
    • PeriodicLoaderCallback overwrites best state (#867)
    • OneCycleLRWithWarmup (#851)
  • v20.06(Jun 4, 2020)

    [20.06] - 2020-06-04

    Added

    • Mergify (#831)
    • PerplexityMetricCallback (#819)
    • PeriodicLoaderRunnerCallback (#818)

    Changed

    • docs structure was updated (#822)
    • utils.process_components moved from utils.distributed to utils.components (#822)
    • catalyst.core.state.State merged to catalyst.core.runner._Runner (#823) (backward compatibility included)
      • catalyst.core.callback.Callback now works directly with catalyst.core.runner._Runner
      • state_kwargs renamed to stage_kwargs

    Removed

    Fixed

    • added missing dashes in docker prefixes (#828)

    [20.05.1] - 2020-05-23

    Added

    • Circle loss implementation (#802)
    • BatchBalanceSampler for metric learning and classification (#806)
    • CheckpointCallback: new argument load_on_stage_start which accepts str and Dict[str, str] (#797)
    • LanguageModelingDataset to catalyst[nlp] (#808)
    • Extra counters for batches, loaders and epochs (#809)
    • TracerCallback (#789)

    Changed

    • CheckpointCallback: additional logic for argument load_on_stage_end - accepts str and Dict[str, str] (#797)
    • counters names for batches, loaders and epochs (#809)
    • utils.trace_model: changed logic - runner argument was changed to predict_fn (#789)
    • redesigned contrib.data and contrib.datasets (#820)
    • catalyst.utils.meters moved to catalyst.tools (#820)
    • catalyst.contrib.utils.tools.tensorboard moved to catalyst.contrib.tools (#820)

    Removed

    Fixed

  • v20.06.rc1(Jun 1, 2020)
