OSLO: Open Source framework for Large-scale transformer Optimization

Overview



What's New:
December 21, 2021: Released OSLO 1.0.

What is OSLO about?

OSLO is a framework that provides various GPU-based optimization features for large-scale modeling. As of 2021, Hugging Face Transformers is considered the de facto standard library for transformer models, but it is not yet well suited to large-scale modeling. This is where OSLO comes in: OSLO is designed to make it easier to train large models with Transformers. For example, you can fine-tune GPTJ from the Hugging Face Model Hub with little extra effort using OSLO. Currently GPT2, GPTNeo, and GPTJ are supported, but we plan to support more models soon.

Installation

OSLO can be easily installed using the pip package manager. All dependencies, such as torch, transformers, dacite, ninja, and pybind11, are installed automatically with the following command. Note the 'core' in the PyPI project name.

pip install oslo-core

Some features rely on C++ extensions, so we provide an option, CPP_AVAILABLE, to decide whether or not to install them.

  • If C++ is available:
CPP_AVAILABLE=1 pip install oslo-core
  • If C++ is not available:
CPP_AVAILABLE=0 pip install oslo-core

Note that the default value of CPP_AVAILABLE is 0 on Windows and 1 on Linux.

Key Features

import deepspeed 
from oslo import GPTJForCausalLM

# 1. 3D Parallelism
model = GPTJForCausalLM.from_pretrained_with_parallel(
    "EleutherAI/gpt-j-6B", tensor_parallel_size=2, pipeline_parallel_size=2,
)

# 2. Kernel Fusion
model = model.fuse()

# 3. DeepSpeed Support
engines = deepspeed.initialize(
    model=model.gpu_modules(), model_parameters=model.gpu_parameters(), ...,
)

# 4. Data Processing
from oslo import (
    DatasetPreprocessor, 
    DatasetBlender, 
    DatasetForCausalLM, 
    ...    
)

OSLO offers the following features.

  • 3D Parallelism: The state-of-the-art technique for training a large-scale model on multiple GPUs by combining data, tensor, and pipeline parallelism (see the sketch after this list).
  • Kernel Fusion: A GPU optimization method to increase training and inference speed.
  • DeepSpeed Support: We support DeepSpeed, which provides ZeRO data parallelism.
  • Data Processing: Various utilities for efficient large-scale data processing.
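
As a quick illustration of how the three axes of 3D parallelism compose, here is a minimal sketch. The variable names are ours for illustration, not OSLO's API; it only shows the standard convention that the parallel degrees must multiply to the total GPU count.

# Minimal sketch (illustrative names, not OSLO's API): how the three
# parallelism degrees compose in 3D parallelism.
tensor_parallel_size = 2    # each layer's weights are split across 2 GPUs
pipeline_parallel_size = 2  # the layer stack is split into 2 sequential stages
world_size = 8              # total GPUs in the job

# The remaining factor is the data-parallel degree: identical model
# replicas fed different mini-batches.
assert world_size % (tensor_parallel_size * pipeline_parallel_size) == 0
data_parallel_size = world_size // (tensor_parallel_size * pipeline_parallel_size)
print(data_parallel_size)  # -> 2 replicas, each spanning 4 GPUs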

See USAGE.md to learn how to use them.

Administrative Notes

Citing OSLO

If you find our work useful, please consider citing:

@misc{oslo,
  author       = {Ko, Hyunwoong and Kim, Soohwan and Park, Kyubyong},
  title        = {OSLO: Open Source framework for Large-scale transformer Optimization},
  howpublished = {\url{https://github.com/tunib-ai/oslo}},
  year         = {2021},
}

Licensing

The code of the OSLO project is licensed under the terms of the Apache License 2.0.

Copyright 2021 TUNiB Inc. http://www.tunib.ai All Rights Reserved.

Acknowledgements

The OSLO project is built with GPU support from the AICA (Artificial Intelligence Industry Cluster Agency).

Comments
  • [WIP] Implement ZeRO Stage 3 (FSDP)


    Title

    • Implement ZeRO Stage 3 (FullyShardedDataParallel)

    Description

    • [x] Add reduce_scatter_bucketer.py
      • [x] Add test_reduce_scatter_bucketer.py
    • [x] Add flatten_params_wrapper.py
      • [x] Add test_flatten_params_wrapper.py
    • [x] Add containers.py
      • [x] Add test_containers.py
    • [x] Add parallel.py
      • [x] Add test_parallel.py
    • [x] Add fsdp_optim_utils.py
    • [x] Update fsdp.py
    • [x] Add auto_wrap.py
      • [x] Add test_wrap.py
    opened by jinok2im 9
  • FusedAdam & CPUAdam


    Title

    • FusedAdam & CPUAdam

    Description

    • Implement FusedAdam & CPUAdam

    Tasks

    • [x] Implement FusedAdam
    • [x] Implement CPUAdam
    • [x] Test FusedAdam
    • [x] Test CPUAdam
    • [x] Test FusedScaleMaskSoftmax (name changed)
    opened by cozytk 6
  • [WIP] Add data processing modules referring to the lassl


    Title

    • Add data processing modules referring to lassl

    Description

    • Brought in data processing functions suited to GPT2, referring to lassl

    Linked Issues

    • None
    opened by gimmaru 6
  • Implementation of Sequential Parallelism


    SP with DP implementation

    • Implemented SP wrapper with DP

    Description

    • SequenceDataParallel works like native torch DDP with SP
    • You can find details in the file oslo/tests/torch/nn/parallal/data_parallel/test_sp.py
    opened by ohwi 5
  • Update data collators and Add models


    Title

    • Update data collators and Add models

    Description

    • Updated data collators to utilize sequence parallelism in the OSLO trainer
    • Added models by referring to the transformers library
    opened by gimmaru 3
  • Implement Expert Parallel and Test for Initialization and Forward Pass


    Title

    • Implement Expert Parallel and Test for Initialization and Forward Pass

    Description

    • Implement Wrapper, Modules and Features for Expert Parallel
    • Implement mapping_utils._ParallelMappingForHuggingFace as a superclass of _TensorParallelMappingForHuggingFace and _ExpertParallelMappingForHuggingFace
    • Test initialization and forward pass for expert parallel
    opened by scsc0511 3
  • Integrate Sequence Parallelism branches


    Title

    • Sequence parallelism (feat. @reniew, @ohwi, @l-yohai)

    Description

    • This PR is an integration of the current SP version, but something is still wrong.
    • We will fix the bugs in the coming week and write test modules according to the SP design.
    • It does not include the contents of the branch used for testing.
    opened by l-yohai 3
  • Implement tp-3d layers, wrapper, and test codes; refactor all tp test codes and layers


    • Implement the tp-3d wrapper
    • Fix the rank transpose problem (tensor_3d_input_rank <-> tensor_3d_output_rank) by implementing a rank transpose function
    • Revise tp-3d layers for Hugging Face compatibility
    • Implement tp-3d test codes
    • Refactor all tp test codes
    • Unify the format across all tensor parallel modules
    opened by bzantium 2
  • Refactoring MultiheadAttention with todo anchors


    Title

    • Refactoring MultiheadAttention with todo anchors

    Description

    • Refactoring oslo/torch/nn/modules/functional/multi_head_attention_forward.py.
    • Remove unnecessary or unintended code and clean up annotations.
    • Unify the return format and variable names with those of native torch.

    Additionally, I need to test attention_mask. However, it seems this part can proceed after FusedScaleMaskSoftmax is integrated.

    cc. @hyunwoongko @ohwi

    opened by l-yohai 2
  • Add tp-1d layers testing


    • Add tests for tp-1d layers: col_linear, row_linear, vocab_embedding_1d
    • Replace literal numbers with integer variables like summa_dim and world_size. cc: @hyunwoongko
    opened by bzantium 2
  • [WIP] Add test code for SP training


    Title

    • SP Model Test Code

    Description

    Writing test code to verify that the model's gradient and loss values are the same when sequence parallelism is applied.

    • WIP: merging @ohwi's test code comparing ColossalAI's SP with a simple training model.
    opened by l-yohai 2
Releases(v2.0.2)
  • v2.0.2(Aug 25, 2022)

  • v2.0.1(Feb 20, 2022)

  • v2.0.0(Feb 14, 2022)

    Official release of OSLO 2.0.0 🎉🎉

    This version of OSLO provides the following features:

    • Tensor model parallelism
    • Efficient activation checkpointing
    • Kernel fusion

    We plan to add pipeline model parallelism and ZeRO optimization in upcoming versions.


    New feature: Kernel Fusion

    {
      "kernel_fusion": {
        "enable": "bool",
        "memory_efficient_fusion": "bool",
        "custom_cuda_kernels": "list"
      }
    }
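
    As a rough sketch of how this schema plugs into the oslo.initialize entry point introduced in v2.0.0a0 (below); the values are placeholders, not recommendations:

    import oslo

    # Hypothetical usage sketch: pass a kernel-fusion config matching the
    # schema above to oslo.initialize. `model` is assumed to be a Hugging
    # Face transformer model created beforehand.
    model = oslo.initialize(
        model,
        config={
            "kernel_fusion": {
                "enable": True,
                "memory_efficient_fusion": False,
                "custom_cuda_kernels": [],
            }
        },
    )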
    

    For more information, please check the kernel fusion tutorial.

  • v2.0.0a2(Feb 2, 2022)

  • v2.0.0a1(Feb 2, 2022)

    Add activation checkpointing

    You can enable efficient activation checkpointing in OSLO with the following configuration.

    model = oslo.initialize(
        model,
        config={
            "model_parallelism": {
                "enable": True,
                "tensor_parallel_size": YOUR_TENSOR_PARALLEL_SIZE,
            },
            "activation_checkpointing": {
                "enable": True,
                "cpu_checkpointing": True,
                "partitioned_checkpointing": True,
                "contiguous_checkpointing": True,
            },
        },
    )
    

    Tutorial: https://tunib-ai.github.io/oslo/TUTORIALS/activation_checkpointing.html

  • v2.0.0a0(Jan 30, 2022)

    New API

    • We paid homage to DeepSpeed. OSLO is now simpler to use.
    import oslo
    
    model = oslo.initialize(model, config="oslo-config.json")
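
    For illustration, an oslo-config.json could combine the configuration sections shown across these release notes (model_parallelism, activation_checkpointing, kernel_fusion); this is a sketch with placeholder values, not a recommended configuration:

    {
      "model_parallelism": {
        "enable": true,
        "tensor_parallel_size": 4
      },
      "activation_checkpointing": {
        "enable": true,
        "cpu_checkpointing": false,
        "partitioned_checkpointing": false,
        "contiguous_checkpointing": false
      },
      "kernel_fusion": {
        "enable": true
      }
    }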
    

    Add new models

    • Albert
    • Bert
    • Bart
    • T5
    • GPT2
    • GPTNeo
    • GPTJ
    • Electra
    • Roberta

    Add documentation

    • https://tunib-ai.github.io/oslo

    Remove old pipeline parallelism and kernel fusion code

    • We'll refurbish them using the latest methods:
      • Kernel fusion: AOTAutograd
      • Pipeline parallelism: SageMaker PP
  • v1.1.2(Jan 15, 2022)

    Updates

    • [#7] Selective Kernel Fusion
    • [#9] Fix argument bug

    New Feature: Selective Kernel Fusion

    Since version 1.1.2, you can fuse only a subset of kernels rather than all of them. Currently, only the Attention and MLP classes are supported.

    from oslo import GPT2MLP, GPT2Attention
    
    # MLP only fusion
    model.fuse([GPT2MLP])
    
    # Attention only fusion
    model.fuse([GPT2Attention])
    
    # MLP + Attention fusion
    model.fuse([GPT2MLP, GPT2Attention])
    
  • v1.1(Dec 29, 2021)

    [#3] Add the deployment launcher of Parallelformers to OSLO.

    from oslo import GPTNeoForCausalLM
    
    model = GPTNeoForCausalLM.from_pretrained_with_parallel(
        "EleutherAI/gpt-neo-2.7B",
        tensor_parallel_size=2,
        pipeline_parallel_size=2,
        deployment=True  # <-- new feature !
    )
    

    You can enable the deployment launcher by setting deployment=True. Please refer to USAGE.md for more details.

  • v1.0.1(Dec 22, 2021)

  • v1.0(Dec 21, 2021)

