Model Training as a CI/CD System

This project demonstrates machine learning model training as a CI/CD system on Google Cloud Platform (GCP). The workflows are described in detail in the sections below. In short, the currently deployed machine learning pipeline is rebuilt and redeployed (continuous integration) whenever the codebase changes. Such changes can occur in the training data, the data preprocessing logic, the model architecture and training code, custom pipeline components, and so on.

Workflow #1

1. We create the initial code, or we change the existing codebase for the pipeline.
2. Based on the changes in step 1, a GitHub Action is triggered to start a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If there are no errors, there are two common sub-workflows from this point.
   - Cloud Build containerizes the current codebase. This step is optional: if the custom components are unchanged, it can be skipped.
     - Cloud Build compiles a new pipeline, creates an updated Docker image, and uploads the new image to GCR.
   - If any code has changed in the data preprocessing, modeling, or training steps, only those source files need to be uploaded to the designated GCS bucket.
5. The final step of the Cloud Build process is to execute a pipeline run on Vertex AI.
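The branching in the steps above can be sketched as a small planning function that decides which build steps to run from the list of changed files. This is only an illustration: the directory names (`pipeline/components`, `modules`), the step names, and the function `plan_build_steps` are assumptions made for the example and are not taken from this repository.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout; the real directory names may differ.
COMPONENT_DIRS = ("pipeline/components",)
MODULE_DIRS = ("modules",)


def _is_under(path, directories):
    """Return True if `path` lies inside any of `directories`."""
    candidate = PurePosixPath(path)
    return any(PurePosixPath(d) in candidate.parents for d in directories)


def plan_build_steps(changed_files):
    """Return the ordered build steps for a set of changed files."""
    # Unit tests always run first.
    steps = ["run_unit_tests"]

    # Sub-workflow A: custom components changed, so the pipeline is
    # recompiled and a new Docker image is built and pushed.
    if any(_is_under(f, COMPONENT_DIRS) for f in changed_files):
        steps += ["compile_pipeline", "build_and_push_image"]

    # Sub-workflow B: preprocessing/model/training code changed, so only
    # the source files are uploaded to the GCS bucket.
    if any(_is_under(f, MODULE_DIRS) for f in changed_files):
        steps.append("upload_modules_to_gcs")

    # A pipeline run on Vertex AI is always the final step.
    steps.append("run_pipeline_on_vertex_ai")
    return steps


if __name__ == "__main__":
    print(plan_build_steps(["modules/train.py"]))
```

In a real setup the same decision would more likely live in the GitHub Actions or Cloud Build configuration. The function form is used here only to make the control flow explicit.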

Workflow #2

Workflow in a nutshell

1. We create the initial code, or we change the existing codebase for the modules.
2. Based on the changes in step 1, a GitHub Action is triggered to start a Cloud Build process.
3. Cloud Build runs unit tests to check that the components work without errors.
4. If there are no errors, the build continues as follows.
   - If any code has changed in the data preprocessing or model modules, only those source files need to be uploaded to the designated GCS bucket.
5. The final step of the Cloud Build process is to execute a pipeline run on Vertex AI. The Trainer and Transform TFX components then look up the changed modules accordingly.
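As a rough sketch of the upload step above, the helper below maps changed module files to destination URIs in a GCS bucket. The bucket name, the `modules` prefix, and the function `gcs_destinations` are hypothetical. The actual copy (for example with `gsutil` or a storage client) is deliberately left out.

```python
import posixpath


def gcs_destinations(changed_files, bucket, prefix="modules"):
    """Map each changed Python module file to a gs:// destination URI."""
    return {
        path: "gs://" + posixpath.join(bucket, prefix, posixpath.basename(path))
        for path in changed_files
        if path.endswith(".py")  # only source modules are uploaded
    }


if __name__ == "__main__":
    # "my-bucket" is a placeholder bucket name.
    for src, dst in gcs_destinations(["modules/train.py"], "my-bucket").items():
        print(src, "->", dst)
```

The TFX Trainer and Transform components accept a `module_file` argument, so a URI of this shape is what they would be pointed at. Because the pipeline definition itself does not change, no new image or recompiled pipeline is needed in this workflow.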

Acknowledgements

Thanks to the ML-GDE program for providing GCP credits.

Demonstration of the Model Training as a CI/CD System in Vertex AI

Owner

Chansung Park
