Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines

Federal University of Rio Grande do Norte

Technology Center

Department of Computer Engineering and Automation

Machine Learning Based Systems Design

References

  • 📚 Noah Gift, Alfredo Deza. Practical MLOps: Operationalizing Machine Learning Models. [Link]
  • 📚 Chip Huyen. Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications. [Link]
  • 📚 Hannes Hapke, Catherine Nelson. Building Machine Learning Pipelines. [Link]
  • 📚 Mariano Anaya. Clean Code in Python. [Link]
  • 📚 Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. [Link]
  • 🤜 Dataquest Academic Program [Link]
  • 😃 CS329S - ML Systems Design [Link]
  • 🎯 Machine Learning Operations [Link]

Lessons

Week 01: Course Outline Open in PDF

  • Git and Version Control Open in Dataquest
    • You'll learn how to: a) organize your code using version control, b) resolve conflicts in version control, c) employ Git and GitHub to collaborate with others.
    • 👊 U1T1: guided project + getting a git repository.

Week 02: CLI fundamentals

  • Elements of the Command Line Open in Dataquest
    • You'll learn how to: a) employ the command line for Data Science, b) modify the behavior of commands with options, c) employ glob patterns and wildcards, d) define important command-line concepts, e) navigate the filesystem, f) manage users and permissions.
  • Text Processing in the Command Line Open in Dataquest
    • You'll learn how to: a) read and explore documentation, b) perform basic text processing, c) redirect and pipe output, d) inspect files, e) define different kinds of output, f) employ streams and file descriptors.
  • 🔠 U1T2: working with command line.
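
The glob wildcards mentioned above (`*`, `?`, and bracket sets) can also be tried outside a shell. The following is a minimal sketch using Python's standard `fnmatch` module, which implements Unix shell-style wildcards; the filenames are made up for illustration and do not come from the course materials.

```python
from fnmatch import fnmatchcase

# Hypothetical filenames, used only to illustrate the patterns.
filenames = [
    "data_2021.csv",
    "data_2022.csv",
    "notes.txt",
    "model1.pkl",
    "model_final.pkl",
]

# "*" matches any run of characters (including none).
csv_files = [name for name in filenames if fnmatchcase(name, "*.csv")]

# "?" matches exactly one character.
yearly_files = [name for name in filenames if fnmatchcase(name, "data_202?.csv")]

# "[0-9]" matches exactly one character from the given range.
numbered_models = [name for name in filenames if fnmatchcase(name, "model[0-9].pkl")]

print(csv_files)
print(yearly_files)
print(numbered_models)
```

In a shell, the same patterns would be written directly on the command line, for example `ls *.csv`.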

Week 03: Clean Code Principles for Data Science and Machine Learning Open in PDF

  • Outline Open in Loom
  • Coding Best Practices Open in Loom
  • Writing Clean Code Open in Loom
  • Refactoring Code Open in Loom
  • Efficient Code Open in Loom
  • Documentation Open in Loom
  • Python Code Quality Authority (PCQA) - pycodestyle Open in Loom
  • PCQA - pylint Open in Loom
  • PCQA - autopep8 Open in Loom
  • PCQA - nbQA Open in Loom
  • ▶️ Hands on
    • 💾 Datasets [Link]
    • Writing Clean Code Jupyter
    • Exercise 01 Jupyter
    • Exercise 02 Jupyter
    • Exercise 03 Jupyter
    • Using pycodestyle Jupyter
    • Using pylint: script Python, refactored script Python
    • Functions: Advanced - Best practices for writing functions Open in Dataquest
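
To make the Week 03 topics (clean code, refactoring, documentation) more concrete, here is a minimal sketch of a small helper written in that style. The function name `mean_absolute_error` and the sample values are hypothetical and are not taken from the course notebooks. It uses descriptive names, PEP 8 layout (the style that pycodestyle checks), and docstrings (which pylint reports as missing under its default settings).

```python
"""Sketch of a small, documented, PEP 8-style helper (hypothetical example)."""


def mean_absolute_error(actual, predicted):
    """Return the mean absolute error between two equal-length sequences.

    Raises:
        ValueError: if the sequences are empty or differ in length.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")

    # Sum the absolute differences pairwise, then average them.
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / len(actual)


if __name__ == "__main__":
    print(mean_absolute_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

Assuming the file were saved as `metrics.py` (a made-up name), it could be checked with `pycodestyle metrics.py` and `pylint metrics.py`.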

Week 04: Production Ready Code Open in PDF

  • Outline Open in Loom
  • Catching Errors Open in Loom
  • Testing and Data Science Open in Loom
  • A brief introduction about pytest Open in Loom
  • Logging Open in Loom
  • Case study: testing and logging Open in Loom
  • Model Drift Open in Loom
  • Hands on
    • Production ready code Jupyter
    • Data Visualization Fundamentals Open in Dataquest
      • You will learn: a) how to use data visualization to explore data and b) how and when to use the most common plots.
    • Storytelling Data Visualization and Information Design Open in Dataquest
      • You will learn how to: a) create graphs using information design principles, b) create narrative data visualizations using Matplotlib, c) create visual patterns using Gestalt principles, d) control attention using pre-attentive attributes and e) employ Matplotlib's built-in styles.
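
The Week 04 lecture topics (catching errors, logging, and testing with pytest) can be combined in one short sketch. The function `safe_divide` and the test names below are hypothetical, chosen for illustration rather than taken from the course notebook. The function catches a specific exception, logs the outcome, and is covered by two pytest-style tests.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def safe_divide(numerator, denominator):
    """Divide two numbers; log an error and return None on a zero divisor."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Catch only the error we expect, and record enough context to debug.
        logger.error("Division by zero: numerator=%s", numerator)
        return None
    logger.info("SUCCESS: %s / %s = %s", numerator, denominator, result)
    return result


# pytest collects functions named test_* and runs their assertions.
def test_safe_divide_returns_quotient():
    assert safe_divide(10, 4) == 2.5


def test_safe_divide_handles_zero():
    assert safe_divide(1, 0) is None
```

Assuming the code were saved as `test_safe_divide.py` (a made-up filename), the tests could be run with `pytest test_safe_divide.py`.
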
Owner
Ivanovitch Silva
I'm an experimenter by design, and very interested in technologies related to Data Science & Machine Learning, Vehicles and Complex Networks.