The code from the Machine Learning Bookcamp book and a free course based on the book

Last update: Jan 09, 2023

Overview

Machine Learning Bookcamp

The code from the Machine Learning Bookcamp book

Useful links:

https://mlbookcamp.com: supplementary materials
https://datatalks.club: the place to talk about data and the book; join the #ml-bookcamp channel to ask questions about the book and report any problems

Machine Learning Zoomcamp

Machine Learning Zoomcamp is a course based on the book

It's online and free
You can join at any moment
More information in the course-zoomcamp folder

Reading Plan

Chapters

Chapter 1: Introduction to Machine Learning

Understanding machine learning and the problems it can solve
CRISP-DM: Organizing a successful machine learning project
Training and selecting machine learning models
Performing model validation

Code: no code

Chapter 2: Machine Learning for Regression

Creating a car-price prediction project with a linear regression model
Doing an initial exploratory data analysis with Jupyter notebooks
Setting up a validation framework
Implementing the linear regression model from scratch
Performing simple feature engineering for the model
Keeping the model under control with regularization
Using the model to predict car prices

Code: chapter-02-car-price/02-carprice.ipynb
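
The from-scratch linear regression and regularization topics listed for this chapter can be sketched with the normal equation. This is a minimal illustration under stated assumptions, not the notebook's code: the function names `train_linear_regression_reg` and `predict`, the parameter `r`, and the toy data are all made up for this example.

```python
import numpy as np


def train_linear_regression_reg(X, y, r=0.0):
    """Solve the regularized normal equation (X^T X + r*I) w = X^T y."""
    # Prepend a column of ones so the first weight acts as the bias term.
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])

    XTX = X.T.dot(X)
    # Adding r to the diagonal keeps the matrix well-conditioned.
    # Note: this simple version regularizes the bias term as well.
    XTX = XTX + r * np.eye(XTX.shape[0])

    w = np.linalg.solve(XTX, X.T.dot(y))
    return w[0], w[1:]


def predict(X, w0, w):
    """Apply the linear model to a feature matrix."""
    return w0 + X.dot(w)


# Toy data (hypothetical): y = 1 + 2 * x exactly.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])

w0, w = train_linear_regression_reg(X_train, y_train, r=0.0)
y_pred = predict(np.array([[4.0]]), w0, w)
```

With `r=0.0` the toy data is fit exactly; a positive `r` pulls the slope below 2, which is the "keeping the model under control" effect in miniature.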

Chapter 3: Machine Learning for Classification

Predicting customers who will churn with logistic regression
Doing exploratory data analysis for identifying important features
Encoding categorical variables to use them in machine learning models
Using logistic regression for classification

Code: chapter-03-churn-prediction/03-churn.ipynb
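
As a rough illustration of encoding categorical variables and scoring with logistic regression, the sketch below one-hot encodes a list of records and applies a sigmoid to a linear score. Everything here is an assumption for the example: the helper names, the `contract` and `tenure` fields, and the hand-picked weights (no training is shown).

```python
import math


def one_hot_encode(records, categorical):
    """Turn categorical fields into 0/1 columns; keep other fields as numbers."""
    # Collect every "field=value" pair seen in the data, in a stable order.
    columns = sorted(
        {f"{name}={rec[name]}" for rec in records for name in categorical}
    )
    numeric = sorted({k for rec in records for k in rec if k not in categorical})
    feature_names = numeric + columns

    rows = []
    for rec in records:
        row = [float(rec[name]) for name in numeric]
        active = {f"{name}={rec[name]}" for name in categorical}
        row += [1.0 if col in active else 0.0 for col in columns]
        rows.append(row)
    return feature_names, rows


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(row, bias, weights):
    """Logistic regression prediction for one encoded row."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return sigmoid(z)


# Hypothetical customers.
customers = [
    {"contract": "monthly", "tenure": 1},
    {"contract": "yearly", "tenure": 24},
]
feature_names, X = one_hot_encode(customers, categorical=["contract"])

# Hand-picked weights for illustration only, not learned from data.
bias, weights = 0.0, [-0.1, 1.0, -1.0]
churn_probabilities = [predict_proba(row, bias, weights) for row in X]
```

Each categorical value becomes its own column, so the model can assign a separate weight to `contract=monthly` and `contract=yearly`.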

Chapter 4: Evaluation Metrics for Classification

Accuracy as a way of evaluating binary classification models and its limitations
Determining where our model makes mistakes using a confusion table
Deriving other metrics like precision and recall from the confusion table
Using ROC and AUC to further understand the performance of a binary classification model
Cross-validating a model to make sure it behaves optimally
Tuning the parameters of a model to achieve the best predictive performance

Code: chapter-03-churn-prediction/04-metrics.ipynb
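
The confusion table and the metrics derived from it can be sketched in a few lines of plain Python. The function names, the 0.5 threshold, and the toy labels and scores are assumptions for this example rather than anything taken from the notebook.

```python
def confusion_table(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a decision threshold."""
    tp = fp = fn = tn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall and F1 from confusion-table counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.6, 0.8, 0.3, 0.2, 0.1, 0.7, 0.4]

tp, fp, fn, tn = confusion_table(y_true, y_prob)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Accuracy uses all four cells of the table, while precision and recall each ignore the true negatives, which is why they are more informative on imbalanced data.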

Chapter 5: Deploying Machine Learning Models

Saving models with Pickle
Serving models with Flask
Managing dependencies with Pipenv
Making the service self-contained with Docker
Deploying it to the cloud using AWS Elastic Beanstalk

Code: chapter-05-deployment
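
Saving and restoring a model with Pickle can be sketched as a round trip through a file. To keep the example self-contained, the "model" is a plain dictionary of hypothetical parameters, and the file name `model.bin`, the feature names, and the `score` helper are assumptions; a real project would pickle a trained model object instead.

```python
import os
import pickle
import tempfile

# A stand-in "model": plain Python data holding hypothetical learned parameters.
model = {"bias": -1.5, "weights": {"tenure": -0.05, "contract=monthly": 0.8}}

model_path = os.path.join(tempfile.mkdtemp(), "model.bin")

# Training side: serialize the model to a binary file.
with open(model_path, "wb") as f_out:
    pickle.dump(model, f_out)

# Serving side: load it back, for example once when a web service starts.
with open(model_path, "rb") as f_in:
    loaded_model = pickle.load(f_in)


def score(customer, model):
    """Linear score for one customer given as a dict of feature values."""
    return model["bias"] + sum(
        weight * customer.get(name, 0.0)
        for name, weight in model["weights"].items()
    )


customer = {"tenure": 10, "contract=monthly": 1}
result = score(customer, loaded_model)
```

Unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.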

Chapter 6: Decision Trees and Ensemble Learning

Predicting the risk of default with tree-based models
Decision trees and the decision tree learning algorithm
Random forest: putting multiple trees together into one model
Gradient boosting as an alternative way of combining decision trees

Code: chapter-06-trees/06-trees.ipynb
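
The core step of decision tree learning, choosing the split that leaves the purest groups, can be sketched for a single numeric feature. This is a simplified illustration that assumes binary 0/1 labels and uses the misclassification rate as the impurity measure; the function names and the toy `assets` data are hypothetical.

```python
def misclassification_rate(labels):
    """Share of labels that differ from the majority class in a group."""
    if not labels:
        return 0.0
    positives = sum(labels)
    majority = max(positives, len(labels) - positives)
    return 1.0 - majority / len(labels)


def best_split(values, labels):
    """Try every threshold on one numeric feature and keep the purest split."""
    best = None
    # The largest value is skipped: it would leave the right group empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weighted average impurity of the two groups.
        impurity = (
            len(left) * misclassification_rate(left)
            + len(right) * misclassification_rate(right)
        ) / len(labels)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: assets of six clients and whether they defaulted (1) or not (0).
assets = [0, 2000, 3000, 4000, 5000, 8000]
default = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_split(assets, default)
```

A full tree repeats this search over every feature and then recurses into the left and right groups; random forests and gradient boosting combine many such trees.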

Chapter 7: Neural Networks and Deep Learning

Convolutional neural networks for image classification
TensorFlow and Keras — frameworks for building neural networks
Using pre-trained neural networks
Internals of a convolutional neural network
Training a model with transfer learning
Data augmentations — the process of generating more training data

Code: chapter-07-neural-nets/07-neural-nets-train.ipynb

Chapter 8: Serverless Deep Learning

Serving models with TensorFlow-Lite — a light-weight environment for applying TensorFlow models
Deploying deep learning models with AWS Lambda
Exposing the Lambda function as a web service via API Gateway

Code: chapter-08-serverless

Chapter 9: Kubernetes and Kubeflow

Kubernetes:

Understanding different methods of deploying and serving models in the cloud
Serving Keras and TensorFlow models with TensorFlow-Serving
Deploying TensorFlow-Serving to Kubernetes

Code: chapter-09-kubernetes

Kubeflow:

Using Kubeflow and KFServing for simplifying the deployment process

Code: chapter-09-kubeflow

Appendices

Appendix A: Setting up the Environment

Installing Anaconda, a Python distribution that includes most of the scientific libraries we need
Running a Jupyter Notebook service from a remote machine
Installing and configuring the Kaggle command line interface tool for accessing datasets from Kaggle
Creating an EC2 machine on AWS using the web interface and the command-line interface

Code: no code

Appendix B: Introduction to Python

Basic Python syntax: variables and control-flow structures
Collections: lists, tuples, sets, and dictionaries
List comprehensions: a concise way of operating on collections
Reusability: functions, classes and importing code
Package management: using pip for installing libraries
Running Python scripts

Code: appendix-b-python.ipynb

Articles from mlbookcamp.com:

Introduction to Python
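
A small example may help show why list comprehensions are described as a concise way of operating on collections. The `prices` list and the variable names are made up for illustration.

```python
prices = [12500, 8300, 21000, 4700]

# Loop version: filter and transform step by step.
rounded_loop = []
for price in prices:
    if price > 5000:
        rounded_loop.append(price // 1000)

# List-comprehension version: the same filter and transformation in one expression.
rounded = [price // 1000 for price in prices if price > 5000]

# The same idea works for sets and dictionaries.
price_levels = {price // 10000 for price in prices}
price_by_index = {i: price for i, price in enumerate(prices)}
```

Both versions build the same list; the comprehension simply states the filter and the transformation together.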

Appendix C: Introduction to NumPy and Linear Algebra

One-dimensional and two-dimensional NumPy arrays
Generating NumPy arrays randomly
Operations with NumPy arrays: element-wise operations, summarizing operations, sorting and filtering
Multiplication in linear algebra: vector-vector, matrix-vector and matrix-matrix multiplications
Finding the inverse of a matrix and solving the normal equation

Code: appendix-c-numpy.ipynb

Articles from mlbookcamp.com:

Introduction to NumPy

Appendix D: Introduction to Pandas

The main data structures in Pandas: DataFrame and Series
Accessing rows and columns of a DataFrame
Element-wise and summarizing operations
Working with missing values
Sorting and grouping

Code: appendix-d-pandas.ipynb

Appendix E: AWS SageMaker

Increasing the GPU quota limits
Renting a Jupyter notebook with GPU in AWS SageMaker

Comments

Adding setup with docker
Hi @alexeygrigorev ,

I created a small guide for anyone who feels comfortable using Docker or might want to try it for setting up the environment.

Since I saw a couple of questions today related to environment setup, I thought of sharing what I usually use when working on projects or courses, so that it can be reused.

Hope it's helpful :)

Changelog:

Updated readme with link to guide to create docker container

Added new guide to build docker container and run it

Added Dockerfile and environment.yml
opened by laurauzcategui 5
While converting keras to tflite error

Error while converting Keras to TFLite:

raise ValueError('Unrecognized keyword arguments:', kwargs.keys())
ValueError: ('Unrecognized keyword arguments:', dict_keys(['ragged']))

Traceback (most recent call last):
  File "convert.py", line 5, in <module>
    model = keras.models.load_model('xception_v4_large_08_0.894.h5')

opened by saisubramani 5
notes correction in 06 Decision Trees...

In 02-data-prep.md, the train/val/test split bullet currently reads: "Split the data with the distribution of 80% train, 20% validation, and 20% test sets with random seed to 11"

It should be:

"Split the data with the distribution of 60% train, 20% validation, and 20% test sets with random seed to 11"

opened by lucapug 4
Update homework.md

Updated Question 4 text from "when one grows" to "when one grows up" and the F1 formula from "F1 = 2 * P * R / (P + R)" to "$$F1 = {2.}\frac{P . R}{P+R}$$"

opened by ukokobili 3

Releases

chapter7-model (Sep 26, 2020)

Owner

Alexey Grigorev
