A Machine Learning Course with Python

Introduction

The purpose of this project is to provide a comprehensive yet simple course in Machine Learning using Python.

Motivation

Machine Learning, as a tool for Artificial Intelligence, is one of the most widely adopted scientific fields, and a considerable amount of literature has been published on it. This project presents its most important aspects through a series of simple yet comprehensive Python tutorials, built with well-known Machine Learning frameworks such as Scikit-learn. In this project you will learn:

  • What is the definition of Machine Learning?
  • When did it start, and how is it evolving?
  • What are the Machine Learning categories and subcategories?
  • What are the most widely used Machine Learning algorithms, and how are they implemented?

Machine Learning

Title                                 Document
An Introduction to Machine Learning   Overview

Machine Learning Basics

_img/intro.png
Title                        Code     Document
Linear Regression            Python   Tutorial
Overfitting / Underfitting   Python   Tutorial
Regularization               Python   Tutorial
Cross-Validation             Python   Tutorial
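
As a rough illustration of two topics listed above, linear regression and cross-validation, here is a minimal sketch that uses only the Python standard library. It is not taken from the course tutorials, which rely on frameworks such as Scikit-learn; the function names `fit_line` and `k_fold_mse` and the toy data are hypothetical and chosen only for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def k_fold_mse(xs, ys, k=3):
    """Average held-out mean squared error over k contiguous folds."""
    n = len(xs)
    fold_errors = []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        # Train on everything outside the current fold.
        slope, intercept = fit_line(xs[:lo] + xs[hi:], ys[:lo] + ys[hi:])
        # Evaluate on the held-out fold only.
        errors = [(slope * x + intercept - y) ** 2
                  for x, y in zip(xs[lo:hi], ys[lo:hi])]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]

slope, intercept = fit_line(xs, ys)
cv_error = k_fold_mse(xs, ys, k=3)
print(slope, intercept, cv_error)
```

Because the toy data is noise-free, the fitted line recovers the generating slope and intercept and the cross-validated error is essentially zero. On real data, a held-out error that is much larger than the training error is the usual sign of overfitting.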

Supervised learning

_img/supervised.gif
Title                     Code     Document
Decision Trees            Python   Tutorial
K-Nearest Neighbors       Python   Tutorial
Naive Bayes               Python   Tutorial
Logistic Regression       Python   Tutorial
Support Vector Machines   Python   Tutorial
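
To give a flavor of the supervised methods listed above, the following is a minimal sketch of k-nearest neighbors classification using only the Python standard library. It is an illustration under simple assumptions (Euclidean distance, majority vote), not the code from the course tutorial; the name `knn_predict` and the toy points are hypothetical.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)
near_corner = knn_predict(points, labels, (5.5, 5.5), k=3)
print(near_origin, near_corner)
```

The choice of `k` trades off noise sensitivity against blurring of class boundaries; an odd `k` avoids ties in two-class problems.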

Unsupervised learning

_img/unsupervised.gif
Title                           Code     Document
Clustering                      Python   Tutorial
Principal Components Analysis   Python   Tutorial

Deep Learning

_img/deeplearning.png
Title                           Code     Document
Neural Networks Overview        Python   Tutorial
Convolutional Neural Networks   Python   Tutorial
Autoencoders                    Python   Tutorial
Recurrent Neural Networks       Python   IPython

Pull Request Process

Please consider the following criteria so that we can review your contribution efficiently:

  1. The pull request is mainly expected to be a link suggestion.
  2. Please make sure your suggested resources are not obsolete or broken.
  3. Ensure any install or build dependencies are removed before the end of the layer when doing a build and creating a pull request.
  4. Add comments with details of changes to the interface; this includes new environment variables, exposed ports, useful file locations, and container parameters.
  5. You may merge the pull request once you have the sign-off of at least one other developer. If you do not have permission to merge, you may ask the owner to merge it for you once you believe all checks have passed.

Final Note

We look forward to your feedback. Please help us improve this open source project. To contribute, please create a pull request and we will review it promptly. Once again, we appreciate your feedback and support.

Developers

Creator: Machine Learning Mindset [Blog, GitHub, Twitter]

Supervisor: Amirsina Torfi [GitHub, Personal Website, LinkedIn]

Developers: Brendan Sherman*, James E Hopkins* [LinkedIn], Zac Smith [LinkedIn]

NOTE: This project was developed as a capstone project for the [CS 4624 Multimedia/Hypertext course at Virginia Tech] and was supervised and supported by [Machine Learning Mindset].

*: equally contributed

Citation

If you find this course useful, please consider citing it as follows:

@software{amirsina_torfi_2019_3585763,
  author       = {Amirsina Torfi and
                  Brendan Sherman and
                  Jay Hopkins and
                  Eric Wynn and
                  hokie45 and
                  Frederik De Bleser and
                  李明岳 and
                  Samuel Husso and
                  Alain},
  title        = {{machinelearningmindset/machine-learning-course:
                   Machine Learning with Python}},
  month        = dec,
  year         = 2019,
  publisher    = {Zenodo},
  version      = {1.0},
  doi          = {10.5281/zenodo.3585763},
  url          = {https://doi.org/10.5281/zenodo.3585763}
}