Active Learning demo using two small datasets

Last update: Nov 10, 2021

Overview

ActiveLearningDemo

How to run

Step one

Put the dataset folder in place, then run utils.py to split the dataset into the required structure.

Each dataset must contain six .mat files: TrainingMatrix.mat, TrainingLabels.mat, TestingMatrix.mat, TestingLabels.mat, UnlabeledMatrix.mat and UnlabeledLabels.mat.
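Before training, it can help to confirm that a dataset folder actually contains the six files. The following is a minimal sketch, not part of the project: the helper name `missing_mat_files` is hypothetical, and it only checks that the files exist, not what is inside them.

```python
from pathlib import Path

# The six file names listed in this README.
REQUIRED_FILES = (
    "TrainingMatrix.mat",
    "TrainingLabels.mat",
    "TestingMatrix.mat",
    "TestingLabels.mat",
    "UnlabeledMatrix.mat",
    "UnlabeledLabels.mat",
)


def missing_mat_files(case_dir):
    """Return the required .mat file names that are absent from case_dir."""
    case_dir = Path(case_dir)
    return [name for name in REQUIRED_FILES if not (case_dir / name).is_file()]


if __name__ == "__main__":
    # "./datasets/MMI" is the example path used in this README; adjust as needed.
    missing = missing_mat_files("./datasets/MMI")
    print("missing files:", missing or "none")
```

An empty list means the folder is ready for step two.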

Step two

Train the model. The following arguments can be set:

Active learning

optional arguments:
  -h, --help            show this help message and exit
  --src SRC             dataset path
  --dst DST             destination path
  --type TYPE           sample strategy: random, entropy, combine
  --solver SOLVER       model solver
  --max_iter MAX_ITER   max iteration of each training
  --k K                 samples added for each iteration
  --n N                 number of iterations
  --plot_type PLOT_TYPE
                        plot single for one case(single) or plot average for
                        entire database(average)
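For readers who want to see how such an interface is declared, here is a rough sketch of an `argparse` parser that would produce help text like the listing above. It is a reconstruction from that listing, not the contents of main.py. The defaults for `--type` and `--solver` follow what this README states; every other default is a placeholder assumption, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser resembling the help text above (a sketch, not main.py itself)."""
    parser = argparse.ArgumentParser(description="Active learning")
    parser.add_argument("--src", type=str, help="dataset path")
    parser.add_argument("--dst", type=str, help="destination path")
    # "combine" and "newton-cg" are the defaults named in this README.
    parser.add_argument("--type", type=str, default="combine",
                        help="sample strategy: random, entropy, combine")
    parser.add_argument("--solver", type=str, default="newton-cg",
                        help="model solver")
    # Assumed placeholder defaults: the README does not state these values.
    parser.add_argument("--max_iter", type=int, default=100,
                        help="max iteration of each training")
    parser.add_argument("--k", type=int, default=10,
                        help="samples added for each iteration")
    parser.add_argument("--n", type=int, default=10,
                        help="number of iterations")
    parser.add_argument("--plot_type", type=str, default="average",
                        help="plot single for one case(single) or plot "
                             "average for entire database(average)")
    return parser


if __name__ == "__main__":
    # Parse an example command line instead of sys.argv, for illustration.
    args = build_parser().parse_args(["--src", "./datasets/MMI", "--dst", "./img"])
    print(args)
```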

The script accepts either a dataset that contains multiple subsets or a single case made up of only the six .mat files. By default, I use the "newton-cg" solver and the "combine" type, which trains the model with both sampling strategies (random and entropy) in one run. To get results on a different dataset directly, run:

python main.py --src <dataset path, e.g. ./datasets/MMI> --dst <output path, e.g. ./img>
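The README names the sampling strategies but does not show how samples are picked. As an illustration only, the sketch below shows one common reading of the two base strategies: entropy sampling picks the k unlabeled samples whose predicted class distribution has the highest Shannon entropy, and random sampling picks k at random. The function names and the probability values are made up for the example, and the project's own implementation may differ.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_by_entropy(prob_rows, k):
    """Indices of the k unlabeled samples with the most uncertain predictions."""
    ranked = sorted(range(len(prob_rows)),
                    key=lambda i: entropy(prob_rows[i]),
                    reverse=True)
    return ranked[:k]


def select_at_random(n_unlabeled, k, seed=None):
    """Indices of k unlabeled samples drawn uniformly without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(n_unlabeled), min(k, n_unlabeled))


if __name__ == "__main__":
    # Made-up class probabilities for four unlabeled samples.
    probs = [
        [0.98, 0.01, 0.01],  # confident prediction, low entropy
        [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
        [0.70, 0.20, 0.10],
        [0.50, 0.50, 0.00],
    ]
    print("entropy pick:", select_by_entropy(probs, k=2))
    print("random pick:", select_at_random(len(probs), k=2, seed=0))
```

In an active learning loop, the selected indices would be moved from the unlabeled pool to the training set before the model is retrained, with `--k` controlling how many are moved per iteration and `--n` the number of iterations.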

Result

MMI dataset

use "lbfgs" solver:

use "newton-cg" solver:

MindReading dataset

use "lbfgs" solver:

use "newton-cg" solver:
