Using knowledge-informed machine learning on the PRONOSTIA (FEMTO) and IMS bearing data sets. Predict remaining-useful-life (RUL).

Overview

Knowledge Informed Machine Learning using a Weibull-based Loss Function

Exploring the concept of knowledge-informed machine learning with the use of a Weibull-based loss function. Used to predict remaining useful life (RUL) on the IMS and PRONOSTIA (also called FEMTO) bearing data sets.

Open In Colab Source code arXiv

Knowledge-informed machine learning is used on the IMS and PRONOSTIA bearing data sets for remaining useful life (RUL) prediction. The knowledge is integrated into a neural network through a novel Weibull-based loss function. A thorough statistical analysis of the Weibull-based loss function is conducted, demonstrating the effectiveness of the method on the PRONOSTIA data set. However, the Weibull-based loss function is less effective on the IMS data set.

The experiment will be detailed in the Journal of Prognostics and Health Management (accepted and pending publication -- preprint here), with an extensive discussion on the results, shortcomings, and benefits analysis. The paper also gives an overview of knowledge informed machine learning as it applies to prognostics and health management (PHM).

You can replicate the work, and all figures, by following the instructions in the Setup section. Even easier: run the Colab notebook!

If you have any questions, leave a comment in the discussion, or email me ([email protected]).

Summary

In this work, we use the definition of knowledge informed machine learning from von Rueden et al. (their excellent paper is here). Here's the general taxonomy of our knowledge informed machine learning experiment:

source_rep_int

Bearing vibration data (from the frequency domain) was used as input to feed-forward neural networks. The below figure demonstrates the data as a spectrogram (a) and the spectrogram after "binning" (b). The binned data was used as input.

spectrogram

A large hyper-parameter search was conducted on neural networks. Nine different Weibull-based loss functions were tested on each unique network.

The below chart is a qualitative method of showing the effectiveness of the Weibull-based loss functions on the two data sets.

loss function percentage

We also conducted a statistical analysis of the results, as shown below.

correlation of the weibull-based loss function to results

The top performing models' RUL trends are shown below, for both the IMS and PRONOSTIA data sets.

IMS RUL  trend
PRONOSTIA RUL  trend

Setup

Tested in linux (MacOS should also work). If you run windows you'll have to do much of the environment setup and data download/preprocessing manually.

To reproduce results:

  1. Clone this repo - clone https://github.com/tvhahn/weibull-knowledge-informed-ml.git

  2. Create virtual environment. Assumes that Conda is installed.

    • Linux/MacOS: use command from the Makefile in the root directory - make create_environment
    • Windows: from root directory - conda env create -f envweibull.yml
    • HPC: make create_environment will detect HPC environment and automatically create environment from make_hpc_venv.sh. Tested on Compute Canada. Modify make_hpc_venv.sh for your own HPC cluster.
  3. Download raw data.

    • Linux/MacOS: use make download. Will automatically download to appropriate data/raw directory.
    • Windows: Manually download the the IMS and PRONOSTIA (FEMTO) data sets from NASA prognostics data repository. Put in data/raw folder.
    • HPC: use make download. Will automatically detect HPC environment.
  4. Extract raw data.

    • Linux/MacOS: use make extract. Will automatically extract to appropriate data/raw directory.
    • Windows: Manually extract data. See the Project Organization section for folder structure.
    • HPC: use make download. Will automatically detect HPC environment. Again, modify for your HPC cluster.
  5. Ensure virtual environment is activated. conda activate weibull or source ~/weibull/bin/activate

  6. From root directory of weibull-knowledge-informed-ml, run pip install -e . -- this will give the python scripts access to the src folders.

  7. Train!

    • Linux/MacOS: use make train_ims or make train_femto. Note: set constants in the makefile for changing random search parameters. Currently set as default.

    • Windows: run manually by calling the script - python train_ims or python train_femto with the appropriate arguments. For example: src/models/train_models.py --data_set femto --path_data your_data_path --proj_dir your_project_directory_path

    • HPC: use make train_ims or make train_femto. The HPC environment should be automatically detected. A SLURM script will be run for a batch job.

      • Modify the train_modify_ims_hpc.sh or train_model_femto_hpc.sh in the src/models directory to meet the needs of your HPC cluster. This should work on Compute Canada out of the box.
  8. Filter out the poorly performing models and collate the results. This will create several results files in the models/final folder.

    • Linux/MacOS: use make summarize_ims_models or make summarize_femto_models. (note: set filter boundaries in summarize_model_results.py. Will eventually modify for use with Argparse...)
    • Windows: run manually by calling the script.
    • HPC: use make summarize_ims_models or make summarize_femto_models. Again, change filter requirements in the summarize_model_results.py script.
  9. Make the figures of the data and results.

    • Linux/MacOS: use make figures_data and make figures_results. Figures will be generated and placed in the reports/figures folder.
    • Windows: run manually by calling the script.
    • HPC: use make figures_data and make figures_results

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands to reproduce work, lik `make data` or `make train_ims`
├── README.md          <- The top-level README.
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump. Downloaded from the NASA Prognostic repository.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details (nothing in here yet)
│
├── models             <- Trained models, model predictions, and model summaries
│   ├── interim        <- Intermediate models that have not analyzed. Output from the random search.
│   ├── final          <- Final models that have been filtered and summarized. Several outpu csv files as well.
│
├── notebooks          <- Jupyter notebooks used for data exploration and analysis. Of varying quality.
│   ├── scratch        <- Scratch notebooks for quick experimentation.     
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials (empty).
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── envweibull.yml    <- The Conda environment file for reproducing the analysis environment
│                        recommend using Conda).
│
├── make_hpc_venv.sh  <- Bash script to create the HPC venv. Setup for my Compute Canada cluster.
│                        Modify to suit your own HPC cluster.
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models               
│   │   └── predict_model.py
│   │
│   └── visualization  <- Scripts to create figures of the data, results, and training progress
│       ├── visualize_data.py       
│       ├── visualize_results.py     
│       └── visualize_training.py    

Future List

As noted in the paper, the best thing would be to test out Weibull-based loss functions on large, and real-world, industrial datasets. Suitable applications may include large fleets of pumps or gas turbines.

Owner
Tim
Data science. Innovation. ML practitioner.
Tim
Hepsiburada - Hepsiburada Urun Bilgisi Cekme

Hepsiburada Urun Bilgisi Cekme from hepsiburada import Marka nike = Marka("nike"

Ilker Manap 8 Oct 26, 2022
Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Adaptive Methods for Aggregated Domain Generalization (AdaClust) Official Pytorch Implementation of Adaptive Methods for Aggregated Domain Generalizat

Xavier Thomas 15 Sep 20, 2022
Public Implementation of ChIRo from "Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations"

Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations This directory contains the model architectures and experimental

35 Dec 05, 2022
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

20 May 28, 2022
这是一个利用facenet和retinaface实现人脸识别的库,可以进行在线的人脸识别。

Facenet+Retinaface:人脸识别模型在Keras当中的实现 目录 注意事项 Attention 所需环境 Environment 文件下载 Download 预测步骤 How2predict 参考资料 Reference 注意事项 该库中包含了两个网络,分别是retinaface和fa

Bubbliiiing 31 Nov 15, 2022
A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.

A-ESRGAN: Training Real-World Blind Super-Resolution with Attention-based U-net Discriminators The authors are hidden for the purpose of double blind

77 Dec 16, 2022
Individual Tree Crown classification on WorldView-2 Images using Autoencoder -- Group 9 Weak learners - Final Project (Machine Learning 2020 Course)

Created by Olga Sutyrina, Sarah Elemili, Abduragim Shtanchaev and Artur Bille Individual Tree Crown classification on WorldView-2 Images using Autoenc

2 Dec 08, 2022
Code for 2021 NeurIPS --- Towards Multi-Grained Explainability for Graph Neural Networks

ReFine: Multi-Grained Explainability for GNNs We are trying hard to update the code, but it may take a while to complete due to our tight schedule rec

Shirley (Ying-Xin) Wu 47 Dec 16, 2022
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
Emotion Recognition from Facial Images

Reconhecimento de Emoções a partir de imagens faciais Este projeto implementa um classificador simples que utiliza técncias de deep learning e transfe

Gabriel 2 Feb 09, 2022
A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier for ONNX.

sam4onnx A very simple tool to rewrite parameters such as attributes and constants for OPs in ONNX models. Simple Attribute and Constant Modifier for

Katsuya Hyodo 6 May 15, 2022
Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

An Image Captioning codebase This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

Ruotian(RT) Luo 906 Jan 03, 2023
Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

SUO-SLAM This repository hosts the code for our CVPR 2022 paper "Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation". ArXiv li

Robot Perception & Navigation Group (RPNG) 97 Jan 03, 2023
Stacked Generative Adversarial Networks

Stacked Generative Adversarial Networks This repository contains code for the paper "Stacked Generative Adversarial Networks", CVPR 2017. Part of the

Xun Huang 241 May 07, 2022
(JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)

Python Outlier Detection (PyOD) Deployment & Documentation & Stats Build Status & Coverage & Maintainability & License PyOD is a comprehensive and sca

Yue Zhao 6.6k Jan 03, 2023
Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

fcn - Fully Convolutional Networks Chainer implementation of Fully Convolutional Networks. Installation pip install fcn Inference Inference is done as

Kentaro Wada 218 Oct 27, 2022
The implemention of Video Depth Estimation by Fusing Flow-to-Depth Proposals

Flow-to-depth (FDNet) video-depth-estimation This is the implementation of paper Video Depth Estimation by Fusing Flow-to-Depth Proposals Jiaxin Xie,

32 Jun 14, 2022
SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used

IMLHF 3 Oct 11, 2022
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

ENet in Caffe Execution times and hardware requirements Network 1024x512 1280x720 Parameters Model size (fp32) ENet 20.4 ms 32.9 ms 0.36 M 1.5 MB SegN

Timo Sämann 561 Jan 04, 2023
DeepLab resnet v2 model in pytorch

pytorch-deeplab-resnet DeepLab resnet v2 model implementation in pytorch. The architecture of deepLab-ResNet has been replicated exactly as it is from

Isht Dwivedi 601 Dec 22, 2022