PyTorch implementation of XRD spectral identification from the COD database

Last update: Jan 07, 2023

Overview

XRDidentifier

PyTorch implementation of XRD spectral identification from the COD database.
Details will be explained in the paper to be submitted to the NeurIPS 2021 workshop Machine Learning and the Physical Sciences (https://ml4physicalsciences.github.io/2021/).

Features

expert model

1D-CNN (1D-RegNet) + Hierarchical Deep metric learning (AdaCos + Angular Penalty Softmax Loss)
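
As a rough illustration of the metric-learning part, the sketch below computes angular-penalty logits and the fixed AdaCos scale, sqrt(2) * ln(C - 1), in plain Python. It is not the repository's code: the function names, the ArcFace-style additive margin, and the toy vectors are assumptions made for the example.

```python
import math


def adacos_fixed_scale(num_classes):
    """Fixed AdaCos scale from the AdaCos paper: sqrt(2) * ln(C - 1)."""
    return math.sqrt(2.0) * math.log(num_classes - 1)


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_penalty_logits(embedding, class_weights, target, scale, margin=0.0):
    """Scaled cosine logits with an additive angular margin on the target class.

    Assumption: ArcFace-style penalty cos(theta + margin). The sketch does not
    handle the case theta + margin > pi, which real implementations guard against.
    """
    logits = []
    for j, weight in enumerate(class_weights):
        cos_theta = max(-1.0, min(1.0, cosine(embedding, weight)))
        if j == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


# Toy example: three classes, two-dimensional embeddings (hypothetical values).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
scale = adacos_fixed_scale(len(weights))
loss_plain = cross_entropy(angular_penalty_logits([2.0, 0.0], weights, 0, scale), 0)
loss_margin = cross_entropy(angular_penalty_logits([2.0, 0.0], weights, 0, scale, margin=0.5), 0)
```

The margin lowers the target logit, so the same embedding incurs a larger loss and is pushed closer to its class centre during training.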

mixture of experts

73 expert models tailored to general chemical elements, combined through a sparsely-gated layer
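
To make the idea of a sparsely-gated layer concrete, here is a minimal sketch of top-k gating in plain Python: only the k experts with the largest gate logits receive non-zero weight, and the output is their weighted sum. The function names and toy numbers are hypothetical, and the noise term used in the cited sparsely-gated paper is omitted for brevity.

```python
import math


def top_k_gates(gate_logits, k):
    """Softmax over the k largest logits; every other expert gets weight 0."""
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    kept = order[:k]
    top = max(gate_logits[i] for i in kept)
    exps = {i: math.exp(gate_logits[i] - top) for i in kept}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(gate_logits))]


def mixture_output(expert_outputs, gates):
    """Gate-weighted sum of expert output vectors; experts with gate 0 are skipped."""
    dim = len(expert_outputs[0])
    mixed = [0.0] * dim
    for gate, vector in zip(gates, expert_outputs):
        if gate == 0.0:
            continue  # a real implementation would not even run this expert
        for d in range(dim):
            mixed[d] += gate * vector[d]
    return mixed


# Four hypothetical experts, of which only the best two are used.
gates = top_k_gates([1.0, 3.0, 2.0, 0.0], k=2)
mixed = mixture_output([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]], gates)
```

With many experts, this sparsity is what keeps the cost of a forward pass close to that of running only k of them.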

data augmentation

Physics-informed data augmentation
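
The sketch below illustrates the kind of transforms usually meant by physics-informed augmentation of XRD patterns: a global shift along 2θ, random intensity scaling, and random removal of weak peaks. It is a simplified illustration rather than the repository's routine; the function name, the peak-list representation, and all parameter values are assumptions.

```python
import random


def augment_peaks(peaks, rng, max_shift=0.5, scale_range=(0.5, 1.0),
                  drop_below=0.1, drop_prob=0.5):
    """Return an augmented copy of a list of (two_theta, intensity) peaks.

    - every peak is shifted by the same random offset along two_theta
    - each intensity is multiplied by a random factor from scale_range
    - peaks weaker than drop_below are removed with probability drop_prob
    """
    shift = rng.uniform(-max_shift, max_shift)
    augmented = []
    for two_theta, intensity in peaks:
        if intensity < drop_below and rng.random() < drop_prob:
            continue  # weak peak eliminated
        factor = rng.uniform(*scale_range)
        augmented.append((two_theta + shift, intensity * factor))
    return augmented


# Hypothetical pattern with two strong peaks and one weak peak.
pattern = [(20.0, 1.0), (30.0, 0.05), (45.0, 0.6)]
augmented = augment_peaks(pattern, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which helps when comparing training runs.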

Requirements

Python 3.6
PyTorch 1.4
pymatgen
scikit-learn

Dataset Construction

In the paper, I used the ICSD dataset, but its license forbids redistributing the CIFs. Instead, I describe here how to construct the CIF dataset from COD.

1. download cif files from COD

Go to the COD search page, run a search, and download the list of CIF URLs.
http://www.crystallography.net/cod/search.html

python3 download_cif_from_cod.py --input ./COD-selection.txt --output ./cif
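
For readers who want to see what this step amounts to, the sketch below turns a list of CIF URLs into local file paths and fetches them with the standard library. It is not the contents of download_cif_from_cod.py; the function names are hypothetical, and it assumes the selection file holds one URL per line.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def plan_downloads(url_lines, output_dir):
    """Map each non-empty URL line to (url, target path inside output_dir)."""
    plan = []
    for line in url_lines:
        url = line.strip()
        if not url:
            continue
        file_name = Path(urlparse(url).path).name  # e.g. "1000041.cif"
        plan.append((url, Path(output_dir) / file_name))
    return plan


def download_all(plan):
    """Fetch every planned file, skipping those already on disk."""
    for url, target in plan:
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            urlretrieve(url, str(target))


# Building the plan needs no network access; download_all(plan) would fetch the files.
plan = plan_downloads(
    ["http://www.crystallography.net/cod/1000041.cif\n", "\n"],
    "./cif",
)
```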

2. convert cif into XRD spectra

First, check the CIF files (some files are broken or physically meaningless).

python3 read_cif.py --input ./cif --output ./lithium_datasets.pkl

lithium_datasets.pkl will be created.

Second, convert the checked results into an XRD spectra database.

python3 convertXRDspectra.py --input ./lithium_datasets.pkl --batch 8 --n_aug 5

XRD_epoch5.pkl will be created.
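
As background for this step, one common way to turn a list of discrete diffraction peaks into a continuous spectrum is Gaussian broadening on a fixed 2θ grid. The sketch below shows that idea in plain Python. Whether convertXRDspectra.py works this way is an assumption, and the grid range, peak width, and function name are illustrative only.

```python
import math


def peaks_to_spectrum(peaks, start=10.0, stop=80.0, n_points=701, fwhm=0.2):
    """Broaden (two_theta, intensity) peaks with Gaussians on a uniform grid.

    Returns the grid and a spectrum normalised to a maximum of 1.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # FWHM -> std. dev.
    step = (stop - start) / (n_points - 1)
    grid = [start + i * step for i in range(n_points)]
    spectrum = [
        sum(height * math.exp(-0.5 * ((x - centre) / sigma) ** 2)
            for centre, height in peaks)
        for x in grid
    ]
    highest = max(spectrum)
    if highest > 0:
        spectrum = [y / highest for y in spectrum]
    return grid, spectrum


# Two hypothetical peaks at 30 and 50 degrees.
grid, spectrum = peaks_to_spectrum([(30.0, 100.0), (50.0, 40.0)])
```

A fixed-length vector like `spectrum` is the kind of input a 1D-CNN expects, regardless of how many peaks the original pattern contained.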

Train expert models

python3 train_expert.py --input ./XRD_epoch5.pkl --output learning_curve.csv --batch 16 --n_epoch 100

Output data

Trained model -> regnet1d_adacos_epoch100.pt
Learning curve -> learning_curve.csv
Mapping between integer labels and crystal names -> material_labels.csv

Train Mixture-of-Experts model

You need both pre-trained expert models and pickled single XRD spectra files.
Store the pre-trained expert models in the './pretrained' folder and the pickled single XRD spectra files in the './pickles' folder.
The number of experts is adjusted automatically to match the number of pre-trained expert models.
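
The automatic adjustment can be pictured as counting checkpoint files in the folder. The helper below is a hypothetical sketch of that idea, assuming the expert checkpoints carry a .pt extension; it is not taken from train_moe.py.

```python
from pathlib import Path


def find_expert_checkpoints(pretrained_dir):
    """Return the sorted list of *.pt files found directly in pretrained_dir."""
    return sorted(Path(pretrained_dir).glob("*.pt"))


def count_experts(pretrained_dir):
    """Number of experts is the number of checkpoint files found."""
    return len(find_expert_checkpoints(pretrained_dir))
```

Sorting the paths gives each expert a stable index across runs, which matters if the gate's output order has to match the order in which the experts were loaded.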

python3 train_moe.py --data_path ./pickles --save_model moe.pt --batch 64 --epoch 100

Output data

Trained model -> moe.pt
Learning curve -> moe.csv

Citation

Papers

AdaCos: https://arxiv.org/abs/1905.00292
1D-RegNet: https://arxiv.org/abs/2008.04063
Physics-informed data augmentation: https://arxiv.org/abs/1811.08425v2
Sparsely-gated layer: https://arxiv.org/abs/1701.06538

Implementation

AdaCos: https://github.com/4uiiurz1/pytorch-adacos/blob/master/metrics.py
1D-RegNet: https://github.com/hsd1503/resnet1d
Physics-informed data augmentation: https://github.com/PV-Lab/autoXRD
Top k accuracy: https://gist.github.com/weiaicunzai/2a5ae6eac6712c70bde0630f3e76b77b
Angular Penalty Softmax Loss: https://github.com/cvqluu/Angular-Penalty-Softmax-Losses-Pytorch
Sparsely-gated layer: https://github.com/davidmrau/mixture-of-experts
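
Of the pieces listed above, top-k accuracy is the simplest to state in a few lines. The sketch below is a plain-Python version written for illustration, not the linked gist; the function name and toy scores are made up for the example.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one per sample
    labels: list of integer class labels, one per sample
    """
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Three hypothetical samples scored over three classes.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [2, 0, 0]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```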

Owner

Masaki Adachi
