DeepStruc is a Conditional Variational Autoencoder which can predict mono-metallic nanoparticle structures from a Pair Distribution Function.

Overview

ChemRxiv | [Paper] XXX

DeepStruc

Welcome to DeepStruc, a Deep Generative Model (DGM) that learns the relation between PDF and atomic structure and thereby solves a structure from a PDF!

  1. DeepStruc
  2. Getting started (with Colab)
  3. Getting started (own computer)
    1. Install requirements
    2. Simulate data
    3. Train model
    4. Predict
  4. Authors
  5. Cite
  6. Acknowledgments
  7. License

We here apply DeepStruc for the structural analysis of a model system of mono-metallic nanoparticles (MMNPs) with seven different structure types and demonstrate the method for both simulated and experimental PDFs. DeepStruc can reconstruct simulated data with an average mean absolute error (MAE) of the atom xyz-coordinates of 0.093 ± 0.058 Å after fitting a contraction/extraction factor, an ADP and a scale parameter. We demonstrate the generative capability of DeepStruc on a dataset of face-centered cubic (fcc), hexagonal close-packed (hcp) and stacking-faulted structures, where DeepStruc recognizes the stacking-faulted structures as an interpolation between fcc and hcp and constructs new structural models based on a PDF. The MAE in this example is 0.030 ± 0.019 Å.

The MMNPs are provided as a graph-based input to the encoder of DeepStruc. We compare DeepStruc with a similar DGM without the graph-based encoder. DeepStruc is able to reconstruct the structures using a smaller latent-space dimension and thus has a better generative capability. We also compare DeepStruc with a brute-force modelling approach and a tree-based classification algorithm. The ML models are significantly faster than the brute-force approach, and DeepStruc can furthermore create a latent space from which synthetic structures can be sampled, which the tree-based method cannot. The baseline models can be found in other repositories: brute-force, MetalFinder and CVAE.

Getting started (with Colab)

Using DeepStruc on your own PDFs is straightforward and does not require installing or downloading anything on your computer. Follow the instructions in our Colab notebook and try playing around.

Getting started (own computer)

Follow these steps if you want to train DeepStruc and make predictions with it locally on your own computer.

Install requirements

See the install folder.

Simulate data

See the data folder.

Train model

To train your own DeepStruc model simply run:

python train.py

A list of possible arguments is given in the table below, or run the script with the '--help' argument for additional information. An example training command is shown after the table.
If you are interested in changing the architecture of the model, go to train.py and change the model_arch dictionary; a hypothetical sketch of such a dictionary is shown below.
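
For reference, here is a purely hypothetical sketch of what such a dictionary could look like; the actual keys and default values are defined in train.py and may differ.

# Hypothetical sketch only - the real model_arch keys and defaults are defined in train.py.
model_arch = {
    'encoder_hidden_dims': [128, 64],  # assumed sizes of the graph-encoder layers
    'decoder_hidden_dims': [64, 128],  # assumed sizes of the decoder layers
    'activation': 'relu',              # assumed activation function
}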

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data_dir | Directory containing graph training, validation and test data. | str | -d ./data/graphs |
| -s or --save_dir | Directory where models will be saved. This is also used for loading a learner. | str | -s bst_model |
| -r or --resume_model | If 'True', the save_dir model is loaded and training is continued. | bool | -r True |
| -e or --epochs | Number of maximum epochs. | int | -e 100 |
| -b or --batch_size | Number of graphs in each batch. | int | -b 20 |
| -l or --learning_rate | Learning rate. | float | -l 1e-4 |
| -B or --beta | Initial beta value for scaling KLD. | float | -B 0.1 |
| -i or --beta_increase | Increment of beta when the threshold is met. | float | -i 0.1 |
| -x or --beta_max | Highest value beta can increase to. | float | -x 5 |
| -t or --reconstruction_th | Reconstruction threshold required before beta is increased. | float | -t 0.001 |
| -n or --num_files | Total number of files loaded. Files will be split 60/20/20. If 'None', all files are loaded. | int | -n 500 |
| -c or --compute | Train model on CPU or GPU. Choices: 'cpu', 'gpu16', 'gpu32' and 'gpu64'. | str | -c gpu32 |
| -L or --latent_dim | Number of latent space dimensions. | int | -L 3 |
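
As an example, a training run that sets the data directory, save directory and a few hyperparameters could look like the following; the paths and values are illustrative and only use the flags listed above:

python train.py -d ./data/graphs -s bst_model -e 100 -b 20 -l 1e-4 -c gpu32 -L 3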

Predict

To predict a MMNP using DeepStruc or your own model on a PDF:

python predict.py

A list of possible arguments is given in the table below, or run the script with the '--help' argument for additional information. An example prediction command is shown after the table.

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data | Path to data or data directory. If pointing to a data directory, all datasets must have the same format. | str | -d data/experimental_PDFs/JQ_S1.gr |
| -m or --model | Path to model. If 'None', a GUI will open. | str | -m ./models/DeepStruc |
| -n or --num_samples | Number of samples/structures generated for each unique PDF. | int | -n 10 |
| -s or --sigma | Sample to '-s' sigma in the normal distribution. | float | -s 7 |
| -p or --plot_sampling | Plots sampled structures on top of DeepStruc training data. Model must be DeepStruc. | bool | -p True |
| -g or --save_path | Path to directory where predictions will be saved. | str | -g ./best_preds |
| -i or --index_plot | Highlights a specific reconstruction in the latent space. '--data' must be a specific file (not a directory) and '--plot_sampling True' must be set. | int | -i 4 |
| -P or --plot_data | If True, the first loaded PDF is plotted and shown after normalization. | bool | -P True |
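
For example, generating 10 candidate structures for a single experimental PDF with the DeepStruc model and saving the predictions could look like this; the paths and values are illustrative and taken from the examples above:

python predict.py -d data/experimental_PDFs/JQ_S1.gr -m ./models/DeepStruc -n 10 -s 7 -g ./best_preds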

Authors

Andy S. Anker1
Emil T. S. Kjær1
Marcus N. Weng1
Simon J. L. Billinge2, 3
Raghavendra Selvan4, 5
Kirsten M. Ø. Jensen1

1 Department of Chemistry and Nano-Science Center, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
2 Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027, USA.
3 Condensed Matter Physics and Materials Science Department, Brookhaven National Laboratory, Upton, NY 11973, USA.
4 Department of Computer Science, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
5 Department of Neuroscience, University of Copenhagen, 2200 Copenhagen N, Denmark.

Should there be any questions, desired improvements or bugs, please contact us on GitHub or through email: [email protected] or [email protected].

Cite

If you use our code or our results, please consider citing our papers. Thanks in advance!

@article{kjær2022DeepStruc,
title={DeepStruc: Towards structure solution from pair distribution function data using deep generative models},
author={Emil T. S. Kjær and Andy S. Anker and Marcus N. Weng and Simon J. L. Billinge and Raghavendra Selvan and Kirsten M. Ø. Jensen},
year={2022}}
@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

Acknowledgments

Our code is developed based on the following publication:

@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

License

This project is licensed under the Apache License Version 2.0, January 2004 - see the LICENSE file for details.
