Public Implementation of ChIRo from "Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations"

Last update: Dec 05, 2022

Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations

This directory contains the model architectures and experimental setups used for ChIRo, SchNet, DimeNet++, and SphereNet on the four tasks considered in the preprint:

Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations

These four tasks are:

1. Contrastive learning to cluster conformers of different stereoisomers in a learned latent space
2. Classification of chiral centers as R/S
3. Classification of the sign (+/-; l/d) of rotated circularly polarized light
4. Ranking enantiomers by their docking scores in an enantiosensitive protein pocket

The exact data splits used for tasks (1), (2), and (4) can be downloaded from:

https://figshare.com/s/e23be65a884ce7fc8543

See the appendix of "Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations" for details on how the datasets for task (3) were extracted and filtered from the commercial Reaxys database.

This directory is organized as follows:

Subdirectory model/ contains the implementation of ChIRo.
- model/alpha_encoder.py contains the network architecture of ChIRo
- model/embedding_functions.py contains the featurization of the input conformers (RDKit mol objects) for ChIRo.
- model/datasets_samplers.py contains the PyTorch / PyTorch Geometric data samplers used for sampling conformers in each training batch.
- model/train_functions.py and model/train_models.py contain supporting training/inference loops for each experiment with ChIRo.
- model/optimization_functions.py contains the loss functions used in the experiments with ChIRo.
- Subdirectory model/gnn_3D/ contains the implementations of SchNet, DimeNet++, and SphereNet used for each experiment.
  - model/gnn_3D/schnet.py contains the publicly available code for SchNet, with adaptations for readout.
  - model/gnn_3D/dimenet_pp.py contains the publicly available code for DimeNet++, with adaptations for readout.
  - model/gnn_3D/spherenet.py contains the publicly available code for SphereNet, with adaptations for readout.
  - model/gnn_3D/train_functions.py and model/gnn_3D/train_models.py contain the training/inference loops for each experiment with SchNet, DimeNet++, or SphereNet.
  - model/gnn_3D/optimization_functions.py contains the loss functions used in the experiments with SchNet, DimeNet++, or SphereNet.
Subdirectory params_files/ contains the hyperparameters used to define exact network initializations for ChIRo, SchNet, DimeNet++, and SphereNet for each experiment. The parameter .json files are specified with random seed = 1 and, for the l/d classification task, the first fold of cross validation. For the experiments reported in the paper, we use random seeds = 1, 2, 3 when repeating experiments across three training/test trials.
Subdirectory training_scripts/ contains the Python scripts to run each of the four experiments, for each of the four 3D models ChIRo, SchNet, DimeNet++, and SphereNet. Before running each experiment, move the corresponding training script to the parent directory.
Subdirectory hyperopt/ contains hyperparameter optimization scripts for ChIRo using Ray Tune.
Subdirectory experiment_analysis/ contains Jupyter notebooks for analyzing results of each experiment.
Subdirectory paper_results/ contains the parameter files, model parameter dictionaries, and loss curves for each experiment reported in the paper.
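
The staging step described for training_scripts/ (placing the chosen script in the parent directory before running it) could be scripted roughly as follows. This is a minimal sketch and not part of the repository: the helper name stage_training_script is hypothetical, and it copies the script instead of moving it so that training_scripts/ stays intact.

```python
import shutil
from pathlib import Path


def stage_training_script(script_name, repo_root="."):
    """Copy training_scripts/<script_name> into repo_root and return the new path.

    Hypothetical helper: the README asks for the script to be moved; copying
    is used here so the original file remains in training_scripts/.
    """
    root = Path(repo_root)
    source = root / "training_scripts" / script_name
    if not source.is_file():
        raise FileNotFoundError(f"no such training script: {source}")
    destination = root / script_name
    shutil.copy2(source, destination)
    return destination
```

For example, stage_training_script("training_binary_ranking.py") run from the repository root would place the docking script used in the example command next to params_files/.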

To run each experiment, first create a conda environment with the following dependencies:

python = 3.8.6
pytorch = 1.7.0
torchaudio = 0.7.0
torchvision = 0.8.1
torch-geometric = 1.6.3
torch-cluster = 1.5.8
torch-scatter = 2.0.5
torch-sparse = 0.6.8
torch-spline-conv = 1.2.1
numpy = 1.19.2
pandas = 1.1.3
rdkit = 2020.09.4
scikit-learn = 0.23.2
matplotlib = 3.3.3
scipy = 1.5.2
sympy = 1.8
tqdm = 4.58.0
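
Once the environment exists, it can be useful to confirm that the installed versions match the pins above. The following is a minimal sketch using only the standard library. The names PINNED, installed_version, and find_mismatches are hypothetical, and only a subset of the pins is included because some packages may be registered under a different distribution name than the one listed here (for example, pytorch is typically installed as torch).

```python
from importlib import metadata

# Subset of the pins listed above; extend with the remaining packages as needed.
PINNED = {
    "numpy": "1.19.2",
    "pandas": "1.1.3",
    "scikit-learn": "0.23.2",
    "scipy": "1.5.2",
    "sympy": "1.8",
    "tqdm": "4.58.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a tuple (expected, found); found is None for missing packages."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches
```

Calling find_mismatches(PINNED) inside the activated environment returns an empty dictionary when every listed package matches its pin.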

Then, download the datasets (with exact training/validation/test splits) from https://figshare.com/s/e23be65a884ce7fc8543 and place them in a new directory final_data_splits/
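
Before launching a run, a quick check that final_data_splits/ exists and is populated can save a failed start. This is a sketch under the assumption that the downloaded files sit directly inside that directory. The function name list_split_files is hypothetical, and no particular file names are assumed.

```python
from pathlib import Path


def list_split_files(data_dir="final_data_splits"):
    """Return the sorted file names in data_dir.

    Raises FileNotFoundError if the directory is missing or holds no files.
    """
    path = Path(data_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"expected a directory at {path}")
    names = sorted(p.name for p in path.iterdir() if p.is_file())
    if not names:
        raise FileNotFoundError(f"{path} exists but contains no files")
    return names
```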

You may then run each experiment by calling:

python training_{experiment}_{model}.py params_files/params_{experiment}_{model}.json {path_to_results_directory}/

For instance, you can run the docking experiment for ChIRo with a random seed of 1 (editable in the params .json file) by calling:

python training_binary_ranking.py params_files/params_binary_ranking_ChIRo.json results_binary_ranking_ChIRo/

After training, this will create a results directory containing model checkpoints, best model parameter dictionaries, and results on the test set (if applicable).
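
To repeat an experiment with random seeds 1, 2, and 3, one option is to generate a copy of the params file per seed and build one command per copy. The sketch below is illustrative only: the helpers write_seed_variants and build_command are hypothetical, and the key name "random_seed" is an assumption, so check the params .json file for the key it actually uses.

```python
import json
from pathlib import Path


def write_seed_variants(params_path, seeds, out_dir, seed_key="random_seed"):
    """Write one copy of the params file per seed and return the new paths.

    seed_key is an ASSUMED key name; the real params files may use another.
    """
    params_path = Path(params_path)
    base = json.loads(params_path.read_text())
    if seed_key not in base:
        raise KeyError(f"{seed_key!r} not found in {params_path}")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for seed in seeds:
        variant = dict(base)  # shallow copy; only the seed entry changes
        variant[seed_key] = seed
        target = out_dir / f"{params_path.stem}_seed{seed}.json"
        target.write_text(json.dumps(variant, indent=2))
        written.append(target)
    return written


def build_command(script, params_file, results_dir):
    """Return the argv list for one run, following the calling pattern above."""
    results = str(results_dir).rstrip("/") + "/"
    return ["python", str(script), str(params_file), results]
```

Each returned command list can be passed to subprocess.run, with a separate results directory per seed so that checkpoints from different trials do not overwrite each other.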
