Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks

License: MIT

Overview

This repository performs unsupervised phone and word segmentation on speech data. The experiments are described in:

  • H. Kamper, "Word segmentation on discovered phone units with dynamic programming and self-supervised scoring," arXiv preprint arXiv:2202.11929, 2022. [arXiv]
  • H. Kamper and B. van Niekerk, "Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks," in Proc. Interspeech, 2021. [arXiv]

Please cite these papers if you use the code.

Dependencies

Dependencies can be installed in a conda environment:

conda env create -f environment.yml
conda activate dpdp

This does not include wordseg, which should be installed in its own environment according to its documentation.

Install the DPDP AE-RNN package:

git clone https://github.com/kamperh/dpdp_aernn.git ../dpdp_aernn

Minimal usage example: DPDP AE-RNN with DPDP CPC+K-means on Buckeye

In the sections that follow I give more complete details. In this section I briefly outline the sequence of steps that should reproduce the DPDP system results on Buckeye given in the paper. To apply the approach on other datasets you will need to carefully work through the subsequent sections, but I hope that this current section helps you to get going.

  1. Obtain the ground truth alignments for Buckeye provided in buckeye.zip as part of this release. Extract it into data/. There should now be a data/buckeye/ directory with the alignments.

  2. Extract CPC+K-means features for Buckeye. Do this by following the steps in the CPC-big subsection below.

  3. Perform acoustic unit discovery using DPDP CPC+K-means:

    ./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 \
        --input_format=txt --algorithm=dp_penalized cpc_big buckeye val
    
  4. Perform word segmentation on the discovered units using the DPDP AE-RNN:

    ./vq_wordseg.py --algorithm=dpdp_aernn \
        cpc_big buckeye val phoneseg_dp_penalized
    
  5. Evaluate the segmentation:

    ./eval_segmentation.py cpc_big buckeye val \
        wordseg_dpdp_aernn_dp_penalized
    

The result should correspond approximately to the following on the Buckeye validation data:

---------------------------------------------------------------------------
Word boundaries:
Precision: 35.80%
Recall: 36.30%
F-score: 36.05%
OS: 1.40%
R-value: 45.13%
---------------------------------------------------------------------------
Word token boundaries:
Precision: 23.93%
Recall: 24.23%
F-score: 24.08%
OS: 1.24%
---------------------------------------------------------------------------
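
The derived scores in this listing follow from precision and recall alone. The sketch below assumes the usual definitions from the boundary-detection literature (F-score as the harmonic mean, over-segmentation as recall/precision - 1, and the R-value built from those two); `boundary_scores` is a hypothetical helper and eval_segmentation.py may differ in detail.

```python
import math

def boundary_scores(precision, recall):
    """Derive F-score, over-segmentation (OS) and R-value from precision and recall.

    Inputs and outputs are fractions in [0, 1] (OS can be negative).
    """
    f_score = 2 * precision * recall / (precision + recall)
    over_seg = recall / precision - 1
    # R-value: distance from the ideal point (recall = 1, OS = 0).
    r1 = math.sqrt((1 - recall) ** 2 + over_seg ** 2)
    r2 = (-over_seg + recall - 1) / math.sqrt(2)
    r_value = 1 - (abs(r1) + abs(r2)) / 2
    return f_score, over_seg, r_value

# Word boundary precision and recall from the listing above.
f, os_, rval = boundary_scores(0.3580, 0.3630)
print(f"F-score: {100 * f:.2f}%  OS: {100 * os_:.2f}%  R-value: {100 * rval:.2f}%")
```

With the word boundary precision and recall above, this gives the listed F-score, OS and R-value to two decimals. Because the inputs are themselves rounded, the last digit can differ for other rows.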

Example encodings: CPC-big features on Buckeye

Install the ZeroSpeech 2021 baseline system from my fork by following the steps in the installation section of the readme. Make sure that vqwordseg/ (this repository) and zerospeech2021_baseline/ are in the same directory.

From the vqwordseg/ directory, move to the ZeroSpeech 2021 directory:

cd ../zerospeech2021_baseline/

Extract individual Buckeye wav files:

./get_buckeye_wavs.py ../datasets/buckeye/

The argument should point to your local copy of Buckeye.

Encode the Buckeye data:

conda activate zerospeech2021_baseline
./encode.py wav/buckeye/val/ exp/buckeye/val/
./encode.py wav/buckeye/test/ exp/buckeye/test/

Move back and deactivate the environment:

cd ../vqwordseg/
conda deactivate

Dataset format and directory structure

This code should be usable with any dataset given that alignments and VQ encodings are provided.

For evaluation you need the ground truth phone and (optionally) word boundaries. These should be stored in the directories data/<dataset>/phone_intervals/ and data/<dataset>/word_intervals/ using the following filename format:

<speaker>_<utterance_id>_<start_frame>-<end_frame>.txt

E.g., data/buckeye/phone_intervals/s01_01a_003222-003256.txt could consist of:

0 5 hh
5 10 iy
10 15 jh
15 19 ih
19 27 s
27 34 s
34 46 iy
46 54 m
54 65 z
65 69 l
69 78 ay
78 88 k

The duration-penalized dynamic programming (DPDP) algorithms operate on the output of vector quantized (VQ) models. The (pre-)quantized representations and code indices should be provided in the exp/ directory. These are used as input to the VQ-segmentation algorithms; the segmented output is also produced in exp/.

As an example, the directory exp/vqcpc/buckeye/ should contain a file embedding.npy, which is the codebook matrix for a VQ-CPC model trained on Buckeye. This matrix will have the shape [n_codes, code_dim]. The directory exp/vqcpc/buckeye/val/ needs to contain at least subdirectories for the encoded validation set:

  • prequant/
  • indices/

The prequant/ directory contains the encodings from the VQ model before quantization. These encodings are given as text files with an embedding per line, e.g. the first three lines of prequant/s01_01a_003222-003256.txt could be:

 0.1601707935333252 -0.0403369292616844  0.4687763750553131 ...
 0.4489639401435852  1.3353070020675659  1.0353083610534668 ...
-1.0552909374237061  0.6382007002830505  4.5256714820861816 ...

The indices/ directory contains the code indices to which the auxiliary embeddings are actually mapped, i.e. which of the codes in embedding.npy are closest (under some metric) to the pre-quantized embedding. The code indices are again given as text files, with each index on a new line, e.g. the first three lines of indices/s01_01a_003222-003256.txt could be:

423
381
119
...
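
The relationship between prequant/, indices/ and embedding.npy can be sketched as a nearest-neighbour lookup. Squared Euclidean distance is an assumption here (the metric depends on the VQ model), and `nearest_codes` is a hypothetical helper.

```python
import numpy as np

def nearest_codes(prequant, codebook):
    """Map each pre-quantized frame to the index of its closest codebook entry.

    prequant: [n_frames, code_dim]; codebook: [n_codes, code_dim].
    Squared Euclidean distance is an assumption, not necessarily the model's metric.
    """
    if prequant.shape[1] != codebook.shape[1]:
        raise ValueError("prequant and codebook must share code_dim")
    diffs = prequant[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=-1).argmin(axis=1)

# Toy data; real inputs would come from np.loadtxt on a prequant/*.txt file
# and np.load on embedding.npy.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
prequant = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])
print(nearest_codes(prequant, codebook))  # prints [0 1 2]
```

The explicit shape check guards against passing a codebook that is stored transposed, i.e. as [code_dim, n_codes].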

Any VQ model can be used. In the preceding section I gave an example of using CPC-big with K-means; in the section below I give an example of how VQ-VAE and VQ-CPC can be used to obtain codes for the Buckeye dataset. In the subsequent section DPDP segmentation is described.

Example encodings: VQ-VAE and VQ-CPC on Buckeye

You can obtain the VQ input representations using the file format indicated above. As an example, here I describe how I did it for the Buckeye data.

First the following repositories need to be installed with their dependencies:

  • VectorQuantizedCPC
  • VectorQuantizedVAE

If you made sure that the dependencies are satisfied, these packages can be installed locally by running ./install_local.sh.

Change directory to ../VectorQuantizedCPC and then perform the following steps there. Pre-process audio and extract log-Mel spectrograms:

./preprocess.py in_dir=../datasets/buckeye/ dataset=buckeye

Encode the data and write it to the vqwordseg/exp/ directory. This should be performed for all splits (train, val and test):

./encode.py checkpoint=checkpoints/cpc/english2019/model.ckpt-22000.pt \
    split=val \
    save_indices=True \
    save_auxiliary=True \
    save_embedding=../vqwordseg/exp/vqcpc/buckeye/embedding.npy \
    out_dir=../vqwordseg/exp/vqcpc/buckeye/val/ \
    dataset=buckeye

Change directory to ../VectorQuantizedVAE and then run the following there. The audio can be pre-processed again (as above), or alternatively you can simply link to the audio from VectorQuantizedCPC/:

ln -s ../VectorQuantizedCPC/datasets/ .

Encode the data and write it to the vqwordseg/exp/ directory. This should be performed for all splits (train, val and test):

# Buckeye
./encode.py checkpoint=checkpoints/2019english/model.ckpt-500000.pt \
    split=train \
    save_indices=True \
    save_auxiliary=True \
    save_embedding=../vqwordseg/exp/vqvae/buckeye/embedding.npy \
    out_dir=../vqwordseg/exp/vqvae/buckeye/train/ \
    dataset=buckeye

You can delete all the created auxiliary_embedding1/ and codes/ directories since these are not used for segmentation.

Phone segmentation

DP penalized segmentation:

# Buckeye (GMM)
./vq_phoneseg.py --downsample_factor 1 --input_format=npy \
    --algorithm=dp_penalized --dur_weight 0.001 \
    gmm buckeye val --output_tag phoneseg_merge

# Buckeye (VQ-CPC)
./vq_phoneseg.py --input_format=txt --algorithm=dp_penalized \
    vqcpc buckeye val

# Buckeye (VQ-VAE)
./vq_phoneseg.py vqvae buckeye val

# Buckeye (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big buckeye val

# Buckeye (CPC-big) HSMM
./vq_phoneseg.py --algorithm dp_penalized_hsmm --downsample_factor 1 \
    --dur_weight 1.0 --model_eos --dur_weight_func neg_log_gamma \
    --output_tag=phoneseg_hsmm_tune cpc_big buckeye val

# Buckeye Felix split (CPC-big) HSMM
./vq_phoneseg.py --algorithm dp_penalized_hsmm --downsample_factor 1 \
    --dur_weight 1.0 --model_eos --dur_weight_func neg_log_gamma \
    --output_tag=phoneseg_hsmm_tune cpc_big buckeye_felix test

# Xitsonga (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big xitsonga train

# Buckeye (XLSR)
./vq_phoneseg.py --downsample_factor 2 --dur_weight 2500 \
    --input_format=npy --algorithm=dp_penalized xlsr buckeye val

# Buckeye (ResDAVEnet-VQ)
./vq_phoneseg.py --downsample_factor 2 --dur_weight 3 --input_format=txt \
    --algorithm=dp_penalized resdavenet_vq buckeye val

# Buckeye (ResDAVEnet-VQ3)
./vq_phoneseg.py --downsample_factor 4 --dur_weight 0.001 \
    --input_format=txt --algorithm=dp_penalized resdavenet_vq_quant3 \
    buckeye val --output_tag=phoneseg_merge

# Buckeye Felix split (VQ-VAE)
./vq_phoneseg.py --output_tag=phoneseg_dp_penalized \
    vqvae buckeye_felix test

# Buckeye Felix split (CPC-big)
./vq_phoneseg.py  --downsample_factor 1 --dur_weight 2 \
    --output_tag=phoneseg_dp_penalized_tune cpc_big buckeye_felix val

# Buckeye Felix split (VQ-VAE) with Poisson duration prior
./vq_phoneseg.py --output_tag=phoneseg_dp_penalized_poisson \
    --dur_weight_func neg_log_poisson --dur_weight 2 \
    vqvae buckeye_felix val

# Buckeye (VQ-VAE) with Gamma duration prior
./vq_phoneseg.py --output_tag=phoneseg_dp_penalized_gamma \
    --dur_weight_func neg_log_gamma --dur_weight 15 vqvae buckeye val

# ZeroSpeech'17 English (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big zs2017_en train

# ZeroSpeech'17 French (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big zs2017_fr train

# ZeroSpeech'17 Mandarin (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big zs2017_zh train

# ZeroSpeech'17 French (XLSR)
./vq_phoneseg.py --downsample_factor 2 --dur_weight 1500 \
    --input_format=npy --algorithm=dp_penalized xlsr zs2017_fr train

# ZeroSpeech'17 Mandarin (XLSR)
./vq_phoneseg.py --downsample_factor 2 --dur_weight 2500 \
    --input_format=npy --algorithm=dp_penalized xlsr zs2017_zh train

# ZeroSpeech'17 Lang2 (CPC-big)
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big zs2017_lang2 train
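
For intuition about what the dp_penalized commands above compute, here is a minimal sketch of duration-penalized segmentation. It is not the code in vq_phoneseg.py. It assumes squared Euclidean frame-to-code distances and a linear penalty of -(length - 1) scaled by dur_weight, so longer segments are rewarded. The names `dp_penalized` and `n_max` are hypothetical, and the Poisson and Gamma duration priors are not reproduced.

```python
import numpy as np

def dp_penalized(prequant, codebook, dur_weight=2.0, n_max=20):
    """Segment frames so that each segment is assigned a single code.

    Returns (boundaries, codes): exclusive segment end frames and one code per segment.
    """
    n_frames, n_codes = prequant.shape[0], codebook.shape[0]
    # dists[t, k]: squared Euclidean distance from frame t to code k.
    dists = ((prequant[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # cum[t]: distances summed over frames [0, t), so a span cost is a difference.
    cum = np.vstack([np.zeros((1, n_codes)), np.cumsum(dists, axis=0)])

    best = np.full(n_frames + 1, np.inf)       # best[t]: minimum cost of frames [0, t)
    best[0] = 0.0
    back = np.zeros(n_frames + 1, dtype=int)   # start frame of the last segment
    code = np.zeros(n_frames + 1, dtype=int)   # code of the last segment

    for t in range(1, n_frames + 1):
        for s in range(max(0, t - n_max), t):
            span = cum[t] - cum[s]             # cost of frames [s, t) under each code
            k = int(np.argmin(span))
            cost = best[s] + span[k] - dur_weight * (t - s - 1)
            if cost < best[t]:
                best[t], back[t], code[t] = cost, s, k

    # Backtrack from the final frame to recover the segmentation.
    boundaries, codes = [], []
    t = n_frames
    while t > 0:
        boundaries.append(t)
        codes.append(int(code[t]))
        t = int(back[t])
    return boundaries[::-1], codes[::-1]

toy_codebook = np.array([[0.0], [1.0]])
toy_frames = np.array([[0.1], [-0.1], [0.0], [0.05], [0.9], [1.1], [1.0], [0.95]])
print(dp_penalized(toy_frames, toy_codebook, dur_weight=0.1))  # ([4, 8], [0, 1])
```

In this sketch a larger dur_weight merges more frames into each segment, while dur_weight=0 reduces to assigning every frame to its nearest code.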

DP penalized N-seg. segmentation:

# Buckeye Felix split (VQ-VAE)
./vq_phoneseg.py --algorithm=dp_penalized_n_seg \
    --n_frames_per_segment=3 --n_min_segments=3 vqvae buckeye_felix test

Evaluate segmentation:

# Buckeye (VQ-VAE)
./eval_segmentation.py vqvae buckeye val phoneseg_dp_penalized_n_seg

# Buckeye (CPC-big)
./eval_segmentation.py cpc_big buckeye val phoneseg_dp_penalized

Word segmentation

Word segmentation is performed on the segmented phone sequences.

Adaptor grammar word segmentation:

conda activate wordseg
# Buckeye (VQ-VAE)
./vq_wordseg.py --algorithm=ag vqvae buckeye val phoneseg_dp_penalized

# Buckeye (CPC-big)
./vq_wordseg.py --algorithm=ag cpc_big buckeye val phoneseg_dp_penalized

DPDP AE-RNN word segmentation:

# Buckeye (GMM)
./vq_wordseg.py --dur_weight=6 --algorithm=dpdp_aernn \
    gmm buckeye val phoneseg_dp_penalized

# Buckeye (CPC-big)
./vq_wordseg.py --algorithm=dpdp_aernn \
    cpc_big buckeye val phoneseg_dp_penalized

Evaluate the segmentation:

# Buckeye (VQ-VAE)
./eval_segmentation.py vqvae buckeye val wordseg_ag_dp_penalized

# Buckeye (CPC-big)
./eval_segmentation.py cpc_big buckeye val wordseg_ag_dp_penalized

Evaluate the segmentation with the ZeroSpeech tools:

./intervals_to_zs.py cpc_big zs2017_zh train wordseg_segaernn_dp_penalized
cd ../zerospeech2017_eval/
ln -s \
    /media/kamperh/endgame/projects/stellenbosch/vqseg/vqwordseg/exp/cpc_big/zs2017_zh/train/wordseg_dpdp_aernn_dp_penalized/clusters.txt \
    2017/track2/mandarin.txt
conda activate zerospeech2020_updated
zerospeech2020-evaluate 2017-track2 . -l mandarin -o mandarin.json

Analysis

Print the word clusters:

./clusters_print.py cpc_big buckeye val wordseg_ag_dp_penalized

Listen to segmented codes:

./cluster_wav.py vqvae buckeye val phoneseg_dp_penalized 343
./cluster_wav.py vqvae buckeye val wordseg_tp_dp_penalized 486_
./cluster_wav.py cpc_big buckeye val phoneseg_dp_penalized 50

This requires sox and that you change the path at the beginning of cluster_wav.py. For ZeroSpeech'17 data, use cluster_wav_zs2017.py instead.

Synthesize an utterance:

./indices_to_txt.py vqvae buckeye val phoneseg_dp_penalized \
    s18_03a_025476-025541
cd ../VectorQuantizedVAE
./synthesize_codes.py checkpoints/2019english/model.ckpt-500000.pt \
    ../vqwordseg/s18_03a_025476-025541.txt
cd -

Complete example on ZeroSpeech data

An example of phone and word segmentation on the surprise language.

Encode data:

cd ../zerospeech2021_baseline
conda activate pytorch
./get_wavs.py path_to_data/datasets/zerospeech2020/2020/2017/ \
    zs2017_lang1 train

conda activate zerospeech2021_baseline
./encode.py wav/zs2017_lang1/train/ exp/zs2017_lang1/train/

Phone segmentation:

cd ../vqwordseg
conda activate pytorch
# Create links in exp/cpc_big/
./vq_phoneseg.py --downsample_factor 1 --dur_weight 2 --input_format=txt \
    --algorithm=dp_penalized cpc_big zs2017_lang1 train
./cluster_wav_zs2017.py cpc_big zs2017_lang1 train phoneseg_dp_penalized 3

Word segmentation:

./vq_wordseg.py --algorithm=dpdp_aernn cpc_big zs2017_lang1 train \
    phoneseg_dp_penalized
./cluster_wav_zs2017.py cpc_big zs2017_lang1 train \
    wordseg_dpdp_aernn_dp_penalized 33_10_11_14_1_34_

Convert to ZeroSpeech format:

./intervals_to_zs.py cpc_big zs2017_lang1 train \
    wordseg_dpdp_aernn_dp_penalized

About the Buckeye data splits

The particular split of Buckeye that I use in this repository is a legacy split with a somewhat complicated history. In short, the test set is exactly the same one used in the ZeroSpeech 2015 challenge. The remaining speakers were then used for a validation set and an additional held-out test set. This additional test set has the same number of speakers as the validation set, but most papers just report results on the ZeroSpeech 2015 test set.

The result is the following split of Buckeye, according to speaker:

  • Train (English1 in my thesis, devpart1 in other repos): s02, s03, s04, s05, s06, s08, s10, s11, s12, s13, s16, s38.
  • Validation (devpart2 in other repos): s17, s18, s19, s22, s34, s37, s39, s40.
  • Test (English2 in my thesis, ZS in other repos): s01, s20, s23, s24, s25, s26, s27, s29, s30, s31, s32, s33.
  • Additional test: s07, s09, s14, s15, s21, s28, s35, s36.
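
The split above can be written down as a lookup table, which is convenient when sorting utterance files by speaker. This is only a sketch: the dictionary keys and the helper `split_of` are hypothetical names, not identifiers used in this repository.

```python
BUCKEYE_SPLITS = {
    "train": "s02 s03 s04 s05 s06 s08 s10 s11 s12 s13 s16 s38".split(),
    "val": "s17 s18 s19 s22 s34 s37 s39 s40".split(),
    "test": "s01 s20 s23 s24 s25 s26 s27 s29 s30 s31 s32 s33".split(),
    "additional_test": "s07 s09 s14 s15 s21 s28 s35 s36".split(),
}

SPEAKER_TO_SPLIT = {
    speaker: split
    for split, speakers in BUCKEYE_SPLITS.items()
    for speaker in speakers
}

def split_of(utterance_name):
    """Look up the split from the '<speaker>_...' prefix of an utterance name."""
    speaker = utterance_name.split("_")[0]
    return SPEAKER_TO_SPLIT[speaker]

# Every speaker appears in exactly one split.
assert len(SPEAKER_TO_SPLIT) == sum(len(s) for s in BUCKEYE_SPLITS.values())
print(split_of("s18_03a_025476-025541"))  # val
```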

I first used this in (Kamper et al., 2017) and since then in a number of follow-up papers. Others have also used this split, e.g. (Drexler and Glass, 2017), (Bhati et al., 2021), and (Peng and Harwath, 2022, https://arxiv.org/abs/2203.15081).

Sets used in this repo. In this repo I only make use of the validation and test sets above, although features are extracted for the training set. See the experimental setup section of the paper.

The Kreuk split. Note that Kreuk et al. (2020) use a different split which is also used by others. So in the section in the paper where I compare to their approach, I use their split:

  • Train: All Buckeye speakers not below.
  • Validation: s25, s36, s39, s40.
  • Test: s03, s07, s31, s34.

This split is not included in this repository because it made things too cluttered. Note that in the paper I again don't use the Kreuk training set: I only report results on the test data when comparing to their models.

Reducing a codebook using clustering

If a codebook is very large, the codes could be reduced by clustering. The reduced codebook should be saved in a new model directory, and links to the original pre-quantized features should be created.

As an example, in cluster_codebook.ipynb, the ResDAVEnet-VQ codebook is loaded and reduced to 50 codes. The original codebook had 1024 codes, but only 498 of these were actually used; these are reduced to 50. The resulting codebook is saved to exp/resdavenet_vq_clust50/buckeye/embedding.npy. The pre-quantized features are linked to the original version in exp/resdavenet_vq/. The indices from the original model shouldn't be linked, since these don't match the new codebook (but an indices file isn't necessary for running many of the phone segmentation algorithms).
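
A reduction of this kind could look roughly like the sketch below. The clustering method used in the notebook is not stated here, so plain K-means on the used codes is an assumption, and `reduce_codebook` is a hypothetical helper rather than the notebook's code.

```python
import numpy as np

def reduce_codebook(codebook, used_indices, n_clusters, n_iter=20, seed=0):
    """Cluster the codes that actually occur into n_clusters new codes (K-means)."""
    used = np.unique(used_indices)        # drop codes that never occur
    points = codebook[used]
    if n_clusters > len(points):
        raise ValueError("n_clusters exceeds the number of used codes")
    rng = np.random.default_rng(seed)
    # Initialise the centres with randomly chosen used codes.
    centres = points[rng.choice(len(points), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = points[assign == k]
            if len(members) > 0:          # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    return centres

# Toy example: 16 codes of dimension 3, of which only 7 are used.
toy_codebook = np.random.default_rng(1).normal(size=(16, 3))
toy_indices = [0, 1, 2, 3, 5, 5, 7, 9]
reduced = reduce_codebook(toy_codebook, toy_indices, n_clusters=3)
print(reduced.shape)  # (3, 3)
```

The returned matrix has the same [n_codes, code_dim] layout as embedding.npy, so it could be written with np.save to the new model directory.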

Old work-flow (deprecated)

  1. Extract CPC+K-means features in ../zerospeech2021_baseline/.
  2. Perform phone segmentation here using vq_phoneseg.py.
  3. Move to ../seg_aernn/notebooks/ and perform word segmentation.
  4. Move back here and evaluate the segmentation using eval_segmentation.py.
  5. For ZeroSpeech systems, the evaluation is done in ../zerospeech2017_eval/.

Disclaimer

The code provided here is not pretty. But research should be reproducible. I provide no guarantees with the code, but please let me know if you have any problems, find bugs or have general comments.
