Efficient Training of Audio Transformers with Patchout

Overview

PaSST: Efficient Training of Audio Transformers with Patchout

This is the implementation for Efficient Training of Audio Transformers with Patchout

Patchout significantly reduces the training time and GPU memory requirements to train transformers on audio spectrograms, while improving their performance.

Patchout works by dropping out some of the input patches during training. In either a unstructured way (randomly, similar to dropout), or entire time-frames or frequency bins of the extracted patches (similar to SpecAugment), which corresponds to rows/columns in step 3 of the figure below.

PaSST architecture

Setting up the experiments environment

This repo uses forked versions of sacred for configuration and logging, and pytorch-lightning for training.

For setting up Mamba is recommended and faster then conda:

conda install mamba -n base -c conda-forge

Now you can import the environment from environment.yml

mamba env create -f environment.yml

Now you have an environment named ba3l. Now install the forked versions of sacred and pl-lightning and ba3l.

# dependencies
conda activate ba3l
pip install https://github.com/kkoutini/sacred/archive/ba3l.zip
pip install https://github.com/kkoutini/pytorch-lightning/archive/ba3l.zip
pip install https://github.com/kkoutini/ba3l/archive/master.zip

In order to check the environment we used in our runs, please check the environment.yml and pip_list.txt files. Which were exported using:

conda env export --no-builds | grep -v "prefix" > environment.yml
pip list > pip_list.txt

Getting started

Each dataset has an experiment file such as ex_audioset.py and ex_openmic.py and a dataset folder with a readme file. In general, you can prob the experiment file for help:

python ex_audioset.py help

you can override any of the configuration using the sacred syntax. In order to see the available options either use omniboard or use:

 python ex_audioset.py print_config

There are many pre-defined configuration options in config_updates.py. These include different models, setups etc... You can list these configurations with:

python ex_audioset.py print_named_configs

The overall configurations looks like this:

  ...
  seed = 542198583                  # the random seed for this experiment
  slurm_job_id = ''
  speed_test_batch_size = 100
  swa = True
  swa_epoch_start = 50
  swa_freq = 5
  use_mixup = True
  warm_up_len = 5
  weight_decay = 0.0001
  basedataset:
    base_dir = 'audioset_hdf5s/'     # base directory of the dataset, change it or make a link
    eval_hdf5 = 'audioset_hdf5s/mp3/eval_segments_mp3.hdf'
    wavmix = 1
    ....
    roll_conf:
      axis = 1
      shift = None
      shift_range = 50
  datasets:
    test:
      batch_size = 20
      dataset = {CMD!}'/basedataset.get_test_set'
      num_workers = 16
      validate = True
    training:
      batch_size = 12
      dataset = {CMD!}'/basedataset.get_full_training_set'
      num_workers = 16
      sampler = {CMD!}'/basedataset.get_ft_weighted_sampler'
      shuffle = None
      train = True
  models:
    mel:
      freqm = 48
      timem = 192
      hopsize = 320
      htk = False
      n_fft = 1024
      n_mels = 128
      norm = 1
      sr = 32000
      ...
    net:
      arch = 'passt_s_swa_p16_128_ap476'
      fstride = 10
      in_channels = 1
      input_fdim = 128
      input_tdim = 998
      n_classes = 527
      s_patchout_f = 4
      s_patchout_t = 40
      tstride = 10
      u_patchout = 0
      ...
  trainer:
    accelerator = None
    accumulate_grad_batches = 1
    amp_backend = 'native'
    amp_level = 'O2'
    auto_lr_find = False
    auto_scale_batch_size = False
    ...

There are many things that can be updated from the command line. In short:

  • All the configuration options under trainer are pytorch lightning trainer api. For example, to turn off cuda benchmarking add trainer.benchmark=False to the command line.
  • models.net are the PaSST (or the chosen NN) options.
  • models.mel are the preprocessing options (mel spectrograms).

Training on Audioset

Download and prepare the dataset as explained in the audioset page The base PaSST model can be trained for example like this:

python ex_audioset.py with trainer.precision=16  models.net.arch=passt_deit_bd_p16_384 -p -m mongodb_server:27000:audioset21_balanced -c "PaSST base"

For example using only unstructured patchout of 400:

python ex_audioset.py with trainer.precision=16  models.net.arch=passt_deit_bd_p16_384  models.net.u_patchout=400  models.net.s_patchout_f=0 models.net.s_patchout_t=0 -p -m mongodb_server:27000:audioset21_balanced -c "Unstructured PaSST base"

Multi-gpu training can be enabled by setting the environment variable DDP, for example with 2 gpus:

 DDP=2 python ex_audioset.py with trainer.precision=16  models.net.arch=passt_deit_bd_p16_384 -p -m mongodb_server:27000:audioset21_balanced -c "PaSST base 2 GPU"

Pre-trained models

Please check the releases page, to download pre-trained models. In general, you can get a pretrained model on Audioset using

from models.passt import get_model
model  = get_model(arch="passt_s_swa_p16_128_ap476", pretrained=True, n_classes=527, in_channels=1,
                   fstride=10, tstride=10,input_fdim=128, input_tdim=998,
                   u_patchout=0, s_patchout_t=40, s_patchout_f=4)

this will get automatically download pretrained PaSST on audioset with with mAP of 0.476. the model was trained with s_patchout_t=40, s_patchout_f=4 but you can change these to better fit your task/ computational needs.

There are several pretrained models availble with different strides (overlap) and with/without using SWA: passt_s_p16_s16_128_ap468, passt_s_swa_p16_s16_128_ap473, passt_s_swa_p16_s14_128_ap471, passt_s_p16_s14_128_ap469, passt_s_swa_p16_s12_128_ap473, passt_s_p16_s12_128_ap470. For example, In passt_s_swa_p16_s16_128_ap473: p16 mean patch size is 16x16, s16 means no overlap (stride=16), 128 mel bands, ap473 refers to the performance of this model on Audioset mAP=0.479.

In general, you can get a this pretrained model using:

from models.passt import get_model
passt = get_model(arch="passt_s_swa_p16_s16_128_ap473", fstride=16, tstride=16)

Using the framework, you can evaluate this model using:

python ex_audioset.py evaluate_only with passt_s_swa_p16_s16_128_ap473

Ensemble of these models are provided as well: A large ensemble giving mAP=.4956

python ex_audioset.py evaluate_only with  trainer.precision=16 ensemble_many

An ensemble of 2 models with stride=14 and stride=16 giving mAP=.4858

python ex_audioset.py evaluate_only with  trainer.precision=16 ensemble_s16_14

As well as other ensembles ensemble_4, ensemble_5

Contact

The repo will be updated, in the mean time if you have any questions or problems feel free to open an issue on GitHub, or contact the authors directly.

Comments
  • FSD50K - validating on eval data

    FSD50K - validating on eval data

    Hi! First off, excellent work with the module. It's showing great results so far in my project. I'm having trouble, however, with an experiment. I am trying to fine-tune and train the model on subsets (3k samples for training and validating) and have created hdf5 files for that. The paths in config.basedatasets are corrected for this.

    The problem that I run into is that when I run the command: python ex_fsd50k.py evaluate_only with passt_s_swa_p16_s16_128_ap473 the program uses the evaluation data for validation. I confirmed this by making a change in fsd50k/dataset.py:

    def __len__(self):
        if self.hdf5_file == "audioset_hdf5s/mp3/FSD50K.eval_mp3.hdf":
            return 300
        return self.length
    

    which affects the number of validation batches.

    I really don't understand what is going on. Isn't the model supposed to validate on the validation data?

    Kindest regards, Ludvig.

    opened by Ludvig-Joborn 5
  • No module named 'ba3l.ingredients'

    No module named 'ba3l.ingredients'

    hi, i want to train the PaSST with Audioset But when i runed "ex_audioset.py", i faced error: "No module named 'ba3l.ingredients" I already finished setting up the environment as follow the Readme how can i fix it

    opened by kimsojeong1225 5
  • RuntimeError: The size of tensor a (2055) must match the size of tensor b (99) at non-singleton dimension 3

    RuntimeError: The size of tensor a (2055) must match the size of tensor b (99) at non-singleton dimension 3

    I use a trained model for inference and I encounter this problem when the file length is long. Traceback (most recent call last): File "", line 1, in File "/home/xingyum/anaconda3/envs/ba3l/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(*input, **kwargs) File "/home/xingyum/models/PaSST/output/openmic2008/_None/checkpoints/src/hear21passt/hear21passt/wrapper.py", line 38, in forward x, features = self.net(specs) File "/home/xingyum/anaconda3/envs/ba3l/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(*input, **kwargs) File "/home/xingyum/models/PaSST/output/openmic2008/_None/checkpoints/src/hear21passt/hear21passt/models/passt.py", line 507, in forward x = self.forward_features(x) File "/home/xingyum/models/PaSST/output/openmic2008/_None/checkpoints/src/hear21passt/hear21passt/models/passt.py", line 454, in forward_features x = x + time_new_pos_embed RuntimeError: The size of tensor a (2055) must match the size of tensor b (99) at non-singleton dimension 3 image

    opened by 980202006 3
  • Changing tdim for pretrained model

    Changing tdim for pretrained model

    Thanks for sharing such great work! I want to use the pre-trained model but changing input_tdim is giving an error. My audio clips are relatively small and hence i need a smaller input_tdim. How do I do that? The error I get is due to the pretrained layer's size not equal to the current size of the model(After using input_tdim)

    opened by ranjith1604 3
  • Is it possible to install the passt with python=3.6?

    Is it possible to install the passt with python=3.6?

    Hi, thanks so much for sharing the great work! I'd like to use PaSST for downstream tasks and integrate it into existing conda environment with python=3.6 (it 's kind of painful to upgrade python from 3.6 to 3.7/3.8 due to many inconsistent packages). I know that python>=3.7 is required to install PaSST, but I'm wandering if it's possible to install it with python=3.6?

    opened by Alibabade 2
  • Inference ESC-50 fine-tuned model

    Inference ESC-50 fine-tuned model

    Hello, authors. Thank you for sharing the great work.

    I tried to fine-tuned AudioSet pretrained model passt-s-f128-p16-s10-ap.476-swa.pt on ESC-50 dataset by using ex_esc50.py. I got checkpoints saved in output/esc50/_None/checkpoints/epoch=4-step=2669.ckpt. I want to load the checkpoint and inference with audio file. I am trying to load the checkpoint model and tried to used passt_hear21 for inference but kinda lost track of the process.

    Could you please share how to inference with the saved checkpoints on audio file?

    opened by myatmyintzuthin 2
  • Could not solve for environment specs

    Could not solve for environment specs

    I clone the repo. As per the README:

    conda install mamba -n base -c conda-forge
    
    Collecting package metadata (current_repodata.json): done
    Solving environment: done
    
    ## Package Plan ##
    
      environment location: /opt/miniconda3
    
      added / updated specs:
        - mamba
    
    
    The following packages will be downloaded:
    
        package                    |            build
        ---------------------------|-----------------
        conda-22.11.1              |   py39h2804cbe_1         873 KB  conda-forge
        fmt-9.1.0                  |       hffc8910_0         171 KB  conda-forge
        krb5-1.20.1                |       h127bd45_0         1.0 MB  conda-forge
        libarchive-3.5.2           |       h69ec738_3         1.5 MB  conda-forge
        libcurl-7.87.0             |       hbe9bab4_0         304 KB  conda-forge
        libedit-3.1.20191231       |       hc8eb9b7_2          94 KB  conda-forge
        libev-4.33                 |       h642e427_1          98 KB  conda-forge
        libmamba-1.1.0             |       h1254013_2         1.0 MB  conda-forge
        libmambapy-1.1.0           |   py39h8f82c16_2         214 KB  conda-forge
        libnghttp2-1.47.0          |       h232270b_1         816 KB  conda-forge
        libsolv-0.7.23             |       hb5ab8b9_0         373 KB  conda-forge
        libssh2-1.10.0             |       hb80f160_3         218 KB  conda-forge
        libxml2-2.9.14             |       h9d8dfc2_4         656 KB  conda-forge
        lz4-c-1.9.3                |       hbdafb3b_1         147 KB  conda-forge
        lzo-2.10                   |    h642e427_1000         154 KB  conda-forge
        mamba-1.1.0                |   py39hde45b87_2          48 KB  conda-forge
        openssl-1.1.1s             |       h03a7124_1         1.5 MB  conda-forge
        pybind11-abi-4             |       hd8ed1ab_3          10 KB  conda-forge
        reproc-14.2.4              |       h1a8c8d9_0          27 KB  conda-forge
        reproc-cpp-14.2.4          |       hb7217d7_0          20 KB  conda-forge
        yaml-cpp-0.7.0             |       hb7217d7_2         133 KB  conda-forge
        ------------------------------------------------------------
                                               Total:         9.4 MB
    
    The following NEW packages will be INSTALLED:
    
      fmt                conda-forge/osx-arm64::fmt-9.1.0-hffc8910_0 
      icu                conda-forge/osx-arm64::icu-70.1-h6b3803e_0 
      krb5               conda-forge/osx-arm64::krb5-1.20.1-h127bd45_0 
      libarchive         conda-forge/osx-arm64::libarchive-3.5.2-h69ec738_3 
      libcurl            conda-forge/osx-arm64::libcurl-7.87.0-hbe9bab4_0 
      libedit            conda-forge/osx-arm64::libedit-3.1.20191231-hc8eb9b7_2 
      libev              conda-forge/osx-arm64::libev-4.33-h642e427_1 
      libiconv           conda-forge/osx-arm64::libiconv-1.17-he4db4b2_0 
      libmamba           conda-forge/osx-arm64::libmamba-1.1.0-h1254013_2 
      libmambapy         conda-forge/osx-arm64::libmambapy-1.1.0-py39h8f82c16_2 
      libnghttp2         conda-forge/osx-arm64::libnghttp2-1.47.0-h232270b_1 
      libsolv            conda-forge/osx-arm64::libsolv-0.7.23-hb5ab8b9_0 
      libssh2            conda-forge/osx-arm64::libssh2-1.10.0-hb80f160_3 
      libxml2            conda-forge/osx-arm64::libxml2-2.9.14-h9d8dfc2_4 
      lz4-c              conda-forge/osx-arm64::lz4-c-1.9.3-hbdafb3b_1 
      lzo                conda-forge/osx-arm64::lzo-2.10-h642e427_1000 
      mamba              conda-forge/osx-arm64::mamba-1.1.0-py39hde45b87_2 
      pybind11-abi       conda-forge/noarch::pybind11-abi-4-hd8ed1ab_3 
      reproc             conda-forge/osx-arm64::reproc-14.2.4-h1a8c8d9_0 
      reproc-cpp         conda-forge/osx-arm64::reproc-cpp-14.2.4-hb7217d7_0 
      yaml-cpp           conda-forge/osx-arm64::yaml-cpp-0.7.0-hb7217d7_2 
      zstd               conda-forge/osx-arm64::zstd-1.5.2-h8128057_4 
    
    The following packages will be UPDATED:
    
      ca-certificates    pkgs/main::ca-certificates-2022.10.11~ --> conda-forge::ca-certificates-2022.12.7-h4653dfc_0 
      libcxx                pkgs/main::libcxx-12.0.0-hf6beb65_1 --> conda-forge::libcxx-14.0.6-h2692d47_0 
      libzlib                                 1.2.12-ha287fd2_2 --> 1.2.13-h03a7124_4 
      openssl              pkgs/main::openssl-1.1.1s-h1a28f6b_0 --> conda-forge::openssl-1.1.1s-h03a7124_1 
      zlib                    pkgs/main::zlib-1.2.12-h5a0b063_2 --> conda-forge::zlib-1.2.13-h03a7124_4 
    
    The following packages will be SUPERSEDED by a higher-priority channel:
    
      certifi            pkgs/main/osx-arm64::certifi-2022.12.~ --> conda-forge/noarch::certifi-2022.12.7-pyhd8ed1ab_0 
      conda              pkgs/main::conda-22.11.1-py39hca03da5~ --> conda-forge::conda-22.11.1-py39h2804cbe_1 
    
    
    Proceed ([y]/n)? 
    
    
    Downloading and Extracting Packages
                                                                                                                                         
    Preparing transaction: done                                                                                                          
    Verifying transaction: done                                                                                                          
    Executing transaction: done                                                                                                          
    

    But then the next mambo command fails :\

    mamba env create -f environment.yml   
    

    with

    pkgs/r/osx-arm64                                              No change
    pkgs/main/osx-arm64                                           No change
    pkgs/main/noarch                                              No change
    pkgs/r/noarch                                                 No change
    conda-forge/osx-arm64                                4.7MB @ 351.1kB/s 13.6s
    conda-forge/noarch                                  10.7MB @ 566.8kB/s 19.2s
    
                                                                                                                                         
    Looking for: ['_libgcc_mutex==0.1=conda_forge', '_openmp_mutex==4.5=2_gnu', '_pytorch_select==0.1=cpu_0', 'appdirs==1.4.4=pyh9f0ad1d_0', 'audioread==2.1.9=py37h89c1867_4', 'blas==1.0=mkl', 'brotlipy==0.7.0=py37h5e8e339_1001', 'bzip2==1.0.8=h7f98852_4', 'c-ares==1.17.1=h7f98852_1', 'ca-certificates==2020.12.5=ha878542_0', 'cached-property==1.5.2=hd8ed1ab_1', 'cached_property==1.5.2=pyha770c72_1', 'certifi==2020.12.5=py37h89c1867_1', 'cffi==1.14.5=py37hc58025e_0', 'chardet==4.0.0=py37h89c1867_3', 'colorama==0.4.4=pyh9f0ad1d_0', 'cryptography==3.4.6=py37h5d9358c_0', 'cycler==0.10.0=py_2', 'decorator==4.4.2=py_0', 'docopt==0.6.2=py_1', 'ffmpeg==4.3.1=hca11adc_2', 'freetype==2.10.4=h0708190_1', 'gettext==0.19.8.1=h0b5b191_1005', 'gitdb==4.0.5=pyhd8ed1ab_1', 'gitpython==3.1.14=pyhd8ed1ab_0', 'gmp==6.2.1=h58526e2_0', 'gnutls==3.6.13=h85f3911_1', 'h5py==3.1.0=nompi_py37h1e651dc_100', 'hdf5==1.10.6=nompi_h6a2412b_1114', 'idna==2.10=pyh9f0ad1d_0', 'importlib-metadata==3.7.3=py37h89c1867_0', 'importlib_metadata==3.7.3=hd8ed1ab_0', 'intel-openmp==2020.2=254', 'joblib==1.0.1=pyhd8ed1ab_0', 'jpeg==9d=h36c2ea0_0', 'jsonpickle==1.4.1=pyh9f0ad1d_0', 'kiwisolver==1.3.1=py37h2527ec5_1', 'krb5==1.17.2=h926e7f8_0', 'lame==3.100=h7f98852_1001', 'lcms2==2.12=hddcbb42_0', 'ld_impl_linux-64==2.35.1=hea4e1c9_2', 'libblas==3.9.0=1_h86c2bf4_netlib', 'libcblas==3.9.0=5_h92ddd45_netlib', 'libcurl==7.75.0=hc4aaa36_0', 'libedit==3.1.20191231=he28a2e2_2', 'libev==4.33=h516909a_1', 'libffi==3.3=h58526e2_2', 'libflac==1.3.3=h9c3ff4c_1', 'libgcc-ng==9.3.0=h2828fa1_19', 'libgfortran-ng==9.3.0=hff62375_19', 'libgfortran5==9.3.0=hff62375_19', 'libgomp==9.3.0=h2828fa1_19', 'liblapack==3.9.0=5_h92ddd45_netlib', 'libllvm10==10.0.1=he513fc3_3', 'libnghttp2==1.43.0=h812cca2_0', 'libogg==1.3.4=h7f98852_1', 'libopenblas==0.3.12=pthreads_h4812303_1', 'libopus==1.3.1=h7f98852_1', 'libpng==1.6.37=h21135ba_2', 'librosa==0.8.0=pyh9f0ad1d_0', 'libsndfile==1.0.31=h9c3ff4c_1', 'libssh2==1.9.0=ha56f1ee_6', 'libstdcxx-ng==9.3.0=h6de172a_19', 'libtiff==4.2.0=hbd63e13_2', 'libvorbis==1.3.7=h9c3ff4c_0', 'libwebp-base==1.2.0=h7f98852_2', 'libzlib==1.2.11=h36c2ea0_1013', 'llvm-openmp==11.1.0=h4bd325d_1', 'llvmlite==0.36.0=py37h9d7f4d0_0', 'lz4-c==1.9.3=h9c3ff4c_1', 'matplotlib-base==3.3.4=py37h0c9df89_0', 'mkl==2020.2=256', 'mkl-service==2.3.0=py37h8f50634_2', 'munch==2.5.0=py_0', 'ncurses==6.2=h58526e2_4', 'nettle==3.6=he412f7d_0', 'ninja==1.10.2=h4bd325d_0', 'numba==0.53.0=py37h7dd73a4_1', 'numpy==1.20.1=py37haa41c4c_0', 'olefile==0.46=pyh9f0ad1d_1', 'openblas==0.3.12=pthreads_h04b7a96_1', 'openh264==2.1.1=h780b84a_0', 'openjpeg==2.4.0=hb52868f_1', 'openssl==1.1.1k=h7f98852_0', 'packaging==20.9=pyh44b312d_0', 'pandas==1.2.3=py37hdc94413_0', 'pillow==8.1.2=py37h4600e1f_1', 'pip==21.0.1=pyhd8ed1ab_0', 'pooch==1.3.0=pyhd8ed1ab_0', 'py-cpuinfo==7.0.0=pyh9f0ad1d_0', 'pycparser==2.20=pyh9f0ad1d_2', 'pyopenssl==20.0.1=pyhd8ed1ab_0', 'pyparsing==2.4.7=pyhd8ed1ab_1', 'pysocks==1.7.1=py37h89c1867_5', 'pysoundfile==0.10.3.post1=pyhd3deb0d_0', 'python==3.7.10=hffdb5ce_100_cpython', 'python-dateutil==2.8.1=py_0', 'python_abi==3.7=3_cp37m', 'pytz==2021.1=pyhd8ed1ab_0', 'readline==8.0=he28a2e2_2', 'requests==2.25.1=pyhd3deb0d_0', 'resampy==0.2.2=py_0', 'scikit-learn==0.24.1=py37h69acf81_0', 'scipy==1.6.1=py37h14a347d_0', 'setuptools==49.6.0=py37h89c1867_3', 'six==1.15.0=pyh9f0ad1d_0', 'smmap==3.0.5=pyh44b312d_0', 'sqlite==3.34.0=h74cdb3f_0', 'threadpoolctl==2.1.0=pyh5ca1d4c_0', 'tk==8.6.10=h21135ba_1', 'tornado==6.1=py37h5e8e339_1', 'typing_extensions==3.7.4.3=py_0', 'urllib3==1.26.4=pyhd8ed1ab_0', 'wrapt==1.12.1=py37h5e8e339_3', 'x264==1!161.3030=h7f98852_1', 'xz==5.2.5=h516909a_1', 'zipp==3.4.1=pyhd8ed1ab_0', 'zlib==1.2.11=h36c2ea0_1013', 'zstd==1.4.9=ha95c52a_0']
    
    
    Could not solve for environment specs
    Encountered problems while solving:
      - nothing provides requested _libgcc_mutex ==0.1 conda_forge
      - nothing provides requested _openmp_mutex ==4.5 2_gnu
      - nothing provides requested audioread ==2.1.9 py37h89c1867_4
      - nothing provides requested blas ==1.0 mkl
      - nothing provides requested brotlipy ==0.7.0 py37h5e8e339_1001
      - nothing provides requested bzip2 ==1.0.8 h7f98852_4
      - nothing provides requested c-ares ==1.17.1 h7f98852_1
      - nothing provides requested ca-certificates ==2020.12.5 ha878542_0
      - nothing provides requested certifi ==2020.12.5 py37h89c1867_1
      - nothing provides requested cffi ==1.14.5 py37hc58025e_0
      - nothing provides requested chardet ==4.0.0 py37h89c1867_3
      - nothing provides requested cryptography ==3.4.6 py37h5d9358c_0
      - nothing provides requested ffmpeg ==4.3.1 hca11adc_2
      - nothing provides requested freetype ==2.10.4 h0708190_1
      - nothing provides requested gettext ==0.19.8.1 h0b5b191_1005
      - nothing provides requested gmp ==6.2.1 h58526e2_0
      - nothing provides requested gnutls ==3.6.13 h85f3911_1
      - nothing provides requested h5py ==3.1.0 nompi_py37h1e651dc_100
      - nothing provides requested hdf5 ==1.10.6 nompi_h6a2412b_1114
      - nothing provides requested importlib-metadata ==3.7.3 py37h89c1867_0
      - nothing provides requested intel-openmp ==2020.2 254
      - nothing provides requested jpeg ==9d h36c2ea0_0
      - nothing provides requested kiwisolver ==1.3.1 py37h2527ec5_1
      - nothing provides requested krb5 ==1.17.2 h926e7f8_0
      - nothing provides requested lame ==3.100 h7f98852_1001
      - nothing provides requested lcms2 ==2.12 hddcbb42_0
      - nothing provides requested ld_impl_linux-64 ==2.35.1 hea4e1c9_2
      - nothing provides requested libblas ==3.9.0 1_h86c2bf4_netlib
      - nothing provides requested libcblas ==3.9.0 5_h92ddd45_netlib
      - nothing provides requested libcurl ==7.75.0 hc4aaa36_0
      - nothing provides requested libedit ==3.1.20191231 he28a2e2_2
      - nothing provides requested libev ==4.33 h516909a_1
      - nothing provides requested libffi ==3.3 h58526e2_2
      - nothing provides requested libflac ==1.3.3 h9c3ff4c_1
      - nothing provides requested libgcc-ng ==9.3.0 h2828fa1_19
      - nothing provides requested libgfortran-ng ==9.3.0 hff62375_19
      - nothing provides requested libgfortran5 ==9.3.0 hff62375_19
      - nothing provides requested libgomp ==9.3.0 h2828fa1_19
      - nothing provides requested liblapack ==3.9.0 5_h92ddd45_netlib
      - nothing provides requested libllvm10 ==10.0.1 he513fc3_3
      - nothing provides requested libnghttp2 ==1.43.0 h812cca2_0
      - nothing provides requested libogg ==1.3.4 h7f98852_1
      - nothing provides requested libopenblas ==0.3.12 pthreads_h4812303_1
      - nothing provides requested libopus ==1.3.1 h7f98852_1
      - nothing provides requested libpng ==1.6.37 h21135ba_2
      - nothing provides requested libsndfile ==1.0.31 h9c3ff4c_1
      - nothing provides requested libssh2 ==1.9.0 ha56f1ee_6
      - nothing provides requested libstdcxx-ng ==9.3.0 h6de172a_19
      - nothing provides requested libtiff ==4.2.0 hbd63e13_2
      - nothing provides requested libvorbis ==1.3.7 h9c3ff4c_0
      - nothing provides requested libwebp-base ==1.2.0 h7f98852_2
      - nothing provides requested libzlib ==1.2.11 h36c2ea0_1013
      - nothing provides requested llvm-openmp ==11.1.0 h4bd325d_1
      - nothing provides requested llvmlite ==0.36.0 py37h9d7f4d0_0
      - nothing provides requested lz4-c ==1.9.3 h9c3ff4c_1
      - nothing provides requested matplotlib-base ==3.3.4 py37h0c9df89_0
      - nothing provides requested mkl ==2020.2 256
      - nothing provides requested mkl-service ==2.3.0 py37h8f50634_2
      - nothing provides requested ncurses ==6.2 h58526e2_4
      - nothing provides requested nettle ==3.6 he412f7d_0
      - nothing provides requested ninja ==1.10.2 h4bd325d_0
      - nothing provides requested numba ==0.53.0 py37h7dd73a4_1
      - nothing provides requested numpy ==1.20.1 py37haa41c4c_0
      - nothing provides requested openblas ==0.3.12 pthreads_h04b7a96_1
      - nothing provides requested openh264 ==2.1.1 h780b84a_0
      - nothing provides requested openjpeg ==2.4.0 hb52868f_1
      - nothing provides requested openssl ==1.1.1k h7f98852_0
      - nothing provides requested pandas ==1.2.3 py37hdc94413_0
      - nothing provides requested pillow ==8.1.2 py37h4600e1f_1
      - nothing provides requested pysocks ==1.7.1 py37h89c1867_5
      - nothing provides requested python ==3.7.10 hffdb5ce_100_cpython
      - nothing provides requested readline ==8.0 he28a2e2_2
      - nothing provides requested scikit-learn ==0.24.1 py37h69acf81_0
      - nothing provides requested scipy ==1.6.1 py37h14a347d_0
      - nothing provides requested setuptools ==49.6.0 py37h89c1867_3
      - nothing provides requested sqlite ==3.34.0 h74cdb3f_0
      - nothing provides requested tk ==8.6.10 h21135ba_1
      - nothing provides requested tornado ==6.1 py37h5e8e339_1
      - nothing provides requested wrapt ==1.12.1 py37h5e8e339_3
      - nothing provides requested x264 ==1!161.3030 h7f98852_1
      - nothing provides requested xz ==5.2.5 h516909a_1
      - nothing provides requested zlib ==1.2.11 h36c2ea0_1013
      - nothing provides requested zstd ==1.4.9 ha95c52a_0
      - package pytz-2021.1-pyhd8ed1ab_0 requires python >=3, but none of the providers can be installed
    
    The environment can't be solved, aborting the operation
    

    This is on an OSX Apple Silicon machine

    opened by turian 2
  • ImportError: cannot import name 'F1' from 'torchmetrics' (/app/anaconda3/lib/python3.7/site-packages/torchmetrics/__init__.py)

    ImportError: cannot import name 'F1' from 'torchmetrics' (/app/anaconda3/lib/python3.7/site-packages/torchmetrics/__init__.py)

    python ex_openmic.py Traceback (most recent call last): File "ex_openmic.py", line 5, in from pytorch_lightning.callbacks import ModelCheckpoint File "/root/work_project_2021/project_music2video/PaSST/src/pytorch-lightning/pytorch_lightning/init.py", line 65, in from pytorch_lightning import metrics File "/root/work_project_2021/project_music2video/PaSST/src/pytorch-lightning/pytorch_lightning/metrics/init.py", line 16, in from pytorch_lightning.metrics.classification import ( # noqa: F401 File "/root/work_project_2021/project_music2video/PaSST/src/pytorch-lightning/pytorch_lightning/metrics/classification/init.py", line 19, in from pytorch_lightning.metrics.classification.f_beta import F1, FBeta # noqa: F401 File "/root/work_project_2021/project_music2video/PaSST/src/pytorch-lightning/pytorch_lightning/metrics/classification/f_beta.py", line 16, in from torchmetrics import F1 as _F1 ImportError: cannot import name 'F1' from 'torchmetrics' (/app/anaconda3/lib/python3.7/site-packages/torchmetrics/init.py)

    envs: Name: torch Version: 1.12.1 Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration Home-page: https://pytorch.org/ Author: PyTorch Team Author-email: [email protected] License: BSD-3 Location: /app/anaconda3/lib/python3.7/site-packages Requires: typing-extensions Required-by: torchvision, torchmetrics, torchaudio, timm, test-tube, Ba3l, pytorch-lightning

    opened by aiXia121 1
  • The loop in the diagram

    The loop in the diagram

    This is an amazing job! But I have a question: what does the loop in the diagram mean? In fact, I didn't find the loop operation in the paper and codes. Thanks!

    opened by YangYangTaoTao 1
  • Installation issues

    Installation issues

    Hi, I am trying to install and run the PaSST-S method on my own data but I get this error when I run python ex_audioset.py help

    File "ex_audioset.py", line 16, in <module>
        from helpers.mixup import my_mixup
    ModuleNotFoundError: No module named 'helpers.mixup'
    
    opened by p4vlos 1
  • mismatch version of pytorch-lighting and sarced

    mismatch version of pytorch-lighting and sarced

    Hello,when I after running the code following, 微信图片_20220426144143 and I run the code
    微信图片_20220426144415 but I encontered the issue: 微信图片_20220426144634 微信图片_20220426144639 Is this the wrong version of pytorch-lighting and sarced?When I upgrad the pytorch-lighting to the latest version,the issue is solved but the issue with sacred has not been solved. Could you please provide some help?Thank you very much!

    opened by Junglesl 15
Releases(v.0.0.7-audioset)
Learning nonlinear operators via DeepONet

DeepONet: Learning nonlinear operators The source code for the paper Learning nonlinear operators via DeepONet based on the universal approximation th

Lu Lu 239 Jan 02, 2023
Source code for our paper "Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures"

Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures Code for the Multiplex Molecular Graph Neural Network (M

shzhang 59 Dec 10, 2022
Node Editor Plug for Blender

NodeEditor Blender的程序化建模插件 Show Current 基本框架:自定义的tree-node-socket、tree中的node与socket采用字典查询、基于socket入度的拓扑排序 数据传递和处理依靠Tree中的字典,socket传递字典key TODO 增加更多的节点

Cuimi 11 Dec 03, 2022
DIT is a DTLS MitM proxy implemented in Python 3. It can intercept, manipulate and suppress datagrams between two DTLS endpoints and supports psk-based and certificate-based authentication schemes (RSA + ECC).

DIT - DTLS Interception Tool DIT is a MitM proxy tool to intercept DTLS traffic. It can intercept, manipulate and/or suppress DTLS datagrams between t

52 Nov 30, 2022
Physics-informed Neural Operator for Learning Partial Differential Equation

PINO Physics-informed Neural Operator for Learning Partial Differential Equation Abstract: Machine learning methods have recently shown promise in sol

107 Jan 02, 2023
Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer.

DocEnTR Description Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer. This model is implemented on to

Mohamed Ali Souibgui 74 Jan 07, 2023
Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021)

EMI-FGSM This repository contains code to reproduce results from the paper: Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021) Xiaosen Wa

John Hopcroft Lab at HUST 10 Sep 26, 2022
Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021)

Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021) Introduction This is the official repository for the PyTorch implementation

165 Dec 07, 2022
Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

AFSD: Learning Salient Boundary Feature for Anchor-free Temporal Action Localization This is an official implementation in PyTorch of AFSD. Our paper

Tencent YouTu Research 146 Dec 24, 2022
Interpretation of T cell states using reference single-cell atlases

Interpretation of T cell states using reference single-cell atlases ProjecTILs is a computational method to project scRNA-seq data into reference sing

Cancer Systems Immunology Lab 139 Jan 03, 2023
A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution.

Awesome Pretrained StyleGAN2 A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution. Note the readme is a

Justin 1.1k Dec 24, 2022
Meli Data Challenge 2021 - First Place Solution

My solution for the Meli Data Challenge 2021

Matias Moreyra 23 Mar 09, 2022
Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

Chonghan_Lee 15 Nov 02, 2022
Check out the StyleGAN repo and place it in the same directory hierarchy as the present repo

Variational Model Inversion Attacks Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani Most commands are in run_scripts. W

Jackson Wang 15 Dec 26, 2022
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Project Page | Paper A Shading-Guided Generative Implicit Model

Xingang Pan 115 Dec 18, 2022
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"

Prompt-Tuning Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" Currently, we support the following huggigface models: Bart

Andrew Zeng 36 Dec 19, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022
Machine Learning Privacy Meter: A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks

ML Privacy Meter Machine learning is playing a central role in automated decision making in a wide range of organization and service providers. The da

Data Privacy and Trustworthy Machine Learning Research Lab 357 Jan 06, 2023
[ICCV 2021 Oral] Deep Evidential Action Recognition

DEAR (Deep Evidential Action Recognition) Project | Paper & Supp Wentao Bao, Qi Yu, Yu Kong International Conference on Computer Vision (ICCV Oral), 2

Wentao Bao 80 Jan 03, 2023
Compute descriptors for 3D point cloud registration using a multi scale sparse voxel architecture

MS-SVConv : 3D Point Cloud Registration with Multi-Scale Architecture and Self-supervised Fine-tuning Compute features for 3D point cloud registration

42 Jul 25, 2022