Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Overview

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Reference

 Abeßer, J. & Müller, M. Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning, submitted to: ICASSP 2022

Related Work

  • we use pre-computed features & model architecture used in 3 previous papers
    • these are all unsupervised domain adaptation methods
    Mezza, A. I., Habets, E. A. P., Müller, M., & Sarti, A. (2021).
    #Unsupervised domain adaptation for acoustic scene classification
    using band-wise statistics matching. Proceedings of the European
    Signal Processing Conference (EUSIPCO), 11–15.
    https://doi.org/10.23919/Eusipco47968.2020.9287533"

    Drossos, K., Magron, P., & Virtanen, T. (2019). Unsupervised Adversarial Domain Adaptation based
    on the Wasserstein Distance for Acoustic Scene Classification. Proceedings of the IEEE Workshop
    on Applications of Signal Processing to Audio and Acoustics (WASPAA), 259–263. New Paltz, NY, USA.

    Gharib, S., Drossos, K., Emre, C., Serdyuk, D., & Virtanen, T. (2018). Unsupervised Adversarial Domain
    Adaptation for Acoustic Scene Classification. Proceedings of the Detection and Classification of
    Acoustic Scenes and Events (DCASE). Surrey, UK.

Files

  • configs.py - Training configurations (C0 ... C3M)
  • generator.py - Data generator
  • losses.py - Loss implementations
  • model.py - Function to create dual-input / dual-output model
  • model_kaggle.py - reference CNN model from related work for acoustic scene classification (ASC)
  • normalization.py - Normalization methods (see Mezza et al. above)
  • params.py - General parameters
  • prediction.py - Prediction script to evaluate models on test data
  • training.py - Script to run the model training for 6 different configurations (see Fig. 2 in the paper)

How to run

  • create python environment (e.g. with conda), the following versions were used during the paper preparation process
    • librosa==0.8.0
    • matplotlib==3.3.2
    • numpy=1.19.2
    • python=3.7.0
    • scikit-learn==0.23.2
    • tensorflow==2.3.0
    • torch==1.9.0
  • set in params.py the following variables
  • run python training.py && python prediction.py on a GPU device to train & evaluate the models
Owner
Jakob Abeßer
Passionate bass guitar player and percussionist. Senior Scientist at Fraunhofer IDMT. PhD in Music Information Retrieval.
Jakob Abeßer
A lossless neural compression framework built on top of JAX.

Kompressor Branch CI Coverage main (active) main development A neural compression framework built on top of JAX. Install setup.py assumes a compatible

Rosalind Franklin Institute 2 Mar 14, 2022
Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021)

EMI-FGSM This repository contains code to reproduce results from the paper: Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021) Xiaosen Wa

John Hopcroft Lab at HUST 10 Sep 26, 2022
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende

Sanjana Gunna 7 Aug 07, 2022
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Mindee 1.5k Jan 01, 2023
Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

This is the XLM-T repository, which includes data, code and pre-trained multilingual language models for Twitter. XLM-T - A Multilingual Language Mode

Cardiff NLP 112 Dec 27, 2022
Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr

Zoey Li 87 Dec 26, 2022
Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning"

README The code is based on the ILswiss. To run the code, use python run_experiment.py --nosrun -e your YAML file -g gpu id Generally, run_experim

ApexRL 12 Mar 19, 2022
[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

ZiNiU WaN 176 Dec 15, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
The codebase for Data-driven general-purpose voice activity detection.

Data driven GPVAD Repository for the work in TASLP 2021 Voice activity detection in the wild: A data-driven approach using teacher-student training. S

Heinrich Dinkel 75 Nov 27, 2022
Reference models and tools for Cloud TPUs.

Cloud TPUs This repository is a collection of reference models and tools used with Cloud TPUs. The fastest way to get started training a model on a Cl

5k Jan 05, 2023
Simple Dynamic Batching Inference

Simple Dynamic Batching Inference 解决了什么问题? 众所周知,Batch对于GPU上深度学习模型的运行效率影响很大。。。 是在Inference时。搜索、推荐等场景自带比较大的batch,问题不大。但更多场景面临的往往是稀碎的请求(比如图片服务里一次一张图)。 如果

116 Jan 01, 2023
BMVC 2021: This is the github repository for "Few Shot Temporal Action Localization using Query Adaptive Transformers" accepted in British Machine Vision Conference (BMVC) 2021, Virtual

FS-QAT: Few Shot Temporal Action Localization using Query Adaptive Transformer Accepted as Poster in BMVC 2021 This is an official implementation in P

Sauradip Nag 14 Dec 09, 2022
The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

SF-Net for fullband SE This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Ban

Guochen Yu 36 Dec 02, 2022
113 Nov 28, 2022
Deep High-Resolution Representation Learning for Human Pose Estimation

Deep High-Resolution Representation Learning for Human Pose Estimation (accepted to CVPR2019) News If you are interested in internship or research pos

HRNet 167 Dec 27, 2022
An AFL implementation with UnTracer (our coverage-guided tracer)

UnTracer-AFL This repository contains an implementation of our prototype coverage-guided tracing framework UnTracer in the popular coverage-guided fuz

113 Dec 17, 2022
School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

F-Principle This is an exercise problem of the digital signal processing (DSP) course at School of Artificial Intelligence at the Nanjing University (

Thyrix 5 Nov 23, 2022
Jupyter notebooks for the code samples of the book "Deep Learning with Python"

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

François Chollet 16.2k Dec 30, 2022
A working implementation of the Categorical DQN (Distributional RL).

Categorical DQN. Implementation of the Categorical DQN as described in A distributional Perspective on Reinforcement Learning. Thanks to @tudor-berari

Florin Gogianu 98 Sep 20, 2022