L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
Python bindings for BigML.io

BigML Python Bindings BigML makes machine learning easy by taking care of the details required to add data-driven decisions and predictive power to yo

BigML Inc, Machine Learning made easy 271 Dec 27, 2022
Python wrapper for the Intercom API.

python-intercom Not officially supported Please note that this is NOT an official Intercom SDK. The third party that maintained it reached out to us t

Intercom 215 Dec 22, 2022
Ig-Crackv2 - Crack Instagram Version 2.9

★★ Information ★★ ★★Menu Special Crack Melalui Pengikut Crack Melalui Mengikuti

Risky [ Zero Tow ] 11 Aug 30, 2022
This discord bot preview user 42intra login picture.

42intra_Pic BOT This discord bot preview user 42intra login picture. created by: @YOPI#8626 Using: Python 3.9 (64-bit) (You don't need 3.9 but some fu

Zakaria Yacoubi 7 Mar 22, 2022
A tool written in Python used to instalock agents in VALORANT using the local API.

Valorant Instalock Tool v2.1.0 by Mr. SOSA A tool written in Python used to instalock agents in VALORANT using the local API. This is NOT a hotkey pro

Mr. SOSA 3 Nov 18, 2021
Bot interpretation of the carbon.now.sh site

📒 Source code of the @PicodeBot 🧸 Developer: @hoosnick Run $ git clone https://github.com/hoosnick/picodebot.git $ pip install -r requirements.txt P

Husniddin Murodov 13 Oct 02, 2022
A little proxy tool based on Tencent Cloud Function Service.

SCFProxy 一个基于腾讯云函数服务的免费代理池。 安装 python3 -m venv .venv source .venv/bin/activate pip3 install -r requirements.txt 项目配置 函数配置 开通腾讯云函数服务 在 函数服务 新建 中使用自定义

Mio 716 Dec 26, 2022
AWS Workmail Migration Tool

WMigrate A tool for migrating AWS Workmail Users and Groups cross region and cross accounts. It also creates user and group aliases and adds the users

NK 1 Oct 27, 2021
Minecraft name sniper written in python.

⚠️ IMPORTANT ⚠️ DO NOT USE MCSNIPERPY -- READ BELOW This sniper does not support Microsoft accounts or prename / gc sniping and is MUCH harder to use

MCsniperPY 201 Dec 30, 2022
A library that revolutionizes the way people interact with NextDNS.

NextDNS-API An awesome way to interface with your NextDNS account - via Python! Explore the docs » Report Bug . Request Feature Table Of Contents Abou

34 Dec 07, 2022
Best Buy purchase bot

B3 Best-Buy-Bot. Written in Python NOTICE: Don't be a disgrace to society. Don't use this for any mass buying/reselling purposes. About B3 is a bot th

Dogey11 8 Aug 15, 2022
Discord Bot that can translate your text, count and reply to your messages with a personalised text

Discord Bot that can translate your text, count and reply to your messages with a personalised text

Grizz 2 Jan 26, 2022
Python CMR is an easy to use wrapper to the NASA EOSDIS Common Metadata Repository API.

This repository is a copy of jddeal/python_cmr which is no longer maintained. It has been copied here with the permission of the original author for t

NASA 9 Nov 16, 2022
Python library for RetroMMO related stuff, including API wrapper

python library for RetroMMO related stuff, including API wrapper.

1 Nov 25, 2021
A discord account nuker with lots of tools that will destroy a discord account

A discord account nuker with lots of tools that will destroy a discord account (token destroyer... and much more).

firexi 10 Apr 28, 2022
Python: Asynchronous client for the Tailscale API

Python: Asynchronous client for the Tailscale API Asynchronous client for the Tailscale API. About This package allows you to control and monitor Tail

Franck Nijhof 9 Nov 22, 2022
Python wrapper to simplify calls to AncestryDNA API.

AncestryDNA API wrapper Ancestry exposes an undocumented REST API for its DNA features. This Python wrapper inventories the available calls, and expos

Matt 2 Jun 10, 2022
Unfollows Users You're Following

Github-Unfollow-Bot Info It unfollows users you're following, it runs in the background so you can still do what you do without it bothering you. It's

ExT 4 Sep 03, 2022
Gets instagram public username and returns usefull informations like profilepic(b64), video_urls etc.

InstaSucker Gets instagram public username and returns usefull informations like profilepic(b64), video_urls etc. Information this project contains a

Armin Amiri 5 Apr 30, 2022
The unofficial Amazon search CLI & Python API

amzSear The unofficial Amazon Product CLI & API. Easily search the amazon product directory from the command line without the need for an Amazon API k

Asher Silvers 95 Nov 11, 2022