L3DAS22 challenge supporting API

Overview

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge. It provides code for downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models, and validating the final results. We provide easy-to-use instructions to reproduce the results included in our paper, and the code is extensively commented for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.
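
Before starting a long download, it can help to confirm that the credentials file is in place. The snippet below is a minimal sketch, not part of this repository: it assumes the standard Kaggle API layout, where kaggle.json lives in ~/.kaggle/ and holds the fields "username" and "key". The helper name kaggle_credentials_ok is hypothetical.

```python
import json
from pathlib import Path


def kaggle_credentials_ok(path=None):
    """Return True if `path` is a JSON file holding 'username' and 'key'.

    Hypothetical helper: when `path` is None, the usual Kaggle API
    location (~/.kaggle/kaggle.json) is assumed.
    """
    if path is None:
        path = Path.home() / ".kaggle" / "kaggle.json"
    path = Path(path)
    if not path.is_file():
        return False
    try:
        creds = json.loads(path.read_text())
    except (OSError, json.JSONDecodeError):
        return False
    # Both fields must be present for the Kaggle client to authenticate.
    return isinstance(creds, dict) and {"username", "key"} <= creds.keys()


if __name__ == "__main__":
    print("kaggle.json looks valid:", kaggle_credentials_ok())
```

The check only inspects the file structure; it does not contact Kaggle or verify that the key is still active.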

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take a long time to run, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split into 2 downloadable files. If you download the dataset manually, you must also merge the contents of the 2 folders yourself. You can use the function download_dataset.merge_train360() for this. Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)
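
For readers who want to see what a folder merge involves, here is a generic sketch. It is not the repository's merge_train360() implementation, and the function name merge_folder_into is hypothetical. It assumes both parts share the same internal sub-folder layout and that no file name appears in both.

```python
import shutil
from pathlib import Path


def merge_folder_into(src, dst):
    """Move every file under `src` to the same relative path under `dst`.

    Hypothetical helper for illustration. Returns the number of files
    moved and refuses to overwrite a file that already exists in `dst`.
    """
    src, dst = Path(src), Path(dst)
    moved = 0
    # sorted() materializes the listing before any file is moved.
    for item in sorted(src.rglob("*")):
        if not item.is_file():
            continue
        target = dst / item.relative_to(src)
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(item), str(target))
        moved += 1
    return moved
```

Raising on a name clash rather than overwriting is a deliberate choice here, since silently replacing audio or label files would be hard to detect later.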

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their corresponding metadata, apply custom pre-processing functions, and save NumPy arrays (as .pkl files) containing the predictor and target matrices separately.
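
Because the matrices are stored as .pkl files, they can be read back with Python's standard pickle module. The following is a minimal sketch under that assumption; the helper names and any file path passed to them are hypothetical, since the exact output file names are not listed here.

```python
import pickle


def save_matrix(obj, path):
    """Serialize `obj` (for example a NumPy array) to a .pkl file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_matrix(path):
    """Read an object back from a .pkl file.

    Only unpickle files you created yourself or otherwise trust:
    loading a pickle can execute arbitrary code.
    """
    with open(path, "rb") as f:
        return pickle.load(f)
```

After loading, checking the shape attribute of each array against the shapes documented for each task is a quick sanity test.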

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task 1 the function returns 2 NumPy arrays containing:

  • Input multichannel audio waveforms (3D noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monaural audio waveforms (clean speech) - Shape: [n_data, 1, n_samples].

For Task 2 the function returns 2 NumPy arrays containing:

  • Input multichannel audio spectra (3D acoustic scenarios) - Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output SELD matrices containing the class IDs of all sounds present in each 100-millisecond frame, together with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum number of simultaneous sounds of the same class (3) and n_coordinates is the number of spatial dimensions (3).
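
The size of the last dimension of the Task 2 target matrix follows directly from the formula above. The sketch below works through the arithmetic. The function name seld_target_width is hypothetical, and the value n_classes = 14 in the usage example is an assumption not stated in this section, so check it against the challenge documentation.

```python
def seld_target_width(n_classes, n_class_overlaps=3, n_coordinates=3):
    """Size of the last axis of the Task 2 target matrix.

    Implements (n_classes * n_class_overlaps)
             + (n_classes * n_class_overlaps * n_coordinates).
    """
    # One activity slot per (class, overlap) pair.
    class_slots = n_classes * n_class_overlaps
    # n_coordinates location values for each of those slots.
    location_slots = class_slots * n_coordinates
    return class_slots + location_slots


if __name__ == "__main__":
    # Assumed class count of 14: 42 activity slots + 126 coordinates.
    print(seld_target_width(14))  # prints 168
```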

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. For task 1 we used only Train100 as the training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

A GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluation metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)
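
As background for interpreting the Task 2 score, the sketch below shows how an F-score is derived from detection counts. It assumes, based on the challenge documentation rather than on anything stated in this section, that the Task 2 metric is an F-score over true positives, false positives and false negatives. It does not reproduce the matching rule used by location_sensitive_detection, and the function name f_score is hypothetical.

```python
def f_score(true_positives, false_positives, false_negatives):
    """F1 score from detection counts; returns 0.0 when all counts are zero."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


if __name__ == "__main__":
    # 8 correct detections, 2 spurious ones, 2 missed ones.
    print(f_score(8, 2, 2))  # prints 0.8
```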

To compute the challenge metrics for our baseline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

To evaluate our pre-trained models instead, add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to check that the submission files have the expected shape. Instructions on how to format the submission can be found on the L3DAS website. Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script checks that:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected
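
To illustrate the first three checks, here is a simplified sketch that compares the file names in a submission folder against an expected set. It is not the logic of validate_submission.py: the function name check_submission_names is hypothetical, the expected names must be supplied by the caller, and the shape check is omitted.

```python
from pathlib import Path


def check_submission_names(submission_dir, expected_names):
    """Compare file names in `submission_dir` with `expected_names`.

    Hypothetical helper. Returns a dict with two sorted lists:
    'missing' (expected but absent) and 'unexpected' (present but
    not expected). Both lists are empty for a well-formed folder.
    """
    present = {p.name for p in Path(submission_dir).iterdir() if p.is_file()}
    expected = set(expected_names)
    return {
        "missing": sorted(expected - present),
        "unexpected": sorted(present - expected),
    }
```

An empty 'missing' list covers the file count and naming checks, and an empty 'unexpected' list covers the requirement that no extra files are present.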

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.