L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
Automate UCheck COVID-19 self-assessment form submission

ucheck Automate UCheck COVID-19 self-assessment form submission. Disclaimer ucheck automatically completes the University of Tornto's UCheck COVID-19

Ira Horecka 15 Nov 30, 2022
A cross-platform script to book first available time for getting a passport in Sweden - Ett skript som automatiskt bokar pass hos polisen

Automatic passport booker - Boka pass automatiskt hos Svenska polisen A cross-platform script to book first available time for getting a passport in S

Elias Floreteng 14 Oct 17, 2022
A modular Telegram Python bot running on python3 with a sqlalchemy, redislab, mongo database, telethon, and pyrogram.

Zeldris Robot A modular Telegram Python bot running on python3 with a sqlalchemy, redislab, mongo database, telethon, and pyrogram. How to set up/depl

IDNCoderX 42 Dec 21, 2022
EC2 that automatically move files received through FTP to S3

ftp-ec2-s3-cf EC2 that automatically move files received through FTP to S3 Installation CloudFormation template Deploy now! Usage IP / domain name: ta

Javier Santana 1 Jun 19, 2021
This bot will delete messages containing blacklisted words in your telegram groups.

Profanity Detector Bot This bot will delete messages containing blacklisted words in your telegram groups. Made using ProfanityDetector.

Aditya 17 Oct 08, 2022
A modular telegram Python bot running on python3 with an sqlalchemy database.

TG_Bot A modular telegram Python bot running on python3 with an sqlalchemy database. Originally a simple group management bot with multiple admin feat

Movindu Bandara 1 Nov 02, 2021
Pycord, a maintained fork of discord.py, is a python wrapper for the Discord API

pycord A fork of discord.py. PyCord is a modern, easy to use, feature-rich, and async ready API wrapper for Discord written in Python. Key Features Mo

Pycord Development 2.3k Dec 31, 2022
Want to get your driver's license? Can't get a appointment because of COVID? Well I got a solution for you.

NJDMV-appoitment-alert Want to get your driver's license? Can't get a appointment because of COVID? Well I got a solution for you. We'll get you one i

Harris Spahic 3 Feb 04, 2022
A httpx token generator for discord [ hcaptcha bypass ]

Discord-Token-Generator-Yazato A httpx token generator for discord This generator was developed by Aced#0001, Dreamy Tos Follower#0001, Scripted#0131

23 Oct 26, 2021
Wats2PDF - Convert whatsapp exported chat(without media) into a readable pdf format

Wats2PDF convert whatsApp exported chat into a readable pdf format. convert with

5 Apr 26, 2022
Pogodasbot - Telegram bot sending channel weather info

Pogodasbot - Telegram bot sending channel weather info

Qayrat Sultan 1 Dec 15, 2022
FTP Anonymous Login

FTPAnon FTP Anonymous Login Install git clone https://github.com/SiThuTuntimehacker/FTPAnon cd FTPAnon bash install.sh access ftp sever " ftpaccess.tx

SiThuTun 3 Mar 23, 2022
❤️A next gen powerful telegram group manager bot for manage your groups and have fun with other cool modules

Natsuki Based on Python Telegram Bot Contributors Video Tutorial: Complete guide on deploying @TheNatsukiBot's clone on Heroku. ☆ Video by Sadew Jayas

Pawan Theekshana 8 Oct 06, 2022
Project made to analyse movie trends

MovieTrends Project to analyse the daily movie trends from the website The Movie DataBase. The main idea is upload the results to a PostgreSQL server

Jazmín López Chacón 0 Feb 15, 2022
Properly-formatted dynamic timestamps for Discord messages

discord-timestamps discord-timestamps generates properly-formatted dynamic timestamps for Discord messages, with support for Arrow objects. format

Ben Soyka 2 Mar 10, 2022
The Python SDK for the BattleshAPI game

BattleshAPy The Python SDK for the BattleshAPI game Installation This library will be eventually going on PyPI, but until then, simply clone or downlo

Christopher 0 Apr 18, 2022
BleachBit system cleaner for Windows and Linux

BleachBit BleachBit cleans files to free disk space and to maintain privacy. Running from source To run BleachBit without installation, unpack the tar

1.9k Jan 06, 2023
Download apps and remove icloud

Download apps and remove icloud

POC de uma AWS lambda que executa a consulta de preços de criptomoedas, e é implantada na AWS usando Github actions.

Cryptocurrency Prices Overview Instalação Repositório Configuração CI/CD Roadmap Testes Overview A ideia deste projeto é aplicar o conteúdo estudado s

Gustavo Santos 3 Aug 31, 2022