L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
To send an Instagram message using Python

To send an Instagram message using Python, you must have an Instagram account and install the Instabot library in your Python virtual environment.

Coding Taggers 1 Dec 18, 2021
A virus/stealer made in py

python-virus A virus/stealer made in py. Features: Discord token stealer, Password stealer, Windows key stealer, Credit-card stealer, Image grab, Anti

SKYNETMARCI 5 Dec 12, 2022
Demo to explain how to use AWS Chalice to connect to twitter API and change profile picture at scheduled times.

chalice-twitter-demo Demo to explain how to use AWS Chalice to connect to twitter API and change profile picture at scheduled times. Video Demo Click

Ahmed Mohamed 4 Dec 13, 2021
Python written Rule34 API

Python written Rule34 API

1 Nov 11, 2021
A tool written in Python used to instalock agents in VALORANT using the local API.

Valorant Instalock Tool v2.1.0 by Mr. SOSA A tool written in Python used to instalock agents in VALORANT using the local API. This is NOT a hotkey pro

Mr. SOSA 3 Nov 18, 2021
The official Python library for Shodan

shodan: The official Python library and CLI for Shodan Shodan is a search engine for Internet-connected devices. Google lets you search for websites,

John Matherly 2.1k Dec 31, 2022
This is Telegram Files Store Bot by @AbirHasan2005

PyroFilesStoreBot This is Telegram Parmanent Files Store Bot by @AbirHasan2005. Language: Python3 Library: Pyrogram Features: In PM Just Forward or Se

Abir Hasan 168 Dec 19, 2022
A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

SoundCloud 83 Dec 12, 2022
NiceHash Python Library and Command Line Rest API

NiceHash Python Library and Command Line Rest API Requirements / Modules pip install requests Required data and where to get it Following data is nee

Ashlin Darius Govindasamy 2 Jan 02, 2022
A discord nuking tool made by python, this also has nuke accounts, inbuilt Selfbot, Massreport, Token Grabber, Nitro Sniper and ALOT more!

Disclaimer: Rage Multi Tool was made for Educational Purposes This project was created only for good purposes and personal use. By using Rage, you agr

†† 50 Jul 19, 2022
a small cli to generate AWS Well Architected Reports on the road

well-architected-review This repo intends to publish some scripts related to Well Architected Reviews. war.py extracts in txt & xlsx files all the WAR

4 Mar 18, 2022
A discord bot wrapper for python have slash command

A discord bot wrapper for python have slash command

4 Dec 04, 2021
Discord RPC for Notion written in Python

Discord RPC for Notion This is a program that allows you to add your Notion workspace activities to your Discord profile. This project is currently un

Thuliumitation 1 Feb 10, 2022
Pretend to be a discord bot

Pretendabot © Pretend to be a discord bot! About Pretendabot© is an app that lets you become a discord bot!. It uses discord intrigrations(webhooks) a

Advik 3 Apr 24, 2022
Requests based multi-threaded script for increasing followers on Spotify

Proxyless Spotify Follow Bot Requests based multi-threaded script for increasing followers on Spotify. Click here to report bugs. Usage Download ZIP h

397 Jan 03, 2023
Telegram Reporter

[Telegram Reporter v.3 ] 🇮🇷 AliCybeRR 🇮🇷 [ AliCybeRR.Reporter feature ] Login Your Telegram account 👽 support Termux ❕ No Limits ⚡ Secure 🔐 Free

AliCybeRR 1 Jun 08, 2022
BingBot - A bot that will automate searches on bing

bingBot A bot that will automate searches on bing. To install this just download

Lukas 2 Jul 28, 2022
Weather_besac is a French twitter bot that tweet the weather of the city of Besançon in Franche-Comté in France every day at 8am and 4pm.

Weather Bot Besac Weather_besac is a French twitter bot that tweet the weather of the city of Besançon in Franche-Comté in France every day at 8am and

Rgld_ 1 Nov 15, 2021
Get charts, top artists and top songs WITHOUT LastFM API

LastFM Get charts, top artists and top songs WITHOUT LastFM API Usage Get stats (charts) We provide many filters and options to customize. Geo filter

4 Feb 11, 2022
This project checks the weather in the next 12 hours and sends an SMS to your phone number if it's going to rain to remind you to take your umbrella.

RainAlert-Request-Twilio This project checks the weather in the next 12 hours and sends an SMS to your phone number if it's going to rain to remind yo

9 Apr 15, 2022