Baseline of DCASE 2020 task 4

Last update: Oct 18, 2022

Related tags

Deep Learning dcase20_task4

Overview

Couple Learning for SED

This repository provides the data and source code for the sound event detection (SED) task.
The improvement brought by the Couple Learning method is verified against the dcase20-task4 baseline.
For information about dcase20-task4, please visit its GitHub repository.
For information about Couple Learning, please see the paper: Couple Learning: Mean Teacher method with pseudo-labels improves semi-supervised deep learning results.

Couple Learning model

See the PLG-MT_run folder for more information.

Reproducing the results

See the PLG-MT_run folder.

Dependencies

- Python >= 3.6
- pytorch >= 1.0
- cudatoolkit >= 9.0
- pandas >= 0.24.1
- scipy >= 1.2.1
- pysoundfile >= 0.10.2
- scaper >= 1.3.5
- librosa >= 0.6.3
- youtube-dl >= 2019.4.30
- tqdm >= 4.31.1
- ffmpeg >= 4.1
- dcase_util >= 0.2.5
- sed-eval >= 0.2.1
- psds-eval >= 0.1.0
- desed >= 1.3.0

A simplified installation procedure is given below for a Python 3.6 based Anaconda distribution on a Linux system:

Install Anaconda
Run conda_create_environment.sh (running it line by line is recommended)
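
Once the environment is created, the minimum versions listed above can be checked with a few lines of Python. The following is only a minimal sketch, not part of the repository: the names `MINIMUM_VERSIONS`, `parse_version`, `meets_minimum` and `find_outdated` are hypothetical helpers, and only a subset of the listed packages is included for brevity.

```python
import re

# Minimum versions copied from the dependency list (subset, for illustration).
MINIMUM_VERSIONS = {
    "pandas": "0.24.1",
    "scipy": "1.2.1",
    "librosa": "0.6.3",
    "tqdm": "4.31.1",
    "dcase_util": "0.2.5",
    "desed": "1.3.0",
}


def parse_version(text):
    """Turn a version string such as '1.10.0+cu113' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def find_outdated(installed_versions, minimums=MINIMUM_VERSIONS):
    """Return {name: (installed, minimum)} for missing or too-old packages."""
    problems = {}
    for name, minimum in minimums.items():
        installed = installed_versions.get(name)
        if installed is None or not meets_minimum(installed, minimum):
            problems[name] = (installed, minimum)
    return problems
```

The `installed_versions` dictionary has to be supplied by the caller. On Python 3.8 or later it could be filled with `importlib.metadata.version(name)` for each package; on Python 3.6 the output of `conda list` or `pip freeze` would have to be parsed instead.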

Dataset

All the scripts to get the data (soundbank, generated, separated) are in the scripts folder; they use Python files from the data_generation folder.

Scripts to generate the dataset

In the scripts/ folder, you can find the different steps to:

Download recorded data and synthetic material
Generate synthetic soundscapes
Reverberate synthetic data (not used in the baseline)
Separate sources of recorded and synthetic mixtures

It is likely that you'll have download issues with the real recordings. At the end of the download, please send a mail with the TSV files created in the missing_files directory.

However, if none of the audio files have been downloaded, the cause is probably an internet or proxy problem. See the DESED repo or the DESED website for more info.
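
To get a quick overview of what failed to download, the TSV files in the missing_files directory can be summarized before sending them. This is a rough sketch under assumptions: the helper names `count_rows` and `summarize_missing` are hypothetical, and the layout of the TSV files (one header row, then one row per missing file) is assumed rather than taken from the repository.

```python
import csv
from pathlib import Path


def count_rows(tsv_path, has_header=True):
    """Count data rows in one tab-separated file, skipping blank lines."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if has_header and rows:
        rows = rows[1:]  # assumption: the first non-blank row is a header
    return len(rows)


def summarize_missing(directory="missing_files", has_header=True):
    """Map each TSV file name in `directory` to its number of data rows."""
    folder = Path(directory)
    if not folder.is_dir():
        return {}  # nothing to report if the directory does not exist
    return {
        path.name: count_rows(path, has_header)
        for path in sorted(folder.glob("*.tsv"))
    }
```

If every recorded subset appears here with a count close to its full size, the problem is more likely the connection or proxy than individual unavailable clips.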

Base dataset

The dataset for sound event detection of DCASE2020 task 4 is composed of:

Train:
- weak (DESED, recorded, 1 578 files)
- unlabel_in_domain (DESED, recorded, 14 412 files)
- synthetic soundbank (DESED, synthetic, 2060 background (SINS only) + 1006 foreground files)
Validation (DESED, recorded, 1 168 files):
- test2018 (288 files)
- eval2018 (880 files)
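
The file counts above can serve as a sanity check after downloading. The sketch below simply encodes them and verifies that the validation subsets add up to the stated total; the names `BASE_DATASET`, `SOUNDBANK` and `subset_total` are hypothetical and not part of the repository.

```python
# File counts copied from the base dataset description.
BASE_DATASET = {
    "train": {"weak": 1578, "unlabel_in_domain": 14412},
    "validation": {"test2018": 288, "eval2018": 880},
}

# Synthetic soundbank: SINS backgrounds plus foreground events.
SOUNDBANK = {"background": 2060, "foreground": 1006}


def subset_total(split):
    """Sum the file counts of all subsets in a split."""
    return sum(BASE_DATASET[split].values())


# The two validation subsets add up to the stated 1 168 recorded files.
assert subset_total("validation") == 1168
```

Comparing these numbers with the number of audio files actually present on disk shows how many recordings could not be downloaded.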

Baselines dataset

SED baseline

Train:
- weak
- unlabel_in_domain
- synthetic20/soundscapes (split into train/valid, 80%/20%)
Validation:
- validation
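
An 80%/20% train/valid split of the synthetic soundscapes can be sketched as follows. This is an illustration only: the function `split_train_valid` is hypothetical, and the baseline code may perform its split differently (for example from a fixed list of files).

```python
import random


def split_train_valid(filenames, valid_fraction=0.2, seed=0):
    """Shuffle file names reproducibly and split them into train and valid."""
    files = sorted(filenames)  # sort first so the result does not depend on input order
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    rng.shuffle(files)
    n_valid = int(round(len(files) * valid_fraction))
    return files[n_valid:], files[:n_valid]


# Example with placeholder file names.
train, valid = split_train_valid(["{}.wav".format(i) for i in range(10)])
print(len(train), len(valid))
```

Fixing the seed keeps the validation subset identical across runs, which makes results comparable between experiments.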
