Code for csig audio deepfake detection

Related tags

AudioCSIG_audio
Overview

FMFCC Audio Deepfake Detection Solution

This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection track . The ranking can be seen here

Authors

Institution: Shenzhen Key Laboratory of Media Information Content Security(MICS)

Team name: Forensics_SZU

Username:

Pipeline

EfficientNet-B2 + mel features (不同mel参数的训练两个模型ensemble)

Requirements:

  • 激活已经配置好的环境[建议使用这个]
conda activate audio
  • 或者执行这个重新配置
pip install -r requirements.txt
  • 可能存在部分包漏了安装,可以通过pip install 方式继续安装完整

Submission

  • cd到代码位置:/home/audio5/py_project/CSIG_audio
cd /home/audio5/py_project/CSIG_audio
  • 接着,修改inference.py 文件的第98行中的root_path路劲为自己语音测试集的路劲

1. 预测第一个模型

  • 然后执行
python inference.py
  • 运行结束后,可以在 "./output/results/efficientnet-b2_96_k2e6_hop48_t' 看到results.json文件

2. 预测第二个模型

  • 2.1 修改json_root的路劲:把inference.py的105行的代码注释掉, 然后解除106行的注释,更新了第二个模型预测结果保存的路劲,修改后的代码如下
105        # json_root = f'./output/results/{model_name}_96_k2e6_hop48_t'
106        json_root = f'./output/results/{model_name}_96_k2e9_t'
  • 2.2 修改model_path的路劲(即改变模型权值):把inference.py的114行的代码注释掉, 然后解除115行的注释,更新了模型权值的路劲,修改后的代码如下
114        # model_path = './output/weights/efficientnet-b2_96_hop48_k2/audio6_acc0.9832.pth'
115        model_path = './output/weights/efficientnet-b2_96_k2/audio9_acc0.9752.pth'
  • 2.3 修改mel feature的配置参数:把dataset/preprocess/vggish_params.py中的 35行代码"EXAMPLE_HOP_SECONDS = 0.48" 改为"EXAMPLE_HOP_SECONDS = 0.96" 特别要注意EXAMPLE_HOP_SECONDS这个参数,验证结束后,需要重新修改为0.48, 方便最终决赛运行
  • 修改上述三点后,然后执行, 另外前面的两个修改也需要恢复。
python inference.py
  • 运行结束后,可以在 "./output/results/efficientnet-b2_96_k2e9_t' 看到results.json文件

Ensemble

  • 集成刚刚生成的两个json文件:把inference.py 第88行的"json_ensemble = False"改为"json_ensemble = True"
  • 修改后执行:
python inference.py
  • 运行结束后,可以在 "./output/results/ensemble_01_t' 看到results.json文件,该文件就是我们最终提交的结果json文件

注意:验证结束后,需要恢复上述修改到原始状态,方便最终的决赛

Owner
BokingChen
BokingChen
SinGlow: Generative Flow for SVS tasks in Tensorflow 2

SinGlow is a part of my Singing voice synthesis system. It can extract features of sound, particularly songs and musics. Then we can use these features (or perfect encoding) for feature migrating tas

Haobo Yang 8 Aug 22, 2022
Terminal-based audio-to-text converter

att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in python, enabling you to convert .wa

Sven Eschlbeck 4 Dec 15, 2022
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Project DeepSpeech DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Spee

Mozilla 20.8k Jan 03, 2023
Delta TTA(Text To Audio) SoftWare

Text-To-Audio-Windows Delta TTA(Text To Audio) SoftWare Info You Can Use It For Convert Your Text To Audio File You Just Write Your Text And Your End

Delta Inc. 2 Dec 14, 2021
Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files according to their common names

Batch Sorting Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files accord

David Mainoo 1 Oct 29, 2021
This Bot can extract audios and subtitles from video files

Send any valid video file and the bot shows you available streams in it that can be extracted!!

TroJanzHEX 56 Nov 22, 2022
Pythonic bindings for FFmpeg's libraries.

PyAV PyAV is a Pythonic binding for the FFmpeg libraries. We aim to provide all of the power and control of the underlying library, but manage the gri

PyAV 1.8k Jan 03, 2023
Bot duniya Music Player

Bot duniya Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) 2nd Telegram Account (ne

Aman Vishwakarma 16 Oct 21, 2022
Simple, hackable offline speech to text - using the VOSK-API.

Nerd Dictation Offline Speech to Text for Desktop Linux. This is a utility that provides simple access speech to text for using in Linux without being

Campbell Barton 844 Jan 07, 2023
Audio fingerprinting and recognition in Python

dejavu Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here: How it works Dejavu can memorize audio by liste

Will Drevo 6k Jan 06, 2023
controls volume using hand gestures

controls volume using hand gestures

1 Oct 11, 2021
commonfate 📦commonfate 📦 - Common Fate Model and Transform.

Common Fate Transform and Model for Python This package is a python implementation of the Common Fate Transform and Model to be used for audio source

Fabian-Robert Stöter 18 Jan 08, 2022
Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

Gammatone Filterbank Toolkit Utilities for analysing sound using perceptual models of human hearing. Jason Heeris, 2013 Summary This is a port of Malc

Jason Heeris 188 Dec 14, 2022
An audio digital processing toolbox based on a workflow/pipeline principle

AudioTK Audio ToolKit is a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in

Matthieu Brucher 238 Oct 18, 2022
A2DP agent for promiscuous/permissive audio sinc.

Promiscuous Bluetooth audio sinc A2DP agent for promiscuous/permissive audio sinc for Linux. Once installed, a Bluetooth client, such as a smart phone

Jasper Aorangi 4 May 27, 2022
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple

Theodoros Giannakopoulos 5.1k Jan 02, 2023
Real-Time Spherical Microphone Renderer for binaural reproduction in Python

ReTiSAR Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2]. Contents: | Requirements | Setup | Q

Division of Applied Acoustics at Chalmers University of Technology 51 Dec 17, 2022
Inner ear models for Python

cochlea cochlea is a collection of inner ear models. All models are easily accessible as Python functions. They take sound signal as input and return

98 Jan 05, 2023
[Singing Log] Let your program learn to sing!

[Singing Log] Let your program learn to sing! You must have thought this was changelog when you saw the English title, but it's not, it's chànggēlog. What it does is allow your program to print logs

黄巍 22 Sep 03, 2022
❤️ This Is The EzilaXMusicPlayer Advaced Repo 🎵

Telegram EzilaXMusicPlayer Bot 🎵 A bot that can play music on telegram group's voice Chat ❤️ Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.7+

Sadew Jayasekara 11 Nov 12, 2022