African language Speech Recognition - Speech-to-Text

Overview

Swahili-Speech-To-Text

Table of Contents

Overview

This repository is used for week 4 challenge of 10Academy. The instructions for this project can be found in the challenge document.

Scenario

The World Food Program wants to deploy an intelligent form that collects nutritional information of food bought and sold at markets in two different countries in Africa - Ethiopia and Kenya. The design of this intelligent form requires selected people to install an app on their mobile phone, and whenever they buy food, they use their voice to activate the app to register the list of items they just bought in their own language. The intelligent systems in the app are expected to live to transcribe the speech-to-text and organize the information in an easy-to-process way in a database.

You work for the Tenacious data science consultancy, which is chosen to deliver speech-to-text technology for Swahili. Your responsibility is to build a deep learning model that is capable of transcribing a speech to text. The model you produce should be accurate and is robust against background noise.

Approach

The project is divided and implemented by the following phases

  • Data pre-processing
  • Modelling using deep learning
  • Serving predictions on a web interface
  • Interpretation & Reporting

Project Structure

The repository has a number of files including python scripts, jupyter notebooks, pdfs and text files. Here is their structure with a brief explanation.

data:

  • the folder where the dataset csv files are stored

models:

  • the folder where models' pickle files are stored

notebooks:

scripts

  • app_logger.py: a python script for logging
  • file_handler.py: a python script for handling reading and writing of csv, pickle and other files

tests:

  • the folder containing unit tests for components in the scripts

logs:

  • the folder containing log files (if it doesn't exist it will be created once logging starts)

root folder

Installation guide

git clone https://github.com/10-Academy-Batch-4-Week-4/Swahili-Speech-To-Text
cd Swahili-Speech-To-Text
pip install -r requirements.txt
Owner
African language Speech Recognition - Speech-to-Text (GROUP 5)
The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021) Project Page | Paper Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai GOF

xuxudong 97 Nov 10, 2022
HGCAE Pytorch implementation. CVPR2021 accepted.

Hyperbolic Graph Convolutional Auto-Encoders Accepted to CVPR2021 🎉 Official PyTorch code of Unsupervised Hyperbolic Representation Learning via Mess

Junho Cho 37 Nov 13, 2022
CVPR 2021: "The Spatially-Correlative Loss for Various Image Translation Tasks"

Spatially-Correlative Loss arXiv | website We provide the Pytorch implementation of "The Spatially-Correlative Loss for Various Image Translation Task

Chuanxia Zheng 89 Jan 04, 2023
Makes patches from huge resolution .svs slide files using openslide

openslide_patcher Makes patches from huge resolution .svs slide files using openslide Example collage I made from outputs:

2 Dec 23, 2021
SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

SSL_SLAM2 Lightweight 3-D Localization and Mapping for Solid-State LiDAR (Intel Realsense L515 as an example) This repo is an extension work of SSL_SL

Wang Han 王晗 1.3k Jan 08, 2023
A new data augmentation method for extreme lighting conditions.

Random Shadows and Highlights This repo has the source code for the paper: Random Shadows and Highlights: A new data augmentation method for extreme l

Osama Mazhar 35 Nov 26, 2022
22 Oct 14, 2022
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

VirTex: Learning Visual Representations from Textual Annotations Karan Desai and Justin Johnson University of Michigan CVPR 2021 arxiv.org/abs/2006.06

Karan Desai 533 Dec 24, 2022
DilatedNet in Keras for image segmentation

Keras implementation of DilatedNet for semantic segmentation A native Keras implementation of semantic segmentation according to Multi-Scale Context A

303 Mar 15, 2022
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

Daniel Zügner 131 Dec 13, 2022
A tool to visualise the results of AlphaFold2 and inspect the quality of structural predictions

AlphaFold Analyser This program produces high quality visualisations of predicted structures produced by AlphaFold. These visualisations allow the use

Oliver Powell 3 Nov 13, 2022
Explainer for black box models that predict molecule properties

Explaining why that molecule exmol is a package to explain black-box predictions of molecules. The package uses model agnostic explanations to help us

White Laboratory 172 Dec 19, 2022
Hand-distance-measurement-game - Hand Distance Measurement Game

Hand Distance Measurement Game This is program is made to calculate the distance

Priyansh 2 Jan 12, 2022
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening Introduction This is an implementation of the model used for breast

757 Dec 30, 2022
DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control

DI-smartcross DI-smartcross - Decision Intelligence Platform for Traffic Crossin

OpenDILab 213 Jan 02, 2023
Fully-automated scripts for collecting AI-related papers

AI-Paper-collector Fully-automated scripts for collecting AI-related papers List of Conferences to crawel ACL: 21-19 (including findings) EMNLP: 21-19

Gordon Lee 776 Jan 08, 2023
SNIPS: Solving Noisy Inverse Problems Stochastically

SNIPS: Solving Noisy Inverse Problems Stochastically This repo contains the official implementation for the paper SNIPS: Solving Noisy Inverse Problem

Bahjat Kawar 35 Nov 09, 2022
Efficient training of deep recommenders on cloud.

HybridBackend Introduction HybridBackend is a training framework for deep recommenders which bridges the gap between evolving cloud infrastructure and

Alibaba 111 Dec 23, 2022
code for our ECCV 2020 paper "A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation"

Code for our ECCV (2020) paper A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation. Prerequisites: python == 3.6.8 pytorch ==1.1.0

32 Nov 27, 2022
Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

MIT CSAIL Computer Vision 424 Dec 01, 2022