Text processing (tokenizing sentences and representing them as vectors) combined with RNN, LSTM, and GRU models to build a classifier that detects which of 17 languages a sentence is written in.

Last update: Dec 15, 2022

Related tags

Deep Learning Language-Identifier

Overview

Language Identifier

What is this?

The goal of this project is to build a model that predicts the language of a given sentence. The pipeline tokenizes the text, represents each sentence as a vector, and applies recurrent architectures (RNN, LSTM, and GRU) to classify the sentence into one of 17 languages.
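The tokenizing and vector-representation step can be illustrated with a minimal, dependency-free sketch. It is not the project's own preprocessing code: the helper names `build_vocab` and `encode`, the whitespace tokenization, and the fixed length of 8 are all assumptions chosen for illustration.

```python
from collections import Counter

PAD_ID = 0  # assumption: index 0 is reserved for padding
OOV_ID = 1  # assumption: index 1 stands for out-of-vocabulary words


def build_vocab(sentences, max_words=10_000):
    """Map the most frequent lowercase words to integer ids starting at 2."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(sentence, vocab, max_len=8):
    """Turn a sentence into a fixed-length list of ids (truncate, then pad)."""
    ids = [vocab.get(word, OOV_ID) for word in sentence.lower().split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


vocab = build_vocab(["the cat sat", "the dog"])
vector = encode("the bird sat", vocab, max_len=5)
print(vector)
```

Each sentence becomes an integer sequence of the same length, which is the input shape an embedding layer followed by a recurrent layer expects.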

Dataset

Language Detection: a small language detection dataset containing text samples in 17 different languages.
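As a rough sketch of how such a dataset might be prepared for training, the snippet below encodes language names as integer labels and holds out a validation set. The column names `Text` and `Language`, the 25% validation share, and the tiny in-memory table standing in for the real file are assumptions, not details confirmed by this project.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Tiny stand-in for the real dataset (assumed columns: Text, Language).
df = pd.DataFrame({
    "Text": [
        "good morning", "see you soon", "thank you", "how are you",
        "bonjour le monde", "merci beaucoup", "bonne nuit", "comment allez-vous",
        "buenos dias", "muchas gracias", "hasta luego", "como estas",
    ],
    "Language": ["English"] * 4 + ["French"] * 4 + ["Spanish"] * 4,
})

# Convert language names to integer class ids (classes are sorted alphabetically).
encoder = LabelEncoder()
labels = encoder.fit_transform(df["Language"])

# Stratify so every language appears in both the training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    df["Text"], labels, test_size=0.25, stratify=labels, random_state=0
)
print(list(encoder.classes_), len(X_train), len(X_val))
```

Stratifying the split matters for a small dataset, since a purely random split could leave a language underrepresented in validation.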

Results

All models achieved high accuracy, even when a single convolution layer was used instead of an LSTM or GRU. The GRU achieved the highest accuracy: 99% training accuracy and 94% validation accuracy.
The convolution-layer model also achieved high accuracy, about 95% validation accuracy.
Using fewer embedding dimensions lets the model reach high accuracy faster, but in the Embedding Projector many words end up grouped with words from other languages.
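The model variants compared above can be sketched in Keras roughly as follows. This is a hypothetical reconstruction, not the author's architecture: the vocabulary size, sequence length, layer width of 64, kernel size, and the helper names `encoder_layers` and `build_classifier` are assumptions. Only the 17 output classes and the 32 and 3 embedding dimensions come from this document.

```python
import numpy as np
import tensorflow as tf

NUM_LANGUAGES = 17   # from the project description
VOCAB_SIZE = 20_000  # assumption: the real vocabulary size is not stated
MAX_LEN = 50         # assumption: padded sentence length in tokens


def encoder_layers(kind):
    """Return the sequence-encoding layers for one model variant."""
    if kind == "gru":
        return [tf.keras.layers.GRU(64)]
    if kind == "lstm":
        return [tf.keras.layers.LSTM(64)]
    if kind == "conv":
        # A single convolution layer, pooled over time to a fixed-size vector.
        return [
            tf.keras.layers.Conv1D(64, 5, activation="relu"),
            tf.keras.layers.GlobalMaxPooling1D(),
        ]
    raise ValueError(f"unknown encoder kind: {kind}")


def build_classifier(kind="gru", embed_dim=32):
    """Embedding -> encoder -> softmax over the 17 languages."""
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, embed_dim),
        *encoder_layers(kind),
        tf.keras.layers.Dense(NUM_LANGUAGES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model


# A batch of two padded token-id sequences (all zeros here, just to check shapes).
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
probs = np.asarray(build_classifier("gru", embed_dim=3)(dummy_batch))
print(probs.shape)
```

Switching `kind` between `"gru"`, `"lstm"`, and `"conv"`, or `embed_dim` between 32 and 3, gives the kind of comparison the results describe while keeping everything else fixed.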

32 Embedding dimensions examples

3 Embedding dimensions examples

GRU Accuracy and Loss

GRU Confusion matrix

Libraries

TensorFlow
Scikit-learn
NumPy
Pandas
Matplotlib


Owner

Hossam Asaad
