Doing the asl sign language classification on static images using graph neural networks.

Overview

SignLangGNN

When GNNs 💜 MediaPipe. This is a starter project where I tried to implement some traditional image classification problem i.e. the ASL sign language classification problem. The twist here is we used the graph generated from the hand images using mediapipe. And the graph I got, I extrated the {x, y, z} co-ordinates of the nodes and also the edge index for the connecteion and translated this image classification problem to a graph classiciation problem.

Project Structure

--------- Data
            |___ CSVs # containing the co-ordinates of per images
            |___ raw
                   |___ train.csv
                   |___ valid.csv
                   |___ test.csv 
            |___ ImageData
                   |___ asl_alphabet_test
                            |___ A/
                            |___ B/ 
                            ....
                            |___ space

                   |___ asl_alphabet_train
            |
            |___ Models # the GNN models
            |___ src
                   |__ dataset.py # pyg custom data
                   |__ train.py   # train loop
                   |__ utils.py   # different utility functions
            |
            |___ main.py # from data to train
            |___ run.py  # real time video visualization

I used PyTorch geometric and PyTorch for the project. To view the results in details head over to the IPYNB folder and see the first IPYNB file. To run this project first clone this repo using this command:

git clone https://github.com/Anindyadeep/SignLangGNN

After that run the main.py using this command. Other things will be managed automatically, provided al,l the essential libraries are installed.

python3 main.py

Initial Results

The traning and validation process went smooth as with a very simple base model it gave an train acc of 0.85 and validation acc of 0.86. It also provided an test acc of 0.84. The model was run for 8 epochs. The model also gets confused with some sort of examples and we can say that it currently suffers from adverserial attacks.

Improvements

These are the improvements we can do with this project:

  1. Improved GNN models. We can make more robust and complex models and improve the performance.

  2. Adding edge features. Some of the edge features like distance between two nodes and the angle between two nodes could produce some potential improvements to the performance of our model.

Future Works

Using Temporal Graph Neural Nets could make more robust and accurate model for this kind of problem. But for that we need temporal data like videos instaed of images, so that we could generate static temporal graphs and compute on them as a dynamic graph sequence problem.

Owner
Deep learning enthusiast, like to know something new every time....
QKeras: a quantization deep learning library for Tensorflow Keras

QKeras github.com/google/qkeras QKeras 0.8 highlights: Automatic quantization using QKeras; Stochastic behavior (including stochastic rouding) is disa

Google 437 Jan 03, 2023
Official repo for AutoInt: Automatic Integration for Fast Neural Volume Rendering in CVPR 2021

AutoInt: Automatic Integration for Fast Neural Volume Rendering CVPR 2021 Project Page | Video | Paper PyTorch implementation of automatic integration

Stanford Computational Imaging Lab 149 Dec 22, 2022
GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions. This is the official code release fo

owl 37 Dec 24, 2022
Official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

This is the official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

peng gao 42 Nov 26, 2022
Learning to Prompt for Continual Learning

Learning to Prompt for Continual Learning (L2P) Official Jax Implementation L2P is a novel continual learning technique which learns to dynamically pr

Google Research 207 Jan 06, 2023
[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

SSVC The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning] samples of the

7 Oct 26, 2022
DziriBERT: a Pre-trained Language Model for the Algerian Dialect

DziriBERT DziriBERT is the first Transformer-based Language Model that has been pre-trained specifically for the Algerian Dialect. It handles Algerian

117 Jan 07, 2023
Graph Transformer Architecture. Source code for

Graph Transformer Architecture Source code for the paper "A Generalization of Transformer Networks to Graphs" by Vijay Prakash Dwivedi and Xavier Bres

NTU Graph Deep Learning Lab 561 Jan 08, 2023
Keras Image Embeddings using Contrastive Loss

Keras-Image-Embeddings-using-Contrastive-Loss Image to Embedding projection in vector space. Implementation in keras and tensorflow for custom data. B

Shravan Anand K 5 Mar 21, 2022
Auto-updating data to assist in investment to NEPSE

Symbol Ratios Summary Sector LTP Undervalued Bonus % MEGA Strong Commercial Banks 368 5 10 JBBL Strong Development Banks 568 5 10 SIFC Strong Finance

Amit Chaudhary 16 Nov 01, 2022
LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)

LV-BERT Introduction In this repo, we introduce LV-BERT by exploiting layer variety for BERT. For detailed description and experimental results, pleas

Weihao Yu 14 Aug 24, 2022
Azion the best solution of Edge Computing in the world.

Azion Edge Function docker action Create or update an Edge Functions on Azion Edge Nodes. The domain name is the key for decision to a create or updat

8 Jul 16, 2022
Filtering variational quantum algorithms for combinatorial optimization

Current gate-based quantum computers have the potential to provide a computational advantage if algorithms use quantum hardware efficiently.

1 Feb 09, 2022
Efficient semidefinite bounds for multi-label discrete graphical models.

Low rank solvers #################################### benchmark/ : folder with the random instances used in the paper. ############################

1 Dec 08, 2022
Code release for ICCV 2021 paper "Anticipative Video Transformer"

Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page

Facebook Research 123 Dec 13, 2022
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation (NeurIPS2021 Benchmark and Dataset Track)

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation by Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, and Yanfei Zh

Kingdrone 174 Dec 22, 2022
Winners of the Facebook Image Similarity Challenge

Winners of the Facebook Image Similarity Challenge

DrivenData 111 Jan 05, 2023
Türkiye Canlı Mobese Görüntülerinde Profesyonel Nesne Takip Sistemi

Türkiye Mobese Görüntü Takip Türkiye Mobese görüntülerinde OPENCV ve Yolo ile takip sistemi Multiple Object Tracking System in Turkish Mobese with OPE

15 Dec 22, 2022
Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).

Densely Connected Convolutional Networks (DenseNets) This repository contains the code for DenseNet introduced in the following paper Densely Connecte

Zhuang Liu 4.5k Jan 03, 2023
Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.

Deep-Unsupervised-Domain-Adaptation Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E.

Alan Grijalva 49 Dec 20, 2022