👐OpenHands : Making Sign Language Recognition Accessible (WiP 🚧👷‍♂️🏗)

Last update: Dec 12, 2022


Overview

👐 OpenHands: Sign Language Recognition Library

Making Sign Language Recognition Accessible

Check the documentation on how to use the library:
ReadTheDocs: 👐 OpenHands

License

This project is released under the Apache 2.0 license.

Citation

If you find our work useful in your research, please consider citing us:

@misc{2021_openhands_slr_preprint,
      title={OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages}, 
      author={Prem Selvaraj and Gokul NC and Pratyush Kumar and Mitesh Khapra},
      year={2021},
      eprint={2110.05877},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Comments

Question about GSL dataset

I cannot find the isolated gloss sign language recognition (GSL isol.) data (xxx_signerx_repx_glosses). At https://zenodo.org/record/3941811 I can only find the continuous sign language recognition data (xxx_signerx_repx_sentences). How can I get the isolated data?

Thank you very much for any information about this.

opened by snorlaxse 6

Question about 'Config-based training'

I tried the code from the Config-based training section, as shown below.

import omegaconf
from openhands.apis.classification_model import ClassificationModel
from openhands.core.exp_utils import get_trainer
import os 

os.environ["CUDA_VISIBLE_DEVICES"]="2,3"
cfg = omegaconf.OmegaConf.load("examples/configs/lsa64/decoupled_gcn.yaml")
trainer = get_trainer(cfg)


model = ClassificationModel(cfg=cfg, trainer=trainer)
model.init_from_checkpoint_if_available()
model.fit()

/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:747: UserWarning: You requested multiple GPUs but did not specify a backend, e.g. `Trainer(accelerator="dp"|"ddp"|"ddp2")`. Setting `accelerator="ddp_spawn"` for you.
  "You requested multiple GPUs but did not specify a backend, e.g."
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
/raid/xxx/OpenHands/openhands/apis/inference.py:21: LightningDeprecationWarning: The `LightningModule.datamodule` property is deprecated in v1.3 and will be removed in v1.5. Access the datamodule through using `self.trainer.datamodule` instead.
  self.datamodule.setup(stage=stage)
Found 64 classes in train splits
Found 64 classes in test splits
Train set size: 2560
Valid set size: 320
/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py:424: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
  f"DataModule.{name} has already been called, so it will not be called again. "
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [2,3]
Traceback (most recent call last):
  File "study_train.py", line 15, in <module>
    model.fit()
  File "/raid/xxx/OpenHands/openhands/apis/classification_model.py", line 104, in fit
    self.trainer.fit(self, self.datamodule)
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
    self._run(model)
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 917, in _run
    self._dispatch()
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 985, in _dispatch
    self.accelerator.start_training(self)
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 158, in start_training
    mp.spawn(self.new_process, **self.mp_spawn_kwargs)
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/raid/xxx/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 148, in start_processes
    process.start()
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/raid/xxx/anaconda3/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'DecoupledGCN_TCN_unit.__init__.<locals>.<lambda>'

opened by snorlaxse 4
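
The `AttributeError` at the end of the traceback is raised while `ddp_spawn` pickles the model to hand it to each worker process. A lambda created inside `__init__` has no importable name, so it cannot be pickled. The following is a minimal sketch of the failure and one possible fix, using only the standard library. The class and function names are hypothetical stand-ins, not the actual `DecoupledGCN_TCN_unit` code:

```python
import pickle


class UnitWithLambda:
    """Stand-in for a module that stores a lambda defined inside __init__."""

    def __init__(self):
        # A local lambda cannot be pickled by reference.
        self.residual = lambda x: 0


def zero_residual(x):
    """Module-level replacement for the lambda; picklable by its qualified name."""
    return 0


class UnitWithFunction:
    """Same idea, but the callable lives at module level."""

    def __init__(self):
        self.residual = zero_residual


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    print("lambda attribute picklable:  ", is_picklable(UnitWithLambda()))
    print("function attribute picklable:", is_picklable(UnitWithFunction()))
```

The warning at the top of the log also lists `accelerator="ddp"` as an alternative backend. Whether that avoids the pickling step in this setup, and how to set it through the YAML config, would need to be checked against the installed Lightning version.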

installation issue
Hello, thank you for providing such a great framework. I get an error when I import the module. Could you please help me? Code:

import omegaconf
from openhands.apis.classification_model import ClassificationModel
from openhands.core.exp_utils import get_trainer

cfg = omegaconf.OmegaConf.load("1.yaml")
trainer = get_trainer(cfg)

model = ClassificationModel(cfg=cfg, trainer=trainer)
model.init_from_checkpoint_if_available()
model.fit()

ERROR:

Traceback (most recent call last):
  File "/home/hxz/project/pose_SLR/main.py", line 3, in <module>
    from openhands.apis.classification_model import ClassificationModel
ModuleNotFoundError: No module named 'openhands.apis'
opened by Xiaolong-han 4
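
One way to narrow this down is to check which `openhands` package Python actually resolves. An install that lacks the `apis` subpackage, or an unrelated package with the same top-level name, would produce this error. This is a small diagnostic sketch using only the standard library; the helper name `locate_module` is made up for illustration:

```python
import importlib.util


def locate_module(name):
    """Return the file a dotted module name resolves to.

    Returns None if the module (or one of its parent packages) cannot be
    imported, or if it resolves to a namespace package without a file.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return None if spec is None else spec.origin


if __name__ == "__main__":
    for name in ("openhands", "openhands.apis"):
        print(f"{name}: {locate_module(name)}")
```

If `openhands` resolves to a path outside the cloned repository while `openhands.apis` is reported as missing, reinstalling from the repository source is a reasonable next step to try.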
visibility object

https://github.com/narVidhai/SLR/blob/2f26455c7cb530265618949203859b953224d0aa/scripts/mediapipe_extract.py#L48

Doesn't this object contain a visibility value as well? If so, we could add some conditioning logic and merge it with the above function.
enhancement

opened by grohith327 3
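
MediaPipe pose landmarks carry a `visibility` score alongside `x`, `y` and `z`. One possible conditioning step is to mask out joints whose visibility falls below a threshold. It is sketched here on a plain array rather than on the script's actual objects. The `(num_joints, 4)` layout, the threshold value and the function name are assumptions for illustration:

```python
import numpy as np


def mask_low_visibility(landmarks, threshold=0.5, fill_value=0.0):
    """Blank out joints with low visibility.

    landmarks: array-like of shape (num_joints, 4) holding x, y, z, visibility.
    Returns (coords, visible): coords has shape (num_joints, 3) with
    low-visibility joints set to fill_value; visible is a boolean mask.
    """
    landmarks = np.asarray(landmarks, dtype=np.float32)
    coords = landmarks[:, :3].copy()
    visible = landmarks[:, 3] >= threshold
    coords[~visible] = fill_value
    return coords, visible


if __name__ == "__main__":
    frame = [[0.1, 0.2, 0.3, 0.9],   # clearly visible joint
             [0.4, 0.5, 0.6, 0.1]]   # mostly occluded joint
    coords, visible = mask_low_visibility(frame)
    print(coords)
    print(visible)
```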

About the wrong st_gcn checkpoints files provided on GSL

import omegaconf
from openhands.apis.inference import InferenceModel

cfg = omegaconf.OmegaConf.load("GSL/gsl/st_gcn/config.yaml")
model = InferenceModel(cfg=cfg)
model.init_from_checkpoint_if_available()
if cfg.data.test_pipeline.dataset.inference_mode:
    model.test_inference()
else:
    model.compute_test_accuracy()

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_6585/2983784194.py in <module>
      4 cfg = omegaconf.OmegaConf.load("GSL/gsl/st_gcn/config.yaml")
      5 model = InferenceModel(cfg=cfg)
----> 6 model.init_from_checkpoint_if_available()
      7 if cfg.data.test_pipeline.dataset.inference_mode:
      8     model.test_inference()

~/OpenHands/openhands/apis/inference.py in init_from_checkpoint_if_available(self, map_location)
     47         print(f"Loading checkpoint from: {ckpt_path}")
     48         ckpt = torch.load(ckpt_path, map_location=map_location)
---> 49         self.load_state_dict(ckpt["state_dict"], strict=False)
     50         del ckpt
     51 

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1050         if len(error_msgs) > 0:
   1051             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
-> 1052                                self.__class__.__name__, "\n\t".join(error_msgs)))
   1053         return _IncompatibleKeys(missing_keys, unexpected_keys)
   1054 

RuntimeError: Error(s) in loading state_dict for InferenceModel:
	size mismatch for model.encoder.A: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.st_gcn_networks.0.gcn.conv.weight: copying a param with shape torch.Size([128, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 2, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.0.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
	size mismatch for model.encoder.st_gcn_networks.1.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.1.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
	size mismatch for model.encoder.st_gcn_networks.2.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.2.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
	size mismatch for model.encoder.st_gcn_networks.3.gcn.conv.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 64, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.3.gcn.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([192]).
	size mismatch for model.encoder.st_gcn_networks.4.gcn.conv.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 64, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.4.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
	size mismatch for model.encoder.st_gcn_networks.5.gcn.conv.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.5.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
	size mismatch for model.encoder.st_gcn_networks.6.gcn.conv.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 128, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.6.gcn.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([384]).
	size mismatch for model.encoder.st_gcn_networks.7.gcn.conv.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 128, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.7.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for model.encoder.st_gcn_networks.8.gcn.conv.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.8.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for model.encoder.st_gcn_networks.9.gcn.conv.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([768, 256, 1, 1]).
	size mismatch for model.encoder.st_gcn_networks.9.gcn.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for model.encoder.edge_importance.0: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.1: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.2: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.3: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.4: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.5: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.6: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.7: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.8: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).
	size mismatch for model.encoder.edge_importance.9: copying a param with shape torch.Size([2, 27, 27]) from checkpoint, the shape in current model is torch.Size([3, 27, 27]).

opened by snorlaxse 2
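
The mismatches follow a pattern. Every adjacency-like tensor has a leading dimension of 2 in the checkpoint where the current model expects 3, and the GCN output channels differ by the same factor (128 = 2 × 64 versus 192 = 3 × 64). That suggests the model was built with a different number of graph partitions than the checkpoint was trained with, although this is an inference from the shapes rather than a confirmed cause. A small helper like the one below (hypothetical, not part of OpenHands) can list such differences from shape dictionaries, built for example with `{k: tuple(v.shape) for k, v in state_dict.items()}`:

```python
def diff_shapes(ckpt_shapes, model_shapes):
    """Map parameter name -> (checkpoint shape, model shape) for every mismatch.

    Parameters present in only one of the two dictionaries are ignored.
    """
    mismatches = {}
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Two of the shapes reported in the error message above.
ckpt_shapes = {
    "model.encoder.A": (2, 27, 27),
    "model.encoder.st_gcn_networks.0.gcn.conv.weight": (128, 2, 1, 1),
}
model_shapes = {
    "model.encoder.A": (3, 27, 27),
    "model.encoder.st_gcn_networks.0.gcn.conv.weight": (192, 2, 1, 1),
}

for name, (ckpt_shape, model_shape) in diff_shapes(ckpt_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```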

Refactoring code to remove bugs, code smells
The following changes were made as part of this PR:

Refactored data loading component of the package

Created/Renamed files under slr.datasets.isolated to allow for modularization

Added __init__.py files for importing
opened by grohith327 1
ST-GCN does not work for mediapipe

Currently the openpose layout seems to be hardcoded in graph_utils.py's Graph class. Should we also add a layout for mediapipe, or pass the joints via yml?
bug

opened by GokulNC 1
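
One way to avoid a hardcoded layout is to build the adjacency matrix from an edge list, which could in turn come from a config file. This is a minimal NumPy sketch. The function name and the three-joint toy layout are placeholders, and the symmetric normalization shown is one common choice, not necessarily what `graph_utils.py` does:

```python
import numpy as np


def build_adjacency(num_joints, edges, add_self_loops=True):
    """Build a symmetrically normalized adjacency matrix D^-1/2 A D^-1/2.

    edges: iterable of (i, j) joint-index pairs, treated as undirected.
    """
    adjacency = np.zeros((num_joints, num_joints), dtype=np.float32)
    for i, j in edges:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    if add_self_loops:
        np.fill_diagonal(adjacency, 1.0)

    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = degree[connected] ** -0.5  # isolated joints stay at 0
    return adjacency * inv_sqrt[:, None] * inv_sqrt[None, :]


if __name__ == "__main__":
    # Toy chain of three joints: 0 - 1 - 2
    print(build_adjacency(3, [(0, 1), (1, 2)]))
```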
Scale normalization for pose

For example, if the signer is moving forward or backward in the video, this augmentation will help normalize the scale throughout the video: https://github.com/AmitMY/pose-format#data-normalization

Will involve explicitly specifying the joint (edge) based on which scaling has to be performed.
enhancement

opened by GokulNC 1
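
The idea can be sketched as follows: for each frame, measure the length of a chosen reference edge (for example, between the two shoulders) and divide all coordinates by it. This NumPy version is an illustration under assumed conventions, namely poses shaped `(frames, joints, coords)` and a hypothetical function name. It is not the `pose-format` implementation:

```python
import numpy as np


def normalize_scale(poses, joint_a, joint_b, eps=1e-6):
    """Normalize each frame by the length of the edge (joint_a, joint_b).

    poses: array-like of shape (frames, joints, coords).
    Each frame is centered on the midpoint of the reference edge and divided
    by the edge length. Frames where the edge is shorter than eps are only
    centered, not rescaled.
    """
    poses = np.asarray(poses, dtype=np.float32)
    a = poses[:, joint_a]                      # (frames, coords)
    b = poses[:, joint_b]
    center = (a + b) / 2.0
    length = np.linalg.norm(a - b, axis=-1)    # (frames,)
    length = np.where(length < eps, 1.0, length)
    return (poses - center[:, None, :]) / length[:, None, None]


if __name__ == "__main__":
    # The second frame is the first one scaled by 2, as if the signer stepped closer.
    poses = np.array([[[0, 0], [2, 0], [1, 1]],
                      [[0, 0], [4, 0], [2, 2]]])
    out = normalize_scale(poses, joint_a=0, joint_b=1)
    print(np.allclose(out[0], out[1]))
```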
Function not called

https://github.com/narVidhai/SLR/blob/2f26455c7cb530265618949203859b953224d0aa/scripts/mediapipe_extract.py#L129

Is this function not called anywhere?
question

opened by grohith327 1
Support for GCN + BERT model

Add the model proposed in

https://openaccess.thecvf.com/content/WACV2021W/HBU/papers/Tunga_Pose-Based_Sign_Language_Recognition_Using_GCN_and_BERT_WACVW_2021_paper.pdf
enhancement

opened by Prem-kumar27 0
Sinusoidal Train/Val Accuracy

I'm noticing that the transformer and the SL-GCN architectures, while learning on WLASL2000, have an accuracy curve that resembles a sine curve with period of about 20 epochs and amplitude of about 5-10%. I am using the example config provided in the repo, and verified that the batches are being shuffled. I have also played around with logging on_step=True in case this is an artifact of torch.nn.log, but that didn't help either. Any ideas why this is happening?

opened by leekezar 1
Lower accuracy when inferring a single video

Hello,

When I supply the inference model with multiple videos, it predicts all of them correctly. But if I supply only one video, the prediction is wrong. What could cause this? Can anyone please explain?

Thank you!

opened by burakkaraceylan 1
Using `pose-format` for consistent `.pose` files

It seems that you are using pkl and h5 for pose data, and that you have a custom MediaPipe Holistic script.

Personally, I believe it would be more shareable, and faster, to use a binary format like https://github.com/AmitMY/pose-format. Every pose file also declares its content, so you can transfer them between projects, or convert them to different formats with relative ease.

Besides the fact that it has a holistic loading script and multiple formats of OpenPose, it is a binary format which is faster to load, allows loading to numpy, torch and tensorflow, and can perform several operations on poses.

It also allows the visualization of pose files, separately or on top of videos, and while admittedly this repository is not perfect, in my opinion it is better than having json or pkl files.

opened by AmitMY 9
Consistent Dataset Handling

Very nice repo and documentation!

I think this repository can benefit from using https://github.com/sign-language-processing/datasets as data loaders.

It is fast, consistent across datasets, and allows loading videos / poses from multiple datasets. If a dataset you are using is not there, you can ask for it or add it yourself; it is a breeze.

The repo supports many datasets, multiple pose estimation formats, binary pose files, fps and resolution manipulations, and dataset disk mapping.

Finally, this would make this repo less complex. This repo does pre-training and fine-tuning, the other repo does datasets, and they could be used together.

Please consider :)

opened by AmitMY 5
Resume training, but load only parameters

Not the entire state stored by Lightning.

Use an option called pretrained to achieve it, like this: https://github.com/AI4Bharat/OpenHands/blob/26c17ed0fca2ac786950d1f4edfa5a88419d06e6/examples/configs/include/decoupled_gcn.yaml#L1
important feature

opened by GokulNC 1
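
A Lightning checkpoint is a dictionary that stores the weights under `state_dict`, next to optimizer and trainer state. Loading only the parameters therefore amounts to picking out that one entry, as the `ckpt["state_dict"]` line in the inference traceback earlier on this page does. The sketch below works on plain dictionaries. The helper name and the optional prefix handling are assumptions, and it does not show how the `pretrained` option is wired into the config:

```python
def extract_parameters(checkpoint, prefix=""):
    """Return only the model weights from a Lightning-style checkpoint dict.

    If prefix is given, keep only keys that start with it and strip it,
    which helps when loading a sub-module (e.g. just the encoder).
    """
    state_dict = checkpoint["state_dict"]
    if not prefix:
        return dict(state_dict)
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    # Toy checkpoint; real values would be tensors.
    ckpt = {
        "state_dict": {"model.encoder.w": 1, "model.decoder.w": 2},
        "optimizer_states": [],
        "epoch": 5,
    }
    print(extract_parameters(ckpt, prefix="model.encoder."))
```

With PyTorch, the result could then be passed to `model.load_state_dict(..., strict=False)` after reading the file with `torch.load(path, map_location="cpu")`, which leaves the optimizer state and epoch counters untouched.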

Releases(checkpoints_v1)

checkpoints_v1(Oct 9, 2021)

Creating a release to upload all checkpoints
Source code(tar.gz)
Source code(zip)
autsl_bert.zip(116.59 MB)
autsl_lstm.zip(17.67 MB)
autsl_metadata.zip(151.99 KB)
autsl_slgcn.zip(38.71 MB)
autsl_stgcn.zip(32.99 MB)
csl_bert.zip(45.74 MB)
csl_lstm.zip(18.59 MB)
csl_metadata.zip(2.98 KB)
csl_slgcn.zip(38.59 MB)
csl_stgcn.zip(31.51 MB)
devisign_bert.zip(47.69 MB)
devisign_dpc.zip(38.81 MB)
devisign_lstm.zip(22.64 MB)
devisign_metadata.zip(788.62 KB)
devisign_slgcn.zip(42.90 MB)
devisign_stgcn.zip(35.62 MB)
gsl_bert.zip(45.56 MB)
gsl_lstm.zip(17.93 MB)
gsl_metadata.zip(666.01 KB)
gsl_slgcn.zip(38.56 MB)
gsl_stgcn.zip(30.98 MB)
include_bert.zip(40.00 MB)
include_dpc.zip(33.98 MB)
include_lstm.zip(17.90 MB)
include_metadata.zip(45.93 KB)
include_slgcn.zip(38.72 MB)
include_stgcn.zip(30.79 MB)
lsa64_bert.zip(45.27 MB)
lsa64_dpc.zip(36.78 MB)
lsa64_lstm.zip(17.28 MB)
lsa64_metadata.zip(798 bytes)
lsa64_slgcn.zip(38.18 MB)
lsa64_stgcn.zip(30.24 MB)
raw_dpc.zip(34.47 MB)
sign_detection_lstm.pt(93.33 KB)
wlasl_bert.zip(22.03 MB)
wlasl_dpc.zip(38.79 MB)
wlasl_lstm.zip(13.99 MB)
wlasl_metadata.zip(1.65 MB)
wlasl_slgcn.zip(43.19 MB)
wlasl_stgcn.zip(37.86 MB)

Owner

AI4Bhārat

Building open-source AI solutions for India!

GitHub Repository https://openhands.readthedocs.io
