Gesture-Detection-and-Depth-Estimation

This is my graduation project.

(1) In this project, I use the YOLOv3 object detection model to detect gestures in RGB images. I trained the model on a self-made gesture dataset to obtain a deep-learning-based gesture detection model. Testing on the test dataset showed that the model meets the requirements of real-time gesture detection while maintaining high accuracy.
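One way to check the real-time claim is to time the detector over a batch of frames. The snippet below is only a minimal sketch: `measure_fps` and the `detect_fn` argument are hypothetical names and are not part of this repository. Any callable that takes one frame can be passed in.

```python
import time


def measure_fps(detect_fn, frames):
    """Return the average number of frames processed per second.

    detect_fn is a placeholder for the detector (hypothetical, any callable
    that accepts a single frame); frames is a non-empty sequence of inputs.
    """
    start = time.perf_counter()
    for frame in frames:
        detect_fn(frame)
    elapsed = time.perf_counter() - start
    if elapsed == 0:
        # Guard against a timer that did not advance.
        return float("inf")
    return len(frames) / elapsed


if __name__ == "__main__":
    # Stand-in detector that does nothing, used only to show the call.
    dummy_frames = list(range(100))
    print(f"{measure_fps(lambda frame: None, dummy_frames):.1f} FPS")
```

The threshold that counts as real time depends on the camera frame rate, so it is left to the reader.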

(2) Next, I used monocular depth estimation based on deep learning to estimate the depth of a gesture from a single RGB image. I tried two methods: the FastDepth algorithm and an improved detection model based on YOLOv3. FastDepth was trained and tested on a self-made gesture-depth dataset. For the second method, I added a depth vector to the output dimensions of the YOLOv3 model and modified the loss function so that the model also estimates target depth, then trained and tested the modified model on the same gesture-depth dataset. The experimental results show that both methods can estimate the depth of a gesture in an RGB image to a certain extent.
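The YOLOv3 modification described above can be illustrated roughly as follows. This is a sketch under assumptions, not the code used in this project: it assumes one scalar depth value per prediction, a mean squared error depth term, and a weighting factor `depth_weight`. All function names here are hypothetical; see the final paper for the actual loss.

```python
def prediction_length(num_classes, with_depth=True):
    """Length of one prediction vector.

    Standard YOLOv3 uses 4 box values + 1 objectness + num_classes scores.
    Assumption: the depth extension appends a single extra value.
    """
    return 5 + num_classes + (1 if with_depth else 0)


def depth_loss(pred_depths, target_depths, object_mask):
    """Mean squared depth error over predictions that contain an object.

    Assumption: MSE is used here for illustration only.
    """
    errors = [
        (p - t) ** 2
        for p, t, has_object in zip(pred_depths, target_depths, object_mask)
        if has_object
    ]
    return sum(errors) / len(errors) if errors else 0.0


def total_loss(detection_loss, pred_depths, target_depths, object_mask,
               depth_weight=1.0):
    """Detection loss plus a weighted depth term (depth_weight is assumed)."""
    return detection_loss + depth_weight * depth_loss(
        pred_depths, target_depths, object_mask)
```

Masking the depth term to predictions that contain an object keeps background cells, which have no meaningful depth label, from influencing training.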

Gesture detection:

Depth data:

Estimate target depth:

(3) I also developed a simple PyOpenGL program that uses gesture information to draw simple shapes in three-dimensional space.
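As a rough illustration of how gesture information could drive the drawing, the sketch below maps a detected box center and its estimated depth to a 3D point and builds the vertices and edges of a cube around it. The function names and the coordinate mapping are assumptions made for this example, not the program's actual code, and the OpenGL calls themselves are left out so the snippet runs without a window.

```python
from itertools import product


def gesture_to_point(cx, cy, depth, image_width, image_height):
    """Map a box center in pixels plus an estimated depth to a 3D point.

    Assumption: x and y are scaled to [-1, 1] with y pointing up, and the
    depth is placed along the negative z axis (away from the viewer).
    """
    x = 2.0 * cx / image_width - 1.0
    y = 1.0 - 2.0 * cy / image_height
    return (x, y, -depth)


def cube_vertices(center, side):
    """Return the 8 corners of an axis-aligned cube."""
    half = side / 2.0
    cx, cy, cz = center
    return [
        (cx + dx * half, cy + dy * half, cz + dz * half)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


def cube_edges():
    """Return the 12 edges as pairs of vertex indices.

    With the vertex order above, each index bit encodes one axis, so two
    corners share an edge exactly when their indices differ in one bit.
    """
    return [
        (a, b)
        for a in range(8)
        for b in range(a + 1, 8)
        if bin(a ^ b).count("1") == 1
    ]
```

In a PyOpenGL program, each edge could then be drawn by passing its two endpoints to `glVertex3f` between `glBegin(GL_LINES)` and `glEnd()`.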

Try to draw a cube:

For more information, you can check my final paper.

The YOLOv3 model is based on coldlarry's model: https://github.com/coldlarry/YOLOv3-complete-pruning

Owner

ChaosAT
