GBIM(Gesture-Based Interaction map)

Overview

GBIM

Python 3.6 PaddleX License

手势交互地图 GBIM(Gesture-Based Interaction map),基于视觉深度神经网络的交互地图,通过电脑摄像头观察使用者的手势变化,进而控制地图进行简单的交互。网络使用PaddleX提供的轻量级模型PPYOLO Tiny以及MobileNet V3 small,使得整个模型大小约10MB左右,即使在CPU下也能快速定位和识别手势。

手势

手势 交互 手势 交互 手势 交互
向上滑动 向左滑动 地图放大
手势 交互 手势 交互 手势 交互
向下滑动 向右滑动 地图缩小

进度安排

基础

  • 确认用于交互的手势。
  • 使用det_acq.py采集一些电脑摄像头拍摄的人手姿势数据。
  • 数据标注,训练手的目标检测模型
  • 捕获目标手,使用clas_acq.py获取手部图像进行标注,并用于训练手势分类模型。
  • 交互手势的检测与识别组合验证。
  • 打开百度地图网页版,进行模拟按键交互。
  • 组合功能,验证基本功能。

进阶

  • 将图像分类改为序列图像分类,提高手势识别的流畅度和准确度。
  • 重新采集和标注数据,调参训练模型。
  • 搭建可用于参数调节的地图。
  • 界面整合,整理及美化。

数据集 & 模型

手势检测

  • 数据集使用来自联想小新笔记本摄像头采集的数据,使用labelImg标注为VOC格式,共1011张。该数据集场景、环境和人物单一,仅作为测试使用,不提供数据集下载。数据组织参考PaddelX下的PascalVOC数据组织方式。
  • 模型使用超轻量级PPYOLO Tiny,模型大小小于4MB,随便训练了100轮后保留best_model作为测试模型,由于数据集和未调参训练的原因,当前默认识别效果较差

手势分类

  • 数据集使用来自联想小新笔记本摄像头采集的数据,通过手势检测模型提出出手图像,人工分为7类,分别为6种交互手势以及“其他”,共1102张。该数据集数量较少,手型及手势单一,仅作为测试使用,不提供数据集下载。数据组织形式如下:
dataset
	├-- Images
	|     ├-- up
	┆     ┆    └-- xxx.jpg
	|     └-- other
	┆          └-- xxx.jpg
	├-- labels.txt
	├-- train_list.txt
	└-- val_list.txt
  • 模型使用超轻量级MobileNet V3 small,模型大小小于7MB,由于数据量很小,随便训练了20轮后保留best_model作为测试模型,当前识别分类效果较差

模型文件上传使用LFS,下拉时注意需要安装LFS,参考LFS文档。后续将重新采集和标注更加多样的大量数据集,并采用更好的调参方法获得更加准确的识别模型

演示

手势识别

地图交互

*未显示Capture界面

使用

  1. 克隆当前项目到本地,按照requirements.txt安装所依赖的包opencv、paddlex以及pynput。PaddleX对应请安装最新版的PaddlePaddle,由于模型轻量,CPU版本足矣,参考下面代码,细节参考官方网站
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
  1. 进入demo.py,将浏览器路径修改为自己使用的浏览器路径:
web_path = '"D:/Twinkstar/Twinkstar Browser/twinkstar.exe"'  # 自己的浏览器路径
  1. 运行demo.py启动程序:
cd GBIM
python demo.py

常见问题及解决

  1. Q: 拉项目时卡住不动

    A:首先确认按照文档安装LFS。如果已经安装那极大可能是网络问题,可以等待一段时间,或先跳过LFS文件,再单独拉取,参考下面git代码:

    // 开启跳过无法clone的LFS文件
    git lfs install --skip-smudge 
    // clone当前项目
    git clone "current project" 
    // 进入当前项目,单独拉取LFS文件
    cd "current project" 
    git lfs pull 
    // 恢复LFS设置
    git lfs install --force
  2. Q:按q或者手势交互无效

    A:请注意当前鼠标点击的焦点,焦点在Capture,则接受q退出;焦点在浏览器,则交互结果将驱动浏览器中的地图进行变换。

  3. Q:安装PaddleX时报错,关于MV C++

    A:若在Windows下安装coco tool时报错,则可能缺少Microsoft Visual C++,可在微软官方下载网页进行下载安装后重启,即可解决。

  4. Q:运行未报错,但没有保存数据到本地

    A:请检查路径是否有中文,cv2.imwrite保存图像时不能有中文路径。

参考

  1. 玩腻了小游戏?Paddle手势识别玩转游戏玩出新花样!
  2. https://github.com/PaddlePaddle/PaddleX

交流与反馈

Email:[email protected]

RL Algorithms with examples in Python / Pytorch / Unity ML agents

Reinforcement Learning Project This project was created to make it easier to get started with Reinforcement Learning. It now contains: An implementati

Rogier Wachters 3 Aug 19, 2022
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis [arxiv|pdf|v

Yinan He 78 Dec 22, 2022
A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

Semantic Image Synthesis via Adversarial Learning This is a PyTorch implementation of the paper Semantic Image Synthesis via Adversarial Learning. Req

Seonghyeon Nam 146 Nov 25, 2022
PyTorch implementation of the WarpedGANSpace: Finding non-linear RBF paths in GAN latent space (ICCV 2021)

Authors official PyTorch implementation of the "WarpedGANSpace: Finding non-linear RBF paths in GAN latent space" [ICCV 2021].

Christos Tzelepis 100 Dec 06, 2022
Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021

LPTN Paper | Supplementary Material | Poster High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network Ji

372 Dec 26, 2022
ICCV2021 Papers with Code

ICCV2021 Papers with Code

Amusi 1.4k Jan 02, 2023
Repository for MDPGT

MD-PGT Repository for implementing and reproducing the results for the paper MDPGT: Momentum-based Decentralized Policy Gradient Tracking. Available E

Xian Yeow Lee 2 Dec 30, 2021
VIsually-Pivoted Audio and(N) Text

VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled

Yän.PnG 16 Nov 04, 2022
Python Fanduel API (2021) - Lineup Automation

Southpaw is a python package that provides access to the Fanduel API. Optimize your DFS experience by programmatically updating your lineups, analyzin

Brandin Canfield 13 Jan 04, 2023
Torchyolo - Yolov3 ve Yolov4 modellerin Pytorch uygulamasıdır

TORCHYOLO : Yolo Modellerin Pytorch Uygulaması Yapılacaklar: Yolov3 model.py ve

Kadir Nar 3 Aug 22, 2022
This repository lets you interact with Lean through a REPL.

lean-gym This repository lets you interact with Lean through a REPL. See Formal Mathematics Statement Curriculum Learning for a presentation of lean-g

OpenAI 87 Dec 28, 2022
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
pytorch implementation of Attention is all you need

A Pytorch Implementation of the Transformer: Attention Is All You Need Our implementation is largely based on Tensorflow implementation Requirements N

230 Dec 07, 2022
A collection of SOTA Image Classification Models in PyTorch

A collection of SOTA Image Classification Models in PyTorch

sithu3 85 Dec 30, 2022
Non-Imaging Transient Reconstruction And TEmporal Search (NITRATES)

Non-Imaging Transient Reconstruction And TEmporal Search (NITRATES) This repo contains the full NITRATES pipeline for maximum likelihood-driven discov

13 Nov 08, 2022
An All-MLP solution for Vision, from Google AI

MLP Mixer - Pytorch An All-MLP solution for Vision, from Google AI, in Pytorch. No convolutions nor attention needed! Yannic Kilcher video Install $ p

Phil Wang 784 Jan 06, 2023
《Truly shift-invariant convolutional neural networks》(2021)

Truly shift-invariant convolutional neural networks [Paper] Authors: Anadi Chaman and Ivan Dokmanić Convolutional neural networks were always assumed

Anadi Chaman 46 Dec 19, 2022
EM-POSE 3D Human Pose Estimation from Sparse Electromagnetic Trackers.

EM-POSE: 3D Human Pose Estimation from Sparse Electromagnetic Trackers This repository contains the code to our paper published at ICCV 2021. For ques

Facebook Research 62 Dec 14, 2022
NALSM: Neuron-Astrocyte Liquid State Machine

NALSM: Neuron-Astrocyte Liquid State Machine This package is a Tensorflow implementation of the Neuron-Astrocyte Liquid State Machine (NALSM) that int

Computational Brain Lab 4 Nov 28, 2022