Use Python, OpenCV, and MediaPipe to control a keyboard with facial gestures

Last update: Nov 09, 2022

Overview

CheekyKeys

A Face-Computer Interface

CheekyKeys lets you control your keyboard using your face.

View a fuller demo and more background on the project at https://youtu.be/rZ0DBi1avMM

CheekyKeys uses OpenCV and MediaPipe's Face Mesh to detect facial landmarks in real time from video input. It then compares relative distances between landmarks to recognize specific facial gestures and translates those gestures into keyboard commands.
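As a rough illustration of the "relative differences" idea, the sketch below divides the gap between the lips by the height of the face, so the value stays roughly stable as you move toward or away from the camera. It is not taken from the CheekyKeys source: the landmark indices (13, 14, 10, 152), the function names, and the 0.05 threshold are all assumptions, and `landmarks` is simply a mapping from landmark index to an (x, y) pair such as the normalized coordinates a Face Mesh style detector would provide.

```python
import math

# Assumed Face Mesh landmark indices (illustrative, not confirmed by the project):
UPPER_LIP, LOWER_LIP = 13, 14   # midpoints of the inner lips
FOREHEAD, CHIN = 10, 152        # top and bottom of the face outline


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def mouth_open_ratio(landmarks):
    """Lip gap divided by face height, so the value does not depend on face size in the frame."""
    face_height = distance(landmarks[FOREHEAD], landmarks[CHIN])
    if face_height == 0:
        return 0.0
    return distance(landmarks[UPPER_LIP], landmarks[LOWER_LIP]) / face_height


def is_mouth_open(landmarks, threshold=0.05):
    """Hypothetical threshold; it would need tuning per face."""
    return mouth_open_ratio(landmarks) > threshold
```

The same pattern (one distance normalized by another) could plausibly be reused for eyebrows, eyes, and pursed lips, each with its own threshold.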

This version 0.1 is hardcoded to my facial features, but the thresholds can easily be modified. It's also built for a Mac keyboard, but you can swap in, e.g., the Windows key for Command simply enough.
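One possible way to adapt the thresholds to a different face is to record a few samples of a measurement while relaxed and while making the gesture, then place the threshold halfway between the two averages. The sketch below assumes that approach; `calibrate_threshold`, `primary_modifier`, and the key names `"command"` and `"win"` are hypothetical and do not come from the project.

```python
from statistics import mean


def calibrate_threshold(relaxed_samples, gesture_samples):
    """Midpoint between the average relaxed value and the average gesture value."""
    return (mean(relaxed_samples) + mean(gesture_samples)) / 2


def primary_modifier(platform):
    """Pick the modifier key name for a platform string such as sys.platform.

    Assumption: "darwin" means macOS and gets Command; everything else gets the Windows key.
    """
    return "command" if platform == "darwin" else "win"
```

For example, passing mouth-open ratios measured with the mouth closed and with it open would give a per-user threshold to use in place of a hardcoded one.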

The primary input is to "type" letters, digits, and symbols via Morse code by opening and closing your mouth quickly for . and slightly longer for -. Rather than waiting a set time after every letter, you scrunch your mouth upward once to finish a letter, or twice to add a space (end a word). Three mouth scrunches type enter/return.
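The Morse input described above can be sketched as a small buffer: each mouth opening adds a dot or a dash depending on how long it lasted, and a scrunch flushes the buffer into a letter. This is only an illustration under stated assumptions: the class name, the 0.3 second dot/dash cutoff, and the choice to silently drop unrecognized codes are hypothetical, and only the letters a to z are included here.

```python
MORSE = {
    ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e", "..-.": "f",
    "--.": "g", "....": "h", "..": "i", ".---": "j", "-.-": "k", ".-..": "l",
    "--": "m", "-.": "n", "---": "o", ".--.": "p", "--.-": "q", ".-.": "r",
    "...": "s", "-": "t", "..-": "u", "...-": "v", ".--": "w", "-..-": "x",
    "-.--": "y", "--..": "z",
}


class MorseTyper:
    def __init__(self, dash_seconds=0.3):
        self.dash_seconds = dash_seconds  # assumed cutoff between dot and dash
        self.queue = ""                   # dots and dashes of the current letter
        self.output = []                  # characters "typed" so far

    def mouth_opened_for(self, seconds):
        """Record one mouth opening as a dot (short) or a dash (long)."""
        self.queue += "-" if seconds >= self.dash_seconds else "."

    def scrunch(self, count):
        """1 scrunch ends the letter, 2 also add a space, 3 also add a newline."""
        if self.queue:
            letter = MORSE.get(self.queue)
            if letter is not None:        # unknown codes are dropped in this sketch
                self.output.append(letter)
            self.queue = ""
        if count == 2:
            self.output.append(" ")
        elif count == 3:
            self.output.append("\n")

    def text(self):
        return "".join(self.output)
```

In a real loop, `mouth_opened_for` would be called with the measured time between the mouth opening and closing, and `text()` would be replaced by sending key presses.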

The cheatsheet includes the full alphabet as well as special characters and hotkeys.

Most of the rest of the keyboard and other helpful actions are included as modifier gestures, such as:

shift: close right eye
command: close left eye
arrow up/down: raise left/right eyebrow
arrow left/right: raise left/right eyebrow + duckface (pursed lips)
backspace: duckface + double blink
zoom in: eyes bulge
zoom out: eyes squint
repeat previous letter/command: double raise of both eyebrows
clear current Morse queue: wink right eye, then wink left eye
escape: wink left eye, then wink right eye
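Several of the gestures in the list above are ordered pairs (for example, right wink then left wink), so the order and timing of events matter. A minimal sketch of one way to detect such pairs follows; the gesture names, action names, and the one second window are hypothetical and are not taken from the project.

```python
class SequenceDetector:
    """Maps two gestures seen in order, within a time window, to an action name."""

    def __init__(self, sequences, window_seconds=1.0):
        self.sequences = sequences      # {(first_gesture, second_gesture): action}
        self.window = window_seconds    # assumed maximum gap between the two gestures
        self.last = None                # (gesture, timestamp) of the previous event

    def feed(self, gesture, timestamp):
        """Return an action name if this gesture completes a known pair, else None."""
        action = None
        if self.last is not None:
            previous, previous_time = self.last
            if timestamp - previous_time <= self.window:
                action = self.sequences.get((previous, gesture))
        # After a match, start fresh so one gesture cannot belong to two pairs.
        self.last = None if action else (gesture, timestamp)
        return action


SEQUENCES = {
    ("wink_right", "wink_left"): "clear_morse_queue",
    ("wink_left", "wink_right"): "escape",
    ("raise_both_eyebrows", "raise_both_eyebrows"): "repeat_previous",
}
```

Resetting the stored gesture after a match keeps a right-left-right run of winks from firing both "clear" and "escape" from the same middle wink.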
