Kalidokit is a blendshape and kinematics solver for Mediapipe/Tensorflow.js face, eyes, pose, and hand tracking models

Overview

KalidoKit - Face, Pose, and Hand Tracking Kinematics

Kalidokit Template

Kalidokit is a blendshape and kinematics solver for Mediapipe/Tensorflow.js face, eyes, pose, and hand tracking models, compatible with Facemesh, Blazepose, Handpose, and Holistic. It takes predicted 3D landmarks and calculates simple euler rotations and blendshape face values.

As the core to Vtuber web apps, Kalidoface and Kalidoface 3D, KalidoKit is designed specifically for rigging 3D VRM models and Live2D avatars!

Kalidokit Template

ko-fi

Install

Via NPM

npm install kalidokit
import * as Kalidokit from "kalidokit";

// or only import the class you need

import { Face, Pose, Hand } from "kalidokit";

Via CDN

">
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/kalidokit.umd.js"></script>

Methods

Kalidokit is composed of 3 classes for Face, Pose, and Hand calculations. They accept landmark outputs from models like Facemesh, Blazepose, Handpose, and Holistic.

// Accepts an array(468 or 478 with iris tracking) of vectors
Kalidokit.Face.solve(facelandmarkArray, {
    runtime: "tfjs", // `mediapipe` or `tfjs`
    video: HTMLVideoElement,
    imageSize: { height: 0, width: 0 },
    smoothBlink: false, // smooth left and right eye blink delays
    blinkSettings: [0.25, 0.75], // adjust upper and lower bound blink sensitivity
});

// Accepts arrays(33) of Pose keypoints and 3D Pose keypoints
Kalidokit.Pose.solve(poseWorld3DArray, poseLandmarkArray, {
    runtime: "tfjs", // `mediapipe` or `tfjs`
    video: HTMLVideoElement,
    imageSize: { height: 0, width: 0 },
    enableLegs: true,
});

// Accepts array(21) of hand landmark vectors; specify 'Right' or 'Left' side
Kalidokit.Hand.solve(handLandmarkArray, "Right");

// Using exported classes directly
Face.solve(facelandmarkArray);
Pose.solve(poseWorld3DArray, poseLandmarkArray);
Hand.solve(handLandmarkArray, "Right");

Additional Utils

// Stabilizes left/right blink delays + wink by providing blenshapes and head rotation
Kalidokit.Face.stabilizeBlink(
    { r: 0, l: 1 }, // left and right eye blendshape values
    headRotationY, // head rotation in radians
    {
        noWink = false, // disables winking
        maxRot = 0.5 // max head rotation in radians before interpolating obscured eyes
    });

// The internal vector math class
Kalidokit.Vector();

Remixable VRM Template with KalidoKit

Quick-start your Vtuber app with this simple remixable example on Glitch. Face, full-body, and hand tracking in under 350 lines of javascript. This demo uses Mediapipe Holistic for body tracking, Three.js + Three-VRM for rendering models, and KalidoKit for the kinematic calculations. This demo uses a minimal amount of easing to smooth animations, but feel free to make it your own!

Remix on Glitch

Basic Usage

Kalidokit Template

The implementation may vary depending on what pose and face detection model you choose to use, but the principle is still the same. This example uses Mediapipe Holistic which concisely combines them together.

{ await holistic.send({image: HTMLVideoElement}); }, width: 640, height: 480 }); camera.start(); ">
import * as Kalidokit from 'kalidokit'
import '@mediapipe/holistic/holistic';
import '@mediapipe/camera_utils/camera_utils';

let holistic = new Holistic({locateFile: (file) => {
    return `https://cdn.jsdelivr.net/npm/@mediapipe/[email protected]/${file}`;
}});

holistic.onResults(results=>{
    // do something with prediction results
    // landmark names may change depending on TFJS/Mediapipe model version
    let facelm = results.faceLandmarks;
    let poselm = results.poseLandmarks;
    let poselm3D = results.ea;
    let rightHandlm = results.rightHandLandmarks;
    let leftHandlm = results.leftHandLandmarks;

    let faceRig = Kalidokit.Face.solve(facelm,{runtime:'mediapipe',video:HTMLVideoElement})
    let poseRig = Kalidokit.Pose.solve(poselm3d,poselm,{runtime:'mediapipe',video:HTMLVideoElement})
    let rightHandRig = Kalidokit.Hand.solve(rightHandlm,"Right")
    let leftHandRig = Kalidokit.Hand.solve(leftHandlm,"Left")

    };
});

// use Mediapipe's webcam utils to send video to holistic every frame
const camera = new Camera(HTMLVideoElement, {
  onFrame: async () => {
    await holistic.send({image: HTMLVideoElement});
  },
  width: 640,
  height: 480
});
camera.start();

Slight differences with Mediapipe and Tensorflow.js

Due to slight differences in the results from Mediapipe and Tensorflow.js, it is recommended to specify which runtime version you are using as well as the video input/image size as a reference.

Kalidokit.Pose.solve(poselm3D,poselm,{
    runtime:'tfjs', // default is 'mediapipe'
    video: HTMLVideoElement,// specify an html video or manually set image size
    imageSize:{
        width: 640,
        height: 480,
    };
})

Kalidokit.Face.solve(facelm,{
    runtime:'mediapipe', // default is 'tfjs'
    video: HTMLVideoElement,// specify an html video or manually set image size
    imageSize:{
        width: 640,
        height: 480,
    };
})

Outputs

Below are the expected results from KalidoKit solvers.

// Kalidokit.Face.solve()
// Head rotations in radians
// Degrees and normalized rotations also available
{
    eye: {l: 1,r: 1},
    mouth: {
        x: 0,
        y: 0,
        shape: {A:0, E:0, I:0, O:0, U:0}
    },
    head: {
        x: 0,
        y: 0,
        z: 0,
        width: 0.3,
        height: 0.6,
        position: {x: 0.5, y: 0.5, z: 0}
    },
    brow: 0,
    pupil: {x: 0, y: 0}
}
// Kalidokit.Pose.solve()
// Joint rotations in radians, leg calculators are a WIP
{
    RightUpperArm: {x: 0, y: 0, z: -1.25},
    LeftUpperArm: {x: 0, y: 0, z: 1.25},
    RightLowerArm: {x: 0, y: 0, z: 0},
    LeftLowerArm: {x: 0, y: 0, z: 0},
    LeftUpperLeg: {x: 0, y: 0, z: 0},
    RightUpperLeg: {x: 0, y: 0, z: 0},
    RightLowerLeg: {x: 0, y: 0, z: 0},
    LeftLowerLeg: {x: 0, y: 0, z: 0},
    LeftHand: {x: 0, y: 0, z: 0},
    RightHand: {x: 0, y: 0, z: 0},
    Spine: {x: 0, y: 0, z: 0},
    Hips: {
        worldPosition: {x: 0, y: 0, z: 0},
        position: {x: 0, y: 0, z: 0},
        rotation: {x: 0, y: 0, z: 0},
    }
}
// Kalidokit.Hand.solve()
// Joint rotations in radians
// only wrist and thumb have 3 degrees of freedom
// all other finger joints move in the Z axis only
{
    RightWrist: {x: -0.13, y: -0.07, z: -1.04},
    RightRingProximal: {x: 0, y: 0, z: -0.13},
    RightRingIntermediate: {x: 0, y: 0, z: -0.4},
    RightRingDistal: {x: 0, y: 0, z: -0.04},
    RightIndexProximal: {x: 0, y: 0, z: -0.24},
    RightIndexIntermediate: {x: 0, y: 0, z: -0.25},
    RightIndexDistal: {x: 0, y: 0, z: -0.06},
    RightMiddleProximal: {x: 0, y: 0, z: -0.09},
    RightMiddleIntermediate: {x: 0, y: 0, z: -0.44},
    RightMiddleDistal: {x: 0, y: 0, z: -0.06},
    RightThumbProximal: {x: -0.23, y: -0.33, z: -0.12},
    RightThumbIntermediate: {x: -0.2, y: -0.19, z: -0.01},
    RightThumbDistal: {x: -0.2, y: 0.002, z: 0.15},
    RightLittleProximal: {x: 0, y: 0, z: -0.09},
    RightLittleIntermediate: {x: 0, y: 0, z: -0.22},
    RightLittleDistal: {x: 0, y: 0, z: -0.1}
}

Community Showcase

If you'd like to share a creative use of KalidoKit, we would love to hear about it! Feel free to also use our Twitter hashtag, #kalidokit.

Kalidoface virtual webcam Kalidoface Pose Demo

Open to Contributions

The current library is a work in progress and contributions to improve it are very welcome. Our goal is to make character face and pose animation even more accessible to creatives regardless of skill level!

Owner
Rich
Making Vtuber apps with Mediapipe and Tensorflow.js
Rich
Codes for Causal Semantic Generative model (CSG), the model proposed in "Learning Causal Semantic Representation for Out-of-Distribution Prediction" (NeurIPS-21)

Learning Causal Semantic Representation for Out-of-Distribution Prediction This repository is the official implementation of "Learning Causal Semantic

Chang Liu 54 Dec 01, 2022
This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEECH" submitted to ICASSP 2022

CPC_DeepCluster This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEEC

LEAP Lab 2 Sep 15, 2022
The project page of paper: Architecture disentanglement for deep neural networks [ICCV 2021, oral]

This is the project page for the paper: Architecture Disentanglement for Deep Neural Networks, Jie Hu, Liujuan Cao, Tong Tong, Ye Qixiang, ShengChuan

Jie Hu 15 Aug 30, 2022
This code is for eCaReNet: explainable Cancer Relapse Prediction Network.

eCaReNet This code is for eCaReNet: explainable Cancer Relapse Prediction Network. (Towards Explainable End-to-End Prostate Cancer Relapse Prediction

Institute of Medical Systems Biology 2 Jul 28, 2022
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
Open source person re-identification library in python

Open-ReID Open-ReID is a lightweight library of person re-identification for research purpose. It aims to provide a uniform interface for different da

Tong Xiao 1.3k Jan 01, 2023
A pre-trained language model for social media text in Spanish

RoBERTuito A pre-trained language model for social media text in Spanish READ THE FULL PAPER Github Repository RoBERTuito is a pre-trained language mo

25 Dec 29, 2022
Simply enable or disable your Nvidia dGPU

EnvyControl (WIP) Simply enable or disable your Nvidia dGPU Usage First clone this repo and install envycontrol with sudo pip install . CLI Turn off y

Victor Bayas 292 Jan 03, 2023
Deep Structured Instance Graph for Distilling Object Detectors (ICCV 2021)

DSIG Deep Structured Instance Graph for Distilling Object Detectors Authors: Yixin Chen, Pengguang Chen, Shu Liu, Liwei Wang, Jiaya Jia. [pdf] [slide]

DV Lab 31 Nov 17, 2022
EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network

EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network This repo contains the official Pytorch implementaion code and conf

Hu Zhang 175 Jan 07, 2023
This is the official implementation of TrivialAugment and a mini-library for the application of multiple image augmentation strategies including RandAugment and TrivialAugment.

Trivial Augment This is the official implementation of TrivialAugment (https://arxiv.org/abs/2103.10158), as was used for the paper. TrivialAugment is

AutoML-Freiburg-Hannover 94 Dec 30, 2022
This is the official implementation for the paper "Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization" in NeurIPS 2021.

MPMAB_BEACON This is code used for the paper "Decentralized Multi-player Multi-armed Bandits: Beyond Linear Reward Functions", Neurips 2021. Requireme

Cong Shen Research Group 0 Oct 26, 2021
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 03, 2023
Open standard for machine learning interoperability

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides

Open Neural Network Exchange 13.9k Dec 30, 2022
HyperaPy: An automatic hyperparameter optimization framework ⚡🚀

hyperpy HyperPy: An automatic hyperparameter optimization framework Description HyperPy: Library for automatic hyperparameter optimization. Build on t

Sergio Mora 7 Sep 06, 2022
General Multi-label Image Classification with Transformers

General Multi-label Image Classification with Transformers Jack Lanchantin, Tianlu Wang, Vicente Ordóñez Román, Yanjun Qi Conference on Computer Visio

QData 154 Dec 21, 2022
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"

FLASH - Pytorch Implementation of the Transformer variant proposed in the paper Transformer Quality in Linear Time Install $ pip install FLASH-pytorch

Phil Wang 209 Dec 28, 2022
Prototype for Baby Action Detection and Classification

Baby Action Detection Table of Contents About Install Run Predictions Demo About An attempt to harness the power of Deep Learning to come up with a so

Shreyas K 30 Dec 16, 2022
Face Detection & Age Gender & Expression & Recognition

Face Detection & Age Gender & Expression & Recognition

Sajjad Ayobi 188 Dec 28, 2022
PyTorch implementation of Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network

hierarchical-multi-label-text-classification-pytorch Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach This

Mingu Kang 17 Dec 13, 2022