CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

Overview

  To facilitate research on multimodal sensor fusion for human action recognition, this paper provides a multimodal human action dataset collected with a Kinect depth camera and multiple wearable sensors, called the Changzhou University multimodal human action dataset (CZU-MHAD). Our dataset uses a larger number of wearable sensors, with the aim of capturing the positions of human skeleton joints together with the 3-axis acceleration and 3-axis angular velocity of the corresponding joints. It provides time-synchronized depth video, skeleton joint positions, 3-axis acceleration and 3-axis angular velocity data, which together describe a complete human action.

1. Sensors

  CZU-MHAD uses 1 Microsoft Kinect V2 and 10 MPU9250 wearable sensors. Both kinds of sensor are widely used and are characterized by low power consumption, low cost and simple operation. In addition, processing the data collected by the two kinds of sensor in real time does not require much computing power.

1.1 Kinect v2

  The picture above shows the Microsoft Kinect V2, which can collect both color and depth images at a sampling frequency of 30 frames per second. The Kinect SDK is a software package provided by Microsoft that can be used to track 25 skeleton joint points and their 3D spatial positions. You can download the Kinect SDK from https://www.microsoft.com/en-us/download/details.aspx?id=44561.

  The above image shows 25 skeleton joint points of the human body that Kinect V2 can track.
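
  For reference, the 25 joints correspond to the JointType enumeration in the Kinect for Windows SDK 2.0. The sketch below lists those names by index. Whether the skeleton files in this dataset store joints in the same order is an assumption and should be checked against the sample code; `KINECT_V2_JOINTS` and `joint_index` are illustrative names.

```python
# Joint names in the order of the Kinect for Windows SDK 2.0 JointType enum.
# Assumption: the dataset's skeleton files follow this order; verify before use.
KINECT_V2_JOINTS = [
    "SpineBase", "SpineMid", "Neck", "Head",
    "ShoulderLeft", "ElbowLeft", "WristLeft", "HandLeft",
    "ShoulderRight", "ElbowRight", "WristRight", "HandRight",
    "HipLeft", "KneeLeft", "AnkleLeft", "FootLeft",
    "HipRight", "KneeRight", "AnkleRight", "FootRight",
    "SpineShoulder", "HandTipLeft", "ThumbLeft", "HandTipRight", "ThumbRight",
]


def joint_index(name):
    """Return the zero-based index of a joint name in the list above."""
    return KINECT_V2_JOINTS.index(name)
```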

1.2 MPU9250

  The MPU9250 can capture 3-axis acceleration, 3-axis angular velocity and 3-axis magnetic intensity.

  • The measurement range of MPU9250:
    • the measurement range of accelerometer is ±16g;
    • the measurement range of angular velocity of the gyroscope is ±2000 degrees/second.
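
  As a rough illustration of what these ranges mean numerically: the MPU9250 reports each axis as a signed 16-bit integer, so the full-scale range maps onto 32768 counts. The sketch below converts raw counts to physical units under that assumption. This README does not say whether the released files hold raw counts or already-converted values, so the helper names and their applicability to the dataset are hypothetical.

```python
ACCEL_RANGE_G = 16.0       # accelerometer full scale, in g (from the list above)
GYRO_RANGE_DPS = 2000.0    # gyroscope full scale, in degrees/second
INT16_FULL_SCALE = 32768.0  # magnitude of the most negative signed 16-bit value


def raw_to_g(raw):
    """Convert a raw accelerometer count to g (assumes the +/-16 g setting)."""
    return raw * ACCEL_RANGE_G / INT16_FULL_SCALE


def raw_to_dps(raw):
    """Convert a raw gyroscope count to degrees/second (assumes +/-2000 dps)."""
    return raw * GYRO_RANGE_DPS / INT16_FULL_SCALE
```

  Under these settings one g corresponds to 2048 counts, so `raw_to_g(2048)` evaluates to 1.0.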

  CZU-MHAD uses a Raspberry Pi to communicate with the MPU9250 through the inter-integrated circuit (I2C, also written IIC) bus interface. The Raspberry Pi reads the MPU9250 sensor data, saves it and uploads it to the server. The connection between the Raspberry Pi and the MPU9250 is shown in the picture.
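
  The authors' acquisition code is not shown here. As a minimal sketch of the decoding step only, assume a 14-byte burst read starting at register 0x3B, which per the MPU-9250 register map holds accelerometer, temperature and gyroscope values in that order, each as a big-endian signed 16-bit integer. The raw bytes could then be unpacked as follows. `decode_burst` is an illustrative name, and the I2C read itself is omitted because it depends on the hardware and the bus library.

```python
import struct


def decode_burst(data):
    """Split a 14-byte MPU9250 burst read into raw accel and gyro counts.

    Assumed layout: accel X/Y/Z, temperature, gyro X/Y/Z, each big-endian int16.
    """
    if len(data) != 14:
        raise ValueError("expected 14 bytes, got %d" % len(data))
    ax, ay, az, _temp, gx, gy, gz = struct.unpack(">7h", bytes(data))
    return (ax, ay, az), (gx, gy, gz)
```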

  You can visit https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up to learn more about the Raspberry Pi.

2. Data Acquisition System Architecture

  This section introduces the data acquisition system of the CZU-MHAD dataset. CZU-MHAD uses the Kinect V2 sensor to collect depth images and joint position data, and uses MPU9250 sensors to collect 3-axis acceleration and 3-axis angular velocity data. To collect acceleration and angular velocity data for the whole body, we built a motion data acquisition system that includes 10 MPU9250 sensors. The architecture of the sampling system is shown in the following picture.

  The MPU9250 sensors are controlled by the Raspberry Pi and the Kinect V2 is controlled by a notebook computer. Time synchronization with an NTP server is carried out every time data is collected. After considering the sampling schemes of MHAD and UTD-MHAD, we chose the positions of the wearable sensors shown in the following picture.

  The points marked in red in the figure are the positions of inertial sensors, the left in the figure is the left side of the human body, and the right in the figure is the right side of the human body.
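
  Because both devices are synchronized to the same NTP server, samples from the two streams can in principle be paired by timestamp. The sketch below shows one simple way to do this, matching each Kinect frame to the nearest inertial sample. It assumes each stream comes with a sorted list of timestamps in seconds, which is an assumption about the data layout rather than something this README specifies; `nearest_indices` is a hypothetical helper.

```python
from bisect import bisect_left


def nearest_indices(frame_times, imu_times):
    """For each frame timestamp, return the index of the closest IMU timestamp.

    imu_times must be sorted in ascending order and non-empty.
    """
    result = []
    for t in frame_times:
        i = bisect_left(imu_times, t)
        if i == 0:
            result.append(0)
        elif i == len(imu_times):
            result.append(len(imu_times) - 1)
        else:
            # Pick whichever neighbour is closer; ties go to the earlier sample.
            before, after = imu_times[i - 1], imu_times[i]
            result.append(i - 1 if t - before <= after - t else i)
    return result
```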

3. Information about the CZU-MHAD dataset

  The CZU-MHAD dataset contains 22 actions performed by 5 subjects (all male). Each subject repeated each action more than 8 times, so the dataset contains more than 880 samples in total. The 22 actions are listed in the tables below. CZU-MHAD includes common gestures (such as Draw fork, Draw circle), daily activities (such as Sur Place, Clap, Bend down), and training actions (such as Left body turning movement, Left lateral movement).

Action names in English:

ID | Action name | ID | Action name | ID | Action name | ID | Action name
1 | Right high wave | 7 | Draw fork with right hand | 13 | Right foot kick side | 19 | Left body turning movement
2 | Left high wave | 8 | Draw fork with left hand | 14 | Left foot kick side | 20 | Right body turning movement
3 | Right horizontal wave | 9 | Draw circle with right hand | 15 | Clap | 21 | Left lateral movement
4 | Left horizontal wave | 10 | Draw circle with left hand | 16 | Bend down | 22 | Right lateral movement
5 | Hammer with right hand | 11 | Right foot kick forward | 17 | Wave up and down | |
6 | Grasp with right hand | 12 | Left foot kick forward | 18 | Sur Place | |

Action names in Chinese:

ID | Action name | ID | Action name | ID | Action name | ID | Action name
1 | 右高挥手 | 7 | 右手画× | 13 | 右脚侧踢 | 19 | 左体转
2 | 左高挥手 | 8 | 左手画× | 14 | 左脚侧踢 | 20 | 右体转
3 | 右水平挥手 | 9 | 右手画○ | 15 | 拍手 | 21 | 左体侧
4 | 左水平挥手 | 10 | 左手画○ | 16 | 弯腰 | 22 | 右体侧
5 | 锤(右手) | 11 | 右脚前踢 | 17 | 上下挥手 | |
6 | 抓(右手) | 12 | 左脚前踢 | 18 | 原地踏步 | |
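
  For convenience, the English action table above can be expressed as a lookup from action ID to name. The dictionary below simply transcribes that table; `ACTION_NAMES` and `min_sample_count` are illustrative names. With 22 actions, 5 subjects and at least 8 repetitions each, the lower bound on the sample count works out to 22 × 5 × 8 = 880.

```python
# Action IDs and English names, transcribed from the table above.
ACTION_NAMES = {
    1: "Right high wave",
    2: "Left high wave",
    3: "Right horizontal wave",
    4: "Left horizontal wave",
    5: "Hammer with right hand",
    6: "Grasp with right hand",
    7: "Draw fork with right hand",
    8: "Draw fork with left hand",
    9: "Draw circle with right hand",
    10: "Draw circle with left hand",
    11: "Right foot kick forward",
    12: "Left foot kick forward",
    13: "Right foot kick side",
    14: "Left foot kick side",
    15: "Clap",
    16: "Bend down",
    17: "Wave up and down",
    18: "Sur Place",
    19: "Left body turning movement",
    20: "Right body turning movement",
    21: "Left lateral movement",
    22: "Right lateral movement",
}


def min_sample_count(num_subjects=5, min_repetitions=8):
    """Lower bound on the number of samples: actions x subjects x repetitions."""
    return len(ACTION_NAMES) * num_subjects * min_repetitions
```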

4. How to download the dataset

   We offer one way to download our CZU-MHAD dataset:

  1. BaiduDisk (百度网盘)

    Link: https://pan.baidu.com/s/1SBy0D2f1ZoX_mDyd3YEp2Q
    Extraction code: qsq1

  In the CZU-MHAD, you will see three subfolders:

  • depth_mat

       The depth_mat folder contains the depth images captured by the Kinect V2. Each file in this folder represents one action sample and is named after the subject, the action category label and the repetition index. For example, in cyy_a1_t1.mat, cyy is the subject's name, a1 is the action label, and t1 indicates the subject's first repetition of this action. Our sample code shows how to read the data.

  • sensors_mat

       The sensors_mat folder contains the 3-axis acceleration and 3-axis angular velocity data captured by the MPU9250 sensors. Each file in this folder represents one action sample and is named after the subject, the action category label and the repetition index. For example, in cyy_a1_t1.mat, cyy is the subject's name, a1 is the action label, and t1 indicates the subject's first repetition of this action. Our sample code shows how to read the data.

  • skeleton_mat

       The skeleton_mat folder contains the skeleton joint position data captured by the Kinect V2. Each file in this folder represents one action sample and is named after the subject, the action category label and the repetition index. For example, in cyy_a1_t1.mat, cyy is the subject's name, a1 is the action label, and t1 indicates the subject's first repetition of this action. Our sample code shows how to read the data.
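
  The naming scheme used in all three folders can be parsed mechanically. The following is a minimal sketch, assuming every file follows the subject_aN_tM.mat pattern of the cyy_a1_t1.mat example; `parse_sample_name` and `SampleName` are hypothetical helpers, not part of the authors' sample code. The .mat files themselves could then be opened with a MATLAB-file reader such as `scipy.io.loadmat`, although the variable names stored inside them are not documented in this README.

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    subject: str  # e.g. "cyy"
    action: int   # action ID, 1..22
    trial: int    # repetition index, starting at 1


# Assumed pattern: <subject>_a<action>_t<trial>.mat
_PATTERN = re.compile(r"^(?P<subject>[^_]+)_a(?P<action>\d+)_t(?P<trial>\d+)\.mat$")


def parse_sample_name(filename):
    """Split a file name such as 'cyy_a1_t1.mat' into its three parts."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError("unexpected file name: %r" % filename)
    return SampleName(match["subject"], int(match["action"]), int(match["trial"]))
```

  The same file name identifies the same recording in each folder, so a parsed name can serve as a key for joining the depth, inertial and skeleton data of one sample.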

5. Sample codes

  1. BaiduDisk (百度网盘)

    Link: https://pan.baidu.com/s/1bWq7ypygjTffkor1GAExMQ

    Extraction code: limf

6. Citation

If you use our dataset, please cite the following paper:

  • Mo Yujian, Hou Zhenjie, Chang Xingzhi, Liang Jiuzhen, Chen Chen, Huan Juan. Structural feature representation and fusion of behavior recognition oriented human spatial cooperative motion[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, (12): 2495-2505.

7. Mailing List

  If you would like to receive news, updates, and announcements of future events about this dataset, please email me.

8. Thanks

  1. Cui Yaoyao (崔瑶瑶)
  2. Chao Xin (巢新)
  3. Qin Yinhua (秦银华)
  4. Zhang Yuheng (张宇恒)
  5. Mo Yujian (莫宇剑)
  #. Gao Liang (高亮)
  #. Shi Yuhang (石宇航)

  The subjects marked with '#' also participated in our data collection. However, because of an unstable power supply and abnormal heat dissipation in the Raspberry Pi, their recorded data is abnormal, so we do not provide it.
