A learning-based data collection tool for human segmentation

Last update: Jun 24, 2022

FullBodyFilter

Overview

Human segmentation is the machine learning task of identifying and extracting the human in a picture, and it is difficult to do well. It is most often done with a convolutional neural network. Training an accurate and robust model requires large amounts of data covering varied human poses, and collecting and labeling that training data by hand takes a lot of time and resources. This project explores an alternative: using automation to collect and label pre-existing data from internet videos.

The project focuses on the DTEN ME model, which is used for virtual backgrounds in Zoom meetings.

OpenPose is used to filter the video for suitable frames, in particular frames showing the full body of a single person. Mask R-CNN serves as the teacher model that generates training labels. To find the images on which the ME model performs poorly, the ME masks are compared against the Mask R-CNN masks. The result is a set of images and masks that can be used as training data.
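The filtering and comparison steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the keypoint array layout `(num_people, num_keypoints, 3)` with `(x, y, confidence)` per keypoint, the confidence cutoff, and the choice of intersection-over-union (IoU) with a fixed threshold as the comparison metric are all assumptions made for the example. The metric actually used is described in the report in doc.

```python
import numpy as np


def is_single_full_body(people, min_conf=0.3):
    """Return True if exactly one person is present and every keypoint is confident.

    `people` is assumed to have shape (num_people, num_keypoints, 3), where the
    last axis holds (x, y, confidence). Both the layout and `min_conf` are
    assumptions for this sketch.
    """
    people = np.asarray(people, dtype=float)
    if people.ndim != 3 or people.shape[0] != 1:
        return False  # no person, or more than one person, in the frame
    # Treat a low-confidence keypoint as a body part that is not visible.
    return bool(np.all(people[0, :, 2] >= min_conf))


def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks are empty, so they agree
    return float(np.logical_and(a, b).sum() / union)


def is_hard_example(teacher_mask, student_mask, iou_threshold=0.9):
    """Flag a frame when the student mask disagrees with the teacher mask.

    `teacher_mask` stands for the Mask R-CNN output and `student_mask` for the
    ME output; `iou_threshold` is a hypothetical cutoff.
    """
    return mask_iou(teacher_mask, student_mask) < iou_threshold


if __name__ == "__main__":
    teacher = np.zeros((4, 4), dtype=bool)
    teacher[1:3, 1:3] = True   # teacher marks a 2x2 region
    student = np.zeros((4, 4), dtype=bool)
    student[1:3, 1:2] = True   # student covers only half of it
    print(mask_iou(teacher, student))
    print(is_hard_example(teacher, student))
```

In this sketch, a frame that passes `is_single_full_body` and is flagged by `is_hard_example` would be saved together with the teacher mask as a new training pair.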

Overview of Program

A full report on the system design and implementation details can be found in doc.

Sample Results

Examples of saved training data. In each image, the bottom left shows the Mask R-CNN mask and the bottom right shows the ME mask.

Usage

This project relies on OpenPose and Mask R-CNN and all of their dependencies. Instructions for setting up each one are found in their respective directories.

Documentation on how to use the scripts is located in doc.

Owner

Robert Jiang
