A novel Engagement Detection with Multi-Task Training (ED-MTT) system

Last update: Nov 11, 2022

ED-MTT

ED-MTT is an Engagement Detection with Multi-Task Training system that jointly minimizes mean squared error (MSE) and triplet loss to estimate the engagement level of students in an e-learning environment. See the ED-MTT.ipynb Colab notebook for detailed explanations of data loading and code execution.

Introduction & Problem Definition

With the Covid-19 outbreak, online working and learning environments have become essential in our lives. As a result, automatic analysis of non-verbal communication has become crucial in online environments.

Engagement level is a type of social signal that can be predicted from facial expression and body pose. To this end, we propose an end-to-end deep learning-based system that detects the engagement level of the subject in an e-learning environment.

Engagement-level feedback is important because it:

Makes students aware of their performance in classes.
Helps instructors detect confusing or unclear parts of the teaching material.

Model Architecture

The proposed system first extracts features with OpenFace, then aggregates frames within a window to compute feature statistics as additional features. Finally, it uses a Bi-LSTM to generate vector embeddings from the input sequences. We introduce triplet loss as an auxiliary task and design the system as a multi-task training framework, taking inspiration from prior work in which self-supervised contrastive learning of multi-view facial expressions was introduced. To the best of our knowledge, this approach is novel in the engagement detection literature. The key novelty of this work is the multi-task training framework that uses triplet loss together with Mean Squared Error (MSE). The main contributions of this paper are as follows:

Multi-task training with triplet and MSE losses introduces additional regularization and reduces the over-fitting caused by the very small sample size.
Using triplet loss mitigates the label reliability problem since it measures relative similarity between samples.
A system with lightweight feature extraction is efficient and highly suitable for real-life applications.
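The two ideas described above, windowed feature statistics and a joint MSE plus triplet objective, can be illustrated with a small plain-Python sketch. This is not the repository's implementation: the function names, the choice of mean and standard deviation as the statistics, the non-overlapping windows, the Euclidean distance, and the `weight` and `margin` parameters are all assumptions made for illustration. In a PyTorch setup, the built-in `torch.nn.MSELoss` and `torch.nn.TripletMarginLoss` modules cover the same two loss terms.

```python
import math
from statistics import mean, pstdev


def window_statistics(frames, window):
    """Aggregate per-frame feature vectors into one vector per window.

    frames: list of equal-length feature lists, one per video frame.
    Returns [means..., stds...] for each non-overlapping window
    (assumed statistics; trailing frames that do not fill a window are dropped).
    """
    segments = []
    for start in range(0, len(frames) - window + 1, window):
        chunk = frames[start:start + window]
        columns = list(zip(*chunk))  # transpose: one tuple per feature
        segments.append([mean(c) for c in columns] + [pstdev(c) for c in columns])
    return segments


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on relative distances: the positive should be closer than the negative."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mse_loss(predictions, targets):
    """Mean squared error between predicted and labeled engagement levels."""
    return mean((p - t) ** 2 for p, t in zip(predictions, targets))


def multitask_loss(predictions, targets, triplets, weight=1.0, margin=1.0):
    """MSE on engagement scores plus a weighted mean triplet loss (assumed weighting)."""
    aux = mean(triplet_loss(a, p, n, margin) for a, p, n in triplets)
    return mse_loss(predictions, targets) + weight * aux


if __name__ == "__main__":
    # Hypothetical toy inputs, not taken from the dataset.
    frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 3.0], [6.0, 3.0]]
    print(window_statistics(frames, window=2))

    triplets = [([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])]
    print(multitask_loss([0.5, 1.0], [0.0, 1.0], triplets, weight=0.5))
```

Because the triplet term depends only on which sample is closer to the anchor, it does not rely on the exact label values, which is the property the second contribution refers to.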

Dataset

We evaluate the performance of ED-MTT on the publicly available "Engagement in The Wild" dataset, which comes with separate training and validation sets.

The dataset comprises 78 subjects (25 female and 53 male) aged 19 to 27. Each subject was recorded while watching an approximately 5-minute stimulus video of a Korean language lecture.

Results

We compare the performance of ED-MTT with 9 state-of-the-art works, which are reviewed in the paper. Our results show that ED-MTT outperforms these methods with at least a 5.74% improvement in MSE.
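For readers who want to check a percentage like this against their own runs, a minimal sketch follows. It assumes the improvement is measured as a relative reduction in MSE with respect to a baseline, which the text does not state explicitly; the function name and the numbers are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_mse, new_mse):
    """Percentage reduction in MSE relative to the baseline (lower MSE is better)."""
    if baseline_mse <= 0:
        raise ValueError("baseline_mse must be positive")
    return 100.0 * (baseline_mse - new_mse) / baseline_mse


if __name__ == "__main__":
    # Hypothetical values for illustration only; not taken from the paper.
    print(round(relative_improvement(0.0500, 0.0450), 2))
```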

Repository structure

ED-MTT
│   README.md
│   Engagement_Labels.txt
│   ED-MTT.ipynb
│
├───code
│       dataloader.py
│       model.py
│       train.py
│       test.py
│       fix_path.py
│       utils.py
│       requirements.txt
│
└───configs
        batchnorm_default.yaml
        sweep.yaml

Running the Code

We used PyTorch Lightning together with Weights & Biases to train the model and manage the experiments. The ED-MTT.ipynb Colab notebook explains in detail how to:

Load data and pre-trained weights.
Train the model from scratch.
Manage experiments and hyper-parameter search with wandb.
Reproduce the results presented in the paper.

Owner

Onur Çopur
