Unimodal Face Classification with Multimodal Training

This is a PyTorch implementation of the following paper:

Unimodal Face Classification with Multimodal Training

Wenbin Teng (Boston University), Chongyang Bai (Dartmouth College)

Abstract: We propose a Multimodal Training Unimodal Test (MTUT) framework for robust face classification, which exploits the cross-modality relationship during training and uses it to complement the imperfect single-modality input during testing. Technically, during training, the framework (1) builds both intra-modality and cross-modality autoencoders with the aid of facial attributes to learn latent embeddings as multimodal descriptors, and (2) proposes a novel multimodal embedding divergence loss to align the heterogeneous features from different modalities, which also adaptively prevents a useless modality (if any) from confusing the model. This way, the learned autoencoders can generate robust embeddings for single-modality face classification at test time. We evaluate our framework on two face classification datasets and two kinds of testing input: (1) poor-condition images and (2) point clouds or 3D face meshes, when both 2D and 3D modalities are available for training.

The proposed method uses a 2D encoder and a 3D encoder to extract an embedding for each modality. The divergence between the two embeddings is minimized adaptively, guided by the classification loss of each modality. Depending on the testing modality, the corresponding decoder reconstructs the 2D and 3D inputs from the feature embeddings. An overview of the proposed network is shown in the following picture:
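To make the idea of adaptive divergence minimization more concrete, here is a minimal, framework-free sketch in plain Python. It is not taken from this repository or the paper: the function names, the `beta` parameter, and the exponential form of the weight are all assumptions chosen for illustration. The sketch pulls one modality's embedding toward the other's only when the other modality currently classifies better, which is one plausible way to keep a weak modality from dragging a stronger one down.

```python
import math


def adaptive_weight(loss_self, loss_other, beta=2.0):
    """Hypothetical weight for aligning one modality to the other.

    Assumption (not from the paper): the weight is positive only when
    this modality's classification loss is higher than the other's,
    and it grows with the size of that gap.
    """
    gap = loss_self - loss_other
    return math.exp(beta * gap) - 1.0 if gap > 0 else 0.0


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def divergence_losses(emb_2d, emb_3d, cls_loss_2d, cls_loss_3d, beta=2.0):
    """Return (alignment loss for the 2D branch, for the 3D branch)."""
    dist = squared_distance(emb_2d, emb_3d)
    w_2d = adaptive_weight(cls_loss_2d, cls_loss_3d, beta)
    w_3d = adaptive_weight(cls_loss_3d, cls_loss_2d, beta)
    return w_2d * dist, w_3d * dist


# Illustrative values only: the 2D branch classifies worse (higher loss),
# so only the 2D branch receives a non-zero alignment loss.
loss_2d, loss_3d = divergence_losses(
    emb_2d=[0.2, 0.9, -0.4],
    emb_3d=[0.1, 0.7, -0.1],
    cls_loss_2d=1.2,
    cls_loss_3d=0.4,
)
print(loss_2d, loss_3d)
```

In a PyTorch training loop the same arithmetic would operate on tensors, and the weights would typically be treated as constants so that no gradient flows through them; that detail is likewise an assumption rather than something stated in the text above.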
