Rethinking Portrait Matting with Privacy Preserving

Overview

Rethinking Portrait Matting with Privacy Preserving

This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, and Dacheng Tao

Introduction | PPT and P3M-10k | P3M-Net | P3M-CP | Results | Inference code | Statement


📮 News

[2022-03-31]: Publish the inference code and the pretrained model (Google Drive | Baidu Wangpan (pw: hxxy)) that can be used to test with our SOTA model P3M-Net(ViTAE-S) on your own privacy-preserving or normal portrait images.

[2021-12-06]: Publish the P3M-10k dataset.

[2021-11-21]: Publish the conference paper ACM MM 2021 "Privacy-preserving Portrait Matting". The code and data are available at github repo.

Other applications of ViTAE Transformer include: image classification | object detection | semantic segmentation | animal pose segmentation | remote sensing

Introduction

Recently, there has been an increasing concern about the privacy issue raised by using personally identifiable information in machine learning. However, previous portrait matting methods were all based on identifiable portrait images.

To fill the gap, we present P3M-10k in this paper, which is the first large-scale anonymized benchmark for Privacy-Preserving Portrait Matting. P3M-10k consists of 10,000 high-resolution face-blurred portrait images along with high-quality alpha mattes. We systematically evaluate both trimap-free and trimap-based matting methods on P3M-10k and find that existing matting methods show different generalization capabilities when following the Privacy-Preserving Training (PPT) setting, 𝑖.𝑒., training on face-blurred images and testing on arbitrary images.

To devise a better trimap-free portrait matting model, we propose P3M-Net, consisting of three carefully designed integration modules that can perform privacy-insensitive semantic perception and detail-reserved matting simultaneously. We further design multiple variants of P3MNet with different CNN and transformer backbones and identify the difference of their generalization abilities.

To further mitigate this issue, we devise a simple yet effective Copy and Paste strategy (P3M-CP) that can borrow facial information from public celebrity images without privacy concerns and direct the network to reacquire the face context at both data and feature level. P3M-CP only brings a few additional computations during training, while enabling the matting model to process both face-blurred and normal images without extra effort during inference.

Extensive experiments on P3M-10k demonstrate the superiority of P3M-Net over state-of-the-art methods and the effectiveness of P3MCP in improving the generalization ability of P3M-Net, implying a great significance of P3M for future research and real-world applications.

PPT Setting and P3M-10k Dataset

PPT Setting: Due to the privacy concern, we propose the Privacy-Preserving Training (PPT) setting in portrait matting, 𝑖.𝑒., training on privacy-preserved images (𝑒.𝑔., processed by face obfuscation) and testing on arbitraty images with or without privacy content. As an initial step towards privacy-preserving portrait matting problem, we only define the identifiable faces in frontal and some profile portrait images as the private content in this work.

P3M-10k Dataset: To further explore the effect of PPT setting, we establish the first large-scale privacy-preserving portrait matting benchmark named P3M-10k. It contains 10,000 annonymized high-resolution portrait images by face obfuscation along with high-quality ground truth alpha mattes. Specifically, we carefully collect, filter, and annotate about 10,000 high-resolution images from the Internet with free use license. There are 9,421 images in the training set and 500 images in the test set, denoted as P3M-500-P. In addition, we also collect and annotate another 500 public celebrity images from the Internet without face obfuscation, to evaluate the performance of matting models under the PPT setting on normal portrait images, denoted as P3M-500-NP. We show some examples as below, where (a) is from the training set, (b) is from P3M-500-P, and (c) is from P3M-500-NP.

P3M-10k and the facemask are now published!! You can get access to it from the following links, please make sure that you have read and agreed to the agreement. Note that the facemask is not used in our work. So it's optional to download it.

Dataset

Dataset Link
(Google Drive)

Dataset Link
(Baidu Wangpan 百度网盘)

Dataset Release Agreement
P3M-10k Link Link (pw: fgmc) Agreement (MIT License)
P3M-10k facemask (optional) Link Link (pw: f772) Agreement (MIT License)

P3M-Net and Variants

Our P3M-Net network models the comprehensive interactions between the sharing encoder and two decoders through three carefully designed integration modules, i.e., 1) a tripartite-feature integration (TFI) module to enable the interaction between encoder and two decoders; 2) a deep bipartite-feature integration (dBFI) module to enhance the interaction between the encoder and segmentation decoder; and 3) a shallow bipartitefeature integration (sBFI) module to promote the interaction between the encoder and matting decoder.

We design three variants of P3M Basic Blocks based on CNN and vision transformers, namely P3M-Net(ResNet-34), P3M-Net(Swin-T), P3M-Net(ViTAE-S). We leverage the ability of transformers in modeling long-range dependency to extract more accurate global information and the locality modelling ability to reserve lots of details in the transition areas. The structures are shown in the following figures.

Here we provide the P3M-Net(ViTAE-S) model we pretrained on P3M-10k.

Model Google Drive Baidu Wangpan(百度网盘)
P3M-Net(ViTAE-S) Link Link (pw: hxxy)

P3M-CP

To further improve the generalization ability of P3M-Net, we devise a simple yet effective Copy and Paste strategy (P3M-CP) that can borrow facial information from publicly available celebrity images without privacy concerns and guide the network to reacquire the face context at both data and feature level, namely P3M-ICP and P3M-FCP. The pipeline is shown in the following figure.

Results

We test our network on our proposed P3M-500-P and P3M-500-NP and compare with previous SOTA methods, we list the results as below.

Inference Code - How to Test on Your Images

Here we provide the procedure of testing on sample images by our pretrained P3M-Net(ViTAE-S) model:

  1. Setup environment following this instruction page;

  2. Insert the path REPOSITORY_ROOT_PATH in the file core/config.py;

  3. Download the pretrained P3M-Net(ViTAE-S) model from here (Google Drive | Baidu Wangpan (pw: hxxy))) and unzip to the folder models/pretrained/;

  4. Save your sample images in folder samples/original/.;

  5. Setup parameters in the file scripts/test_samples.sh and run by:

    chmod +x scripts/test_samples.sh

    scripts/test_samples.sh;

  6. The results of alpha matte and transparent color image will be saved in folder samples/result_alpha/. and samples/result_color/..

We show some sample images, the predicted alpha mattes, and their transparent results as below. We use the pretrained P3M-Net(ViTAE-S) model from section P3M-Net and Variants with `Hybrid (1 & 1/2)` test strategy.

Statement

If you are interested in our work, please consider citing the following:

@article{rethink_p3m,
  title={Rethinking Portrait Matting with Pirvacy Preserving},
  author={Ma, Sihan and Li, Jizhizi and Zhang, Jing and Zhang, He and Tao, Dacheng},
  publisher = {arXiv},
  doi={10.48550/ARXIV.2203.16828},
  year={2022}
}

This project is under MIT licence.

For further questions, please contact Sihan Ma at [email protected] or Jizhizi Li at [email protected].

Relevant Projects

[1] Privacy-preserving Portrait Matting, ACM MM, 2021 | Paper | Github
     Jizhizi Li, Sihan Ma, Jing Zhang, Dacheng Tao

[2] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
     Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

[3] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

Few-Shot-Intent-Detection includes popular challenging intent detection datasets with/without OOS queries and state-of-the-art baselines and results.

Few-Shot-Intent-Detection Few-Shot-Intent-Detection is a repository designed for few-shot intent detection with/without Out-of-Scope (OOS) intents. It

Jian-Guo Zhang 73 Dec 26, 2022
The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

sianchen 22 May 28, 2022
Code for our CVPR 2021 paper "MetaCam+DSCE"

Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification (CVPR'21) Introduction Code for our CVPR 2021

FlyingRoastDuck 59 Oct 31, 2022
PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations This is the official PyTorch implementation

Multimedia Technology and Telecommunication Lab 42 Nov 09, 2022
A PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detection.

R-YOLOv4 This is a PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detect

94 Dec 03, 2022
Alignment Attention Fusion framework for Few-Shot Object Detection

AAF framework Framework generalities This repository contains the code of the AAF framework proposed in this paper. The main idea behind this work is

Pierre Le Jeune 20 Dec 16, 2022
Official implementation for (Refine Myself by Teaching Myself : Feature Refinement via Self-Knowledge Distillation, CVPR-2021)

FRSKD Official implementation for Refine Myself by Teaching Myself : Feature Refinement via Self-Knowledge Distillation (CVPR-2021) Requirements Pytho

75 Dec 28, 2022
Learning trajectory representations using self-supervision and programmatic supervision.

Trajectory Embedding for Behavior Analysis (TREBA) Implementation from the paper: Jennifer J. Sun, Ann Kennedy, Eric Zhan, David J. Anderson, Yisong Y

58 Jan 06, 2023
Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN)

Flickr-Faces-HQ Dataset (FFHQ) Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative

NVIDIA Research Projects 2.9k Dec 28, 2022
Deep learned, hardware-accelerated 3D object pose estimation

Isaac ROS Pose Estimation Overview This repository provides NVIDIA GPU-accelerated packages for 3D object pose estimation. Using a deep learned pose e

NVIDIA Isaac ROS 41 Dec 18, 2022
The implementation of the lifelong infinite mixture model

Lifelong infinite mixture model 📋 This is the implementation of the Lifelong infinite mixture model 📋 Accepted by ICCV 2021 Title : Lifelong Infinit

Fei Ye 5 Oct 20, 2022
Art Project "Schrödinger's Game of Life"

Repo of the project "Team Creative Quantum AI: Schrödinger's Game of Life" Installation new conda env: conda create --name qcml python=3.8 conda activ

ℍ◮ℕℕ◭ℍ ℝ∈ᛔ∈ℝ 2 Sep 15, 2022
SAPIEN Manipulation Skill Benchmark

ManiSkill Benchmark SAPIEN Manipulation Skill Benchmark (abbreviated as ManiSkill, pronounced as "Many Skill") is a large-scale learning-from-demonstr

Hao Su's Lab, UCSD 107 Jan 08, 2023
Anonymize BLM Protest Images

Anonymize BLM Protest Images This repository automates @BLMPrivacyBot, a Twitter bot that shows the anonymized images to help keep protesters safe. Us

Stanford Machine Learning Group 40 Oct 13, 2022
Code base of object detection

rmdet code base of object detection. 环境安装: 1. 安装conda python环境 - `conda create -n xxx python=3.7/3.8` - `conda activate xxx` 2. 运行脚本,自动安装pytorch1

3 Mar 08, 2022
这是一个利用facenet和retinaface实现人脸识别的库,可以进行在线的人脸识别。

Facenet+Retinaface:人脸识别模型在Pytorch当中的实现 目录 注意事项 Attention 所需环境 Environment 文件下载 Download 预测步骤 How2predict 参考资料 Reference 注意事项 该库中包含了两个网络,分别是retinaface和

Bubbliiiing 102 Dec 30, 2022
Neural Re-rendering for Full-frame Video Stabilization

NeRViS: Neural Re-rendering for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo

Yu-Lun Liu 9 Jun 17, 2022
Constrained Language Models Yield Few-Shot Semantic Parsers

Constrained Language Models Yield Few-Shot Semantic Parsers This repository contains tools and instructions for reproducing the experiments in the pap

Microsoft 43 Nov 23, 2022
Multi-Output Gaussian Process Toolkit

Multi-Output Gaussian Process Toolkit Paper - API Documentation - Tutorials & Examples The Multi-Output Gaussian Process Toolkit is a Python toolkit f

GAMES 113 Nov 25, 2022
🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo

Hugging Face 77.2k Jan 02, 2023