As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).

Overview

HAKE-Action

HAKE-Action (TensorFlow) is a project to open the SOTA action understanding studies based on our Human Activity Knowledge Engine. It includes reproduced SOTA models and their HAKE-enhanced versions. HAKE-Action is authored by Yong-Lu Li, Xinpeng Liu, Liang Xu, Cewu Lu. Currently, it is manintained by Yong-Lu Li, Xinpeng Liu and Liang Xu.

News: (2021.10.06) Our extended version of SymNet is accepted by TPAMI! Paper and code are coming soon.

(2021.2.7) Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Description]

Full demo: [YouTube], [bilibili]

(2021.1.15) Our extended version of TIN (Transferable Interactiveness Network) is accepted by TPAMI! New paper and code will be released soon.

(2020.10.27) The code of IDN (Paper) in NeurIPS'20 is released!

(2020.6.16) Our larger version HAKE-Large (>120K images, activity and part state labels) is released!

We released the HAKE-HICO (image-level part state labels upon HICO) and HAKE-HICO-DET (instance-level part state labels upon HICO-DET). The corresponding data can be found here: HAKE Data.

  • Paper is here.
  • More data and part states (e.g., upon AVA, more kinds of action categories, more rare actions...) are coming.
  • We will keep updating HAKE-Action to include more SOTA models and their HAKE-enhanced versions.

Data Mode

  • HAKE-HICO (PaStaNet* mode in paper): image-level, add the aggression of all part states in an image (belong to one or multiple active persons), compared with original HICO, the only additional labels are image-level human body part states.

  • HAKE-HICO-DET (PaStaNet* in paper): instance-level, add part states for each annotated persons of all images in HICO-DET, the only additional labels are instance-level human body part states.

  • HAKE-Large (PaStaNet in paper): contains more than 120K images, action labels and the corresponding part state labels. The images come from the existing action datasets and crowdsourcing. We mannully annotated all the active persons with our novel part-level semantics.

  • GT-HAKE (GT-PaStaNet* in paper): GT-HAKE-HICO and G-HAKE-HICO-DET. It means that we use the part state labels as the part state prediction. That is, we can perfectly estimate the body part states of a person. Then we use them to infer the instance activities. This mode can be seen as the upper bound of our HAKE-Action. From the results below we can find that, the upper bound is far beyond the SOTA performance. Thus, except for the current study on the conventional instance-level method, continue promoting part-level method based on HAKE would be a very promising direction.

Notion

Activity2Vec and PaSta-R are our part state based modules, which operate action inference based on part semantics, different from previous instance semantics. For example, Pairwise + HAKE-HICO pre-trained Activity2Vec + Linear PaSta-R (the seventh row) achieves 45.9 mAP on HICO. More details can be found in our CVPR2020 paper: PaStaNet: Toward Human Activity Knowledge Engine.

Code

The two versions of HAKE-Action are relesased in two branches of this repo:

Models on HICO

Instance-level +Activity2Vec +PaSta-R mAP [email protected] [email protected] [email protected]
R*CNN - - 28.5 - - -
Girdhar et.al. - - 34.6 - - -
Mallya et.al. - - 36.1 - - -
Pairwise - - 39.9 13.0 19.8 22.3
- HAKE-HICO Linear 44.5 26.9 30.0 30.7
Mallya et.al. HAKE-HICO Linear 45.0 26.5 29.1 30.3
Pairwise HAKE-HICO Linear 45.9 26.2 30.6 31.8
Pairwise HAKE-HICO MLP 45.6 26.0 30.8 31.9
Pairwise HAKE-HICO GCN 45.6 25.2 30.0 31.4
Pairwise HAKE-HICO Seq 45.9 25.3 30.2 31.6
Pairwise HAKE-HICO Tree 45.8 24.9 30.3 31.8
Pairwise HAKE-Large Linear 46.3 24.7 31.8 33.1
Pairwise HAKE-Large Linear 46.3 24.7 31.8 33.1
Pairwise GT-HAKE-HICO Linear 65.6 47.5 55.4 56.6

Models on HICO-DET

Using Object Detections from iCAN

Instance-level +Activity2Vec +PaSta-R Full(def) Rare(def) None-Rare(def) Full(ko) Rare(ko) None-Rare(ko)
iCAN - - 14.84 10.45 16.15 16.26 11.33 17.73
TIN - - 17.03 13.42 18.11 19.17 15.51 20.26
iCAN HAKE-HICO-DET Linear 19.61 17.29 20.30 22.10 20.46 22.59
TIN HAKE-HICO-DET Linear 22.12 20.19 22.69 24.06 22.19 24.62
TIN HAKE-Large Linear 22.65 21.17 23.09 24.53 23.00 24.99
TIN GT-HAKE-HICO-DET Linear 34.86 42.83 32.48 35.59 42.94 33.40

Models on AVA (Frame-based)

Method +Activity2Vec +PaSta-R mAP
AVA-TF-Baseline - - 11.4
LFB-Res-50-baseline - - 22.2
LFB-Res-101-baseline - - 23.3
AVA-TF-Baeline HAKE-Large Linear 15.6
LFB-Res-50-baseline HAKE-Large Linear 23.4
LFB-Res-101-baseline HAKE-Large Linear 24.3

Models on V-COCO

Method +Activity2Vec +PaSta-R AP(role), Scenario 1 AP(role), Scenario 2
iCAN - - 45.3 52.4
TIN - - 47.8 54.2
iCAN HAKE-Large Linear 49.2 55.6
TIN HAKE-Large Linear 51.0 57.5

Training Details

We first pre-train the Activity2Vec and PaSta-R with activities and PaSta labels. Then we change the last FC in PaSta-R to fit the activity categories of the target dataset. Finally, we freeze Activity2Vec and fine-tune PaSta-R on the train set of the target dataset. Here, HAKE works like the ImageNet and Activity2Vec is used as a pre-trained knowledge engine to promote other tasks.

Citation

If you find our work useful, please consider citing:

@inproceedings{li2020pastanet,
  title={PaStaNet: Toward Human Activity Knowledge Engine},
  author={Li, Yong-Lu and Xu, Liang and Liu, Xinpeng and Huang, Xijie and Xu, Yue and Wang, Shiyi and Fang, Hao-Shu and Ma, Ze and Chen, Mingyang and Lu, Cewu},
  booktitle={CVPR},
  year={2020}
}
@inproceedings{li2019transferable,
  title={Transferable Interactiveness Knowledge for Human-Object Interaction Detection},
  author={Li, Yong-Lu and Zhou, Siyuan and Huang, Xijie and Xu, Liang and Ma, Ze and Fang, Hao-Shu and Wang, Yanfeng and Lu, Cewu},
  booktitle={CVPR},
  year={2019}
}
@inproceedings{lu2018beyond,
  title={Beyond holistic object recognition: Enriching image understanding with part states},
  author={Lu, Cewu and Su, Hao and Li, Yonglu and Lu, Yongyi and Yi, Li and Tang, Chi-Keung and Guibas, Leonidas J},
  booktitle={CVPR},
  year={2018}
}

HAKE

HAKE[website] is a new large-scale knowledge base and engine for human activity understanding. HAKE provides elaborate and abundant body part state labels for active human instances in a large scale of images and videos. With HAKE, we boost the action understanding performance on widely-used human activity benchmarks. Now we are still enlarging and enriching it, and looking forward to working with outstanding researchers around the world on its applications and further improvements. If you have any pieces of advice or interests, please feel free to contact Yong-Lu Li ([email protected]).

If you get any problems or if you find any bugs, don't hesitate to comment on GitHub or make a pull request!

HAKE-Action is freely available for free non-commercial use, and may be redistributed under these conditions. For commercial queries, please drop an e-mail. We will send the detail agreement to you.

Owner
Yong-Lu Li
Ph.D. CV_Robotics
Yong-Lu Li
Official PyTorch Implementation of SSMix (Findings of ACL 2021)

SSMix: Saliency-based Span Mixup for Text Classification (Findings of ACL 2021) Official PyTorch Implementation of SSMix | Paper Abstract Data augment

Clova AI Research 52 Dec 27, 2022
SegNet-Basic with Keras

SegNet-Basic: What is Segnet? Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-wise Image Segmentation Segnet = (Encoder + Decoder)

Yad Konrad 81 Jun 30, 2022
A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

Somnus `Chen 2 Jun 09, 2022
A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

Andy Brock 2.1k Dec 29, 2022
Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

J K Terry 32 Nov 09, 2021
Efficient neural networks for analog audio effect modeling

micro-TCN Efficient neural networks for audio effect modeling

Christian Steinmetz 94 Dec 29, 2022
A fast, dataset-agnostic, deep visual search engine for digital art history

imgs.ai imgs.ai is a fast, dataset-agnostic, deep visual search engine for digital art history based on neural network embeddings. It utilizes modern

Fabian Offert 5 Dec 14, 2022
A command line simple note taking app

Why yet another note taking program? note was designed with a very specific target in mind: me, and my 2354 scraps of paper. It runs from the command

64 Nov 20, 2022
G-NIA model from "Single Node Injection Attack against Graph Neural Networks" (CIKM 2021)

Single Node Injection Attack against Graph Neural Networks This repository is our Pytorch implementation of our paper: Single Node Injection Attack ag

Shuchang Tao 18 Nov 21, 2022
A deep learning object detector framework written in Python for supporting Land Search and Rescue Missions.

AIR: Aerial Inspection RetinaNet for supporting Land Search and Rescue Missions AIR is a deep learning based object detection solution to automate the

Accenture 13 Dec 22, 2022
Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images (ICCV 2021)

Table of Content Introduction Getting Started Datasets Installation Experiments Training & Testing Pretrained models Texture fine-tuning Demo Toward R

VinAI Research 42 Dec 05, 2022
A PoC Corporation Relationship Knowledge Graph System on top of Nebula Graph.

Corp-Rel is a PoC of Corpartion Relationship Knowledge Graph System. It's built on top of the Open Source Graph Database: Nebula Graph with a dataset

Wey Gu 20 Dec 11, 2022
LEAP: Learning Articulated Occupancy of People

LEAP: Learning Articulated Occupancy of People Paper | Video | Project Page This is the official implementation of the CVPR 2021 submission LEAP: Lear

Neural Bodies 60 Nov 18, 2022
Official code for the ICLR 2021 paper Neural ODE Processes

Neural ODE Processes Official code for the paper Neural ODE Processes (ICLR 2021). Abstract Neural Ordinary Differential Equations (NODEs) use a neura

Cristian Bodnar 50 Oct 28, 2022
DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021)

DPT This repo is the official implementation of DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021). We provide code and model

CASIA-IVA-Lab 111 Dec 21, 2022
Easy way to add GoogleMaps to Flask applications. maintainer: @getcake

Flask Google Maps Easy to use Google Maps in your Flask application requires Jinja Flask A google api key get here Contribute To contribute with the p

Flask Extensions 611 Dec 05, 2022
Source code for CAST - Crisis Domain Adaptation Using Sequence-to-sequence Transformers (Accepted to ISCRAM 2021, CorePaper).

Source code for CAST: Crisis Domain Adaptation UsingSequence-to-sequenceTransformers (Paper, BibTeX, Accepted to ISCRAM 2021, CorePaper) Quick start D

Congcong Wang 0 Jul 14, 2021
A toolkit for controlling Euro Truck Simulator 2 with python to develop self-driving algorithms.

europilot Overview Europilot is an open source project that leverages the popular Euro Truck Simulator(ETS2) to develop self-driving algorithms. A con

1.4k Jan 04, 2023
Code release for "Self-Tuning for Data-Efficient Deep Learning" (ICML 2021)

Self-Tuning for Data-Efficient Deep Learning This repository contains the implementation code for paper: Self-Tuning for Data-Efficient Deep Learning

THUML @ Tsinghua University 101 Dec 11, 2022