A custom DeepStack model for detecting 16 human actions.

Overview

DeepStack_ActionNET

This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for detecting 16 human actions present in the ActionNET Dataset dataset. Also included in this repository is that dataset with the YOLO annotations.

>> Watch Video Demo

  • Download DeepStack Model and Dataset
  • Create API and Detect Objects
  • Discover more Custom Models
  • Train your own Model

Download DeepStack Model and Dataset

You can download the pre-trained DeepStack_ActionNET model and the annotated dataset via the links below.

Create API and Detect Actions

The Trained Model can detect the following actions in images and videos.

  • calling
  • clapping
  • cycling
  • dancing
  • drinking
  • eating
  • fighting
  • hugging
  • kissing
  • laughing
  • listening-to-music
  • running
  • sitting
  • sleeping
  • texting
  • using-laptop

To start detecting, follow the steps below

  • Install DeepStack: Install DeepStack AI Server with instructions on DeepStack's documentation via https://docs.deepstack.cc

  • Download Custom Model: Download the trained custom model actionnetv2.pt from this GitHub release. Create a folder on your machine and move the downloaded model to this folder.

    E.g A path on Windows Machine C\Users\MyUser\Documents\DeepStack-Models, which will make your model file path C\Users\MyUser\Documents\DeepStack-Models\actionnet.pt

  • Run DeepStack: To run DeepStack AI Server with the custom ActionNET model, run the command that applies to your machine as detailed on DeepStack's documentation linked here.

    E.g

    For a Windows version, you run the command below

    deepstack --MODELSTORE-DETECTION "C\Users\MyUser\Documents\DeepStack-Models" --PORT 80

    For a Linux machine

    sudo docker run -v /home/MyUser/Documents/DeepStack-Models -p 80:5000 deepquestai/deepstack

    Once DeepStack runs, you will see a log like the one below in your Terminal/Console

    That means DeepStack is running your custom actionnet.pt model and now ready to start detecting actions images via the API endpoint http://localhost:80/v1/vision/custom/actionnet or http://your_machine_ip:80/v1/vision/custom/actionnet

  • Detect actions in image: You can detect objects in an image by sending a POST request to the url mentioned above with the paramater image set to an image using any proggramming language or with a tool like POSTMAN. For the purpose of this repository, we have provided a sample Python code below.

    • A sample image can be found in images/test.jpg of this repository

    • Install Python and install the DeepStack Python SDK via the command below

      pip install deepstack_sdk
    • Run the Python file detect.py in this repository.

      python detect.py
    • After the code runs, you will find a new image in images/test_detected.jpg with the detection visualized, with the following results printed in the Terminal/Console.

      Name: dancing
      Confidence: 0.91482425
      x_min: 270
      x_max: 516
      y_min: 18
      y_max: 480
      -----------------------
      

    • You can try running action detection for other images.

Discover more Custom Models

For more custom DeepStack models that has been trained and ready to use, visit the Custom Models sample page on DeepStack's documentation https://docs.deepstack.cc/custom-models-samples/ .

Train your own Model

If you will like to train a custom model yourself, follow the instructions below.

  • Prepare and Annotate: Collect images on and annotate object(s) you plan to detect as detailed here
  • Train your Model: Train the model as detailed here
You might also like...
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

🎓Automatically Update CV Papers Daily using Github Actions (Update at 12:00 UTC Every Day)

An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions
An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions

Agar.io_Q-Learning_AI An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available act

Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.
Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.

Tensorflow-Mobile-Generic-Object-Localizer Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label. Ori

Python TFLite scripts for detecting objects of any class in an image without knowing their label.
Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples / ICLR 2018

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples This project is for the paper "Training Confidence-Calibrated Clas

CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis"

Beyond the Spectrum Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis" by Yang He, Ning Yu, Margret Keu

Comments
  • How to download a Custom Model action net v2.pt in Deepstack Server Docker?

    How to download a Custom Model action net v2.pt in Deepstack Server Docker?

    Tell me how to load a custom action network model correctly v2.pt in the Deepstack server docker? Did I do the right thing?

    DeepStack: Version 2021.09.01 I created the /model store/detection folders and threw the action net file there v2.pt image

    After the reboot, I got a v1/vision/custom/action net v2 entry in the logs. Did I do the right thing? It just confuses me that there is a v1/vision/custom/action net v2 entry in the logs, and the rest are written like this.

    /v1/vision/face
    /v1/vision/face/recognize
    ....
    

    image

    Is it necessary to enter here as in the case of face and object recognition? image image

    opened by DivanX10 0
Releases(v2)
  • v2(Aug 26, 2021)

    Version 2 of the DeepStack Custom Model for object detection API to detect human actions in images and videos. It detects the following actions

    • calling
    • clapping
    • cycling
    • dancing
    • drinking
    • eating
    • fighting
    • hugging
    • kissing
    • laughing
    • listening-to-music
    • running
    • sitting
    • sleeping
    • texting
    • using-laptop

    Download the model actionnetv2.pt from the Assets section (below) in this release.

    This Model is a YOLOv5x DeepStack custom model and that was trained for 150 epochs, generating a best model with the following evaluation result.

    [email protected]: 0.995 [email protected]: 0.913

    Source code(tar.gz)
    Source code(zip)
    actionnetv2.pt(169.41 MB)
  • v1(Aug 14, 2021)

    A DeepStack Custom Model for object detection API to detect human actions in images and videos. It detects the following actions

    • calling
    • clapping
    • cycling
    • dancing
    • drinking
    • eating
    • fighting
    • hugging
    • kissing
    • laughing
    • listening-to-music
    • running
    • sitting
    • sleeping
    • texting
    • using-laptop

    Download the model actionnet.pt from the Assets section (below) in this release.

    This Model is a YOLOv5x DeepStack custom model and that was trained for 150 epochs, generating a best model with the following evaluation result.

    [email protected]: 0.9858 [email protected]: 0.8051

    Source code(tar.gz)
    Source code(zip)
    actionnet.pt(169.41 MB)
Owner
MOSES OLAFENWA
Software Engineer @Microsoft , A self-Taught computer programmer, Deep Learning, Computer Vision Researcher and Developer. Creator of ImageAI.
MOSES OLAFENWA
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools

Hugging Face Optimum 🤗 Optimum is an extension of 🤗 Transformers, providing a set of performance optimization tools enabling maximum efficiency to t

Hugging Face 842 Dec 30, 2022
Dataset used in "PlantDoc: A Dataset for Visual Plant Disease Detection" accepted in CODS-COMAD 2020

PlantDoc: A Dataset for Visual Plant Disease Detection This repository contains the Cropped-PlantDoc dataset used for benchmarking classification mode

Pratik Kayal 109 Dec 29, 2022
Official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer"

[AAAI2022] UCTransNet This repo is the official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspectiv

Haonan Wang 199 Jan 03, 2023
Vrcwatch - Supply the local time to VRChat as Avatar Parameters through OSC

English: README-EN.md VRCWatch VRCWatch は、VRChat 内のアバター向けに現在時刻を送信するためのプログラムです。 使

Kosaki Mezumona 17 Nov 30, 2022
AttGAN: Facial Attribute Editing by Only Changing What You Want (IEEE TIP 2019)

News 11 Jan 2020: We clean up the code to make it more readable! The old version is here: v1. AttGAN TIP Nov. 2019, arXiv Nov. 2017 TensorFlow impleme

Zhenliang He 568 Dec 14, 2022
LineBoard - Python+React+MySQL-白板即時系統改善人群行為

LineBoard-白板即時系統改善人群行為 即時顯示實驗室的使用狀況,並遠端預約排隊,以此來改善人們的工作效率 程式架構 運作流程 使用者先至該實驗室網站預約

Bo-Jyun Huang 1 Feb 22, 2022
Dataloader tools for language modelling

Installation: pip install lm_dataloader Design Philosophy A library to unify lm dataloading at large scale Simple interface, any tokenizer can be inte

5 Mar 25, 2022
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Daft-Exprt - PyTorch Implementation PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis The

Keon Lee 47 Dec 18, 2022
Code of the paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodner and Joachim Denzler

Part Detector Discovery This is the code used in our paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodne

Computer Vision Group Jena 17 Feb 22, 2022
A small demonstration of using WebDataset with ImageNet and PyTorch Lightning

A small demonstration of using WebDataset with ImageNet and PyTorch Lightning This is a small repo illustrating how to use WebDataset on ImageNet. usi

50 Dec 16, 2022
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper

Emanuel Metzenthin 3 Dec 16, 2022
Official code release for "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis"

GRAF This repository contains official code for the paper GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. You can find detailed usage i

349 Dec 29, 2022
Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Semi-supervised Domain Adaptive Structure Learning - ASDA This repo contains the source code and dataset for our ASDA paper. Illustration of the propo

3 Dec 13, 2021
Camview - A CLI-tool used to stream CCTV online footage based on URL params

CamView A CLI-tool used to stream CCTV online footage based on URL params Get St

Finn Lancaster 54 Dec 09, 2022
Implementation of Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

acLSTM_motion This folder contains an implementation of acRNN for the CMU motion database written in Pytorch. See the following links for more backgro

Yi_Zhou 61 Sep 07, 2022
AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022
Generate Cartoon Images using Generative Adversarial Network

AvatarGAN ✨ Generate Cartoon Images using DC-GAN Deep Convolutional GAN is a generative adversarial network architecture. It uses a couple of guidelin

Aakash Jhawar 50 Dec 29, 2022
Measures input lag without dedicated hardware, performing motion detection on recorded or live video

What is InputLagTimer? This tool can measure input lag by analyzing a video where both the game controller and the game screen can be seen on a webcam

Bruno Gonzalez 4 Aug 18, 2022
A Lightweight Experiment & Resource Monitoring Tool 📺

Lightweight Experiment & Resource Monitoring 📺 "Did I already run this experiment before? How many resources are currently available on my cluster?"

170 Dec 28, 2022
🎃 Core identification module of AI powerful point reading system platform.

ppReader-Kernel Intro Core identification module of AI powerful point reading system platform. Usage 硬件: Windows10、GPU:nvdia GTX 1060 、普通RBG相机 软件: con

CrashKing 1 Jan 11, 2022