Search and filter videos based on objects that appear in them using convolutional neural networks

Overview

Thingscoop: Utility for searching and filtering videos based on their content

Description

Thingscoop is a command-line utility for analyzing videos semantically: searching, filtering, and describing videos based on objects, places, and other things that appear in them.

When you first run thingscoop on a video file, it uses a convolutional neural network to create an "index" of what's contained in every second of the input by repeatedly performing image classification on a frame-by-frame basis. Once an index for a video file has been created, you can search (i.e. get the start and end times of the regions in the video matching the query) and filter (i.e. create a supercut of the matching regions) the input using arbitrary queries. Thingscoop uses a very basic query language that lets you compose queries testing for the presence or absence of labels with the logical operators ! (not), || (or), and && (and). For example, to search a video for regions where the sky is present and the ocean is absent: thingscoop search 'sky && !ocean' <file>.
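
To illustrate the semantics, a query like this can be evaluated against the set of labels detected in a frame. The following is a minimal Python sketch (not thingscoop's actual parser), assuming labels are plain strings:

```python
import re

def eval_query(query, labels):
    """Evaluate a boolean label query like 'sky && !ocean' against the
    set of labels detected in a frame. Hypothetical sketch, not
    thingscoop's real parser."""
    # Tokenize into operators, parens, and (possibly multi-word) labels
    tokens = re.findall(r'&&|\|\||!|\(|\)|[\w ]+', query)
    expr_parts = []
    for tok in tokens:
        tok = tok.strip()
        if not tok:
            continue
        if tok == '&&':
            expr_parts.append('and')
        elif tok == '||':
            expr_parts.append('or')
        elif tok == '!':
            expr_parts.append('not')
        elif tok in '()':
            expr_parts.append(tok)
        else:
            # A bare label tests for membership in the frame's label set
            expr_parts.append(repr(tok) + ' in labels')
    return eval(' '.join(expr_parts), {'labels': labels})
```

For example, eval_query('sky && !ocean', {'sky', 'mountain'}) is true, while eval_query('sky && !ocean', {'sky', 'ocean'}) is false.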

Right now two models are supported by thingscoop: vgg_imagenet uses the architecture described in "Very Deep Convolutional Networks for Large-Scale Image Recognition" to recognize objects from the ImageNet database, and googlenet_places uses the architecture described in "Going Deeper with Convolutions" to recognize settings and places from the MIT Places database. You can specify which model you'd like to use by running thingscoop models use <model>, where <model> is either vgg_imagenet or googlenet_places. More models will be added soon.

Thingscoop is based on Caffe, an open-source deep learning framework.

Installation

  1. Install ffmpeg, imagemagick, and ghostscript: brew install ffmpeg imagemagick ghostscript (Mac OS X) or apt-get install ffmpeg imagemagick ghostscript (Ubuntu).
  2. Follow the installation instructions on the Caffe Installation page.
  3. Make sure you build the Python bindings by running make pycaffe (in Caffe's directory).
  4. Set the environment variable CAFFE_ROOT to point to Caffe's directory: export CAFFE_ROOT=[Caffe's directory].
  5. Install thingscoop: easy_install thingscoop or pip install thingscoop.

Usage

thingscoop search <query> <files...>

Print the start and end times (in seconds) of the regions in <files> that match <query>. Creates an index for each file using the current model if one does not already exist.

Example output:

$ thingscoop search violin waking_life.mp4
/Users/anastasis/Downloads/waking_life.mp4 148.000000 162.000000
/Users/anastasis/Downloads/waking_life.mp4 176.000000 179.000000
/Users/anastasis/Downloads/waking_life.mp4 180.000000 186.000000
/Users/anastasis/Downloads/waking_life.mp4 189.000000 190.000000
/Users/anastasis/Downloads/waking_life.mp4 192.000000 200.000000
/Users/anastasis/Downloads/waking_life.mp4 211.000000 212.000000
/Users/anastasis/Downloads/waking_life.mp4 222.000000 223.000000
/Users/anastasis/Downloads/waking_life.mp4 235.000000 243.000000
/Users/anastasis/Downloads/waking_life.mp4 247.000000 249.000000
/Users/anastasis/Downloads/waking_life.mp4 251.000000 253.000000
/Users/anastasis/Downloads/waking_life.mp4 254.000000 258.000000
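
Ranges like the above can be produced by merging consecutive matching seconds into intervals. A rough sketch of that merging step (assuming one classified frame per second; not thingscoop's actual code):

```python
def matching_regions(seconds):
    """Merge a sorted list of matching timestamps (whole seconds) into
    contiguous (start, end) intervals, like the ranges that
    `thingscoop search` prints. Illustrative sketch only."""
    regions = []
    for sec in seconds:
        if regions and sec == regions[-1][1]:
            # Extends the current region by one second
            regions[-1][1] = sec + 1
        else:
            # Starts a new one-second region
            regions.append([sec, sec + 1])
    return [tuple(r) for r in regions]
```

For instance, seconds 148 through 161 collapse into the single region (148, 162) seen in the first output line above.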

thingscoop filter <query> <files...>

Generate a video compilation of the regions in <files> that match <query>. Creates an index for each file using the current model if one does not already exist.
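
One way such a supercut could be assembled is to cut each matching region to a clip and join the clips with ffmpeg's concat demuxer. A hypothetical sketch that only builds the command lines (thingscoop's own implementation may differ):

```python
def supercut_commands(path, regions, output='supercut.mp4'):
    """Build ffmpeg command lines: one cut per matching (start, end)
    region, then a concat step. Hypothetical sketch, not thingscoop's
    actual filter implementation."""
    cmds = []
    for i, (start, end) in enumerate(regions):
        cmds.append(['ffmpeg', '-ss', str(start), '-i', path,
                     '-t', str(end - start), '-c', 'copy',
                     'clip_%03d.mp4' % i])
    # The concat demuxer reads a text file listing the clips in order
    listing = '\n'.join("file 'clip_%03d.mp4'" % i
                        for i in range(len(regions)))
    cmds.append(['ffmpeg', '-f', 'concat', '-i', 'clips.txt',
                 '-c', 'copy', output])
    return listing, cmds
```

Stream-copying (-c copy) is fast but cuts on keyframes; re-encoding would give frame-accurate cuts at the cost of speed.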

thingscoop sort <file>

Create a compilation video showing examples of every label recognized in the video (in alphabetical order). Creates an index for <file> using the current model if it does not exist.

thingscoop describe <file>

Print every label that appears in <file> along with the number of times it appears. Creates an index for <file> using the current model if it does not exist.

$ thingscoop describe koyaanisqatsi.mp4 -m googlenet_places
sky 405
skyscraper 363
canyon 141
office_building 130
highway 78
lighthouse 66
hospital 64
desert 59
shower 49
volcano 45
underwater 44
airport_terminal 43
fountain 39
runway 36
assembly_line 35
aquarium 34
fire_escape 34
music_studio 32
bar 28
amusement_park 28
stage 26
wheat_field 25
butchers_shop 25
engine_room 24
slum 20
butte 20
igloo 20
...etc

thingscoop preview <file>

Create a window that plays the input video <file> while also displaying the labels the model recognizes on every frame.

thingscoop index <file>

Create an index for <file> using the current model if it does not exist.
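
Since version 0.2 the index files are JSON (see the changelog below). Conceptually, an index maps timestamps to the labels whose confidence clears a threshold. A sketch of that idea, where classify_frame stands in for the CNN and the threshold and on-disk format are illustrative assumptions, not thingscoop's real API:

```python
import json

def build_index(frames, classify_frame, min_confidence=0.25,
                path='video_index.json'):
    """Classify one frame per second and record labels above a
    confidence threshold. `classify_frame`, the default threshold, and
    the format here are assumptions for illustration."""
    index = {}
    for second, frame in frames:
        labels = [label for label, prob in classify_frame(frame)
                  if prob >= min_confidence]
        if labels:
            index[second] = labels
    with open(path, 'w') as f:
        json.dump(index, f)
    return index
```

The -s/--sample-rate and -c/--min-confidence options listed below correspond to the sampling and thresholding steps in this sketch.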

thingscoop models list

List all models currently available in Thingscoop.

$ thingscoop models list
googlenet_imagenet            Model described in the paper "Going Deeper with Convolutions" trained on the ImageNet database
googlenet_places              Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
vgg_imagenet                  16-layer model described in the paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets" trained on the ImageNet database

thingscoop models info <model>

Print more detailed information about <model>.

$ thingscoop models info googlenet_places
Name: googlenet_places
Description: Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
Dataset: MIT Places

thingscoop models freeze

List all models that have already been downloaded.

$ thingscoop models freeze
googlenet_places
vgg_imagenet

thingscoop models current

Print the model that is currently in use.

$ thingscoop models current
googlenet_places

thingscoop models use <model>

Set the current model to <model>. Downloads that model locally if it hasn't been downloaded already.

thingscoop models download <model>

Download the model <model> locally.

thingscoop models remove <model>

Remove the locally stored model <model>.

thingscoop models clear

Remove all models stored locally.

thingscoop labels list

Print all the labels used by the current model.

$ thingscoop labels list
abacus
abaya
abstraction
academic gown
accessory
accordion
acorn
acorn squash
acoustic guitar
act
actinic radiation
action
activity
adhesive bandage
adjudicator
administrative district
admiral
adornment
adventurer
advocate
...

thingscoop labels search <regexp>

Print all the labels supported by the current model that match the regular expression <regexp>.

$ thingscoop labels search instrument$
beating-reed instrument
bowed stringed instrument
double-reed instrument
free-reed instrument
instrument
keyboard instrument
measuring instrument
medical instrument
musical instrument
navigational instrument
negotiable instrument
optical instrument
percussion instrument
scientific instrument
stringed instrument
surveying instrument
wind instrument
...
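
Judging by the output above, matching appears to behave like re.search, so instrument$ anchors at the end of the label. A minimal sketch of this filtering (not the actual implementation):

```python
import re

def search_labels(pattern, labels):
    """Return, sorted, the labels matched anywhere by the regular
    expression, mirroring `thingscoop labels search`. Sketch only."""
    rx = re.compile(pattern)
    return sorted(label for label in labels if rx.search(label))
```

Note that 'instrumental' would not match 'instrument$', since the anchor requires the match to end the label.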

Full usage options

thingscoop - Command-line utility for searching and filtering videos based on their content

Usage:
  thingscoop filter <query> <files>... [-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode] [--open]
  thingscoop search <query> <files>... [-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode] 
  thingscoop describe <file> [-n <words>] [-m <model>] [--recreate-index] [--gpu-mode] [-c <mc>]
  thingscoop index <files> [-m <model>] [-s <sr>] [-c <mc>] [-r <ocr>] [--recreate-index] [--gpu-mode] 
  thingscoop sort <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>] [--max-section-length <ms>] [-i <ignore>] [--open]
  thingscoop preview <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>]
  thingscoop labels list [-m <model>]
  thingscoop labels search <regexp> [-m <model>]
  thingscoop models list
  thingscoop models info <model>
  thingscoop models freeze
  thingscoop models current
  thingscoop models use <model>
  thingscoop models download <model>
  thingscoop models remove <model>
  thingscoop models clear

Options:
  --version                       Show version.
  -h --help                       Show this screen.
  -o --output <dst>               Output file for supercut
  -s --sample-rate <sr>           How many frames to classify per second (default = 1)
  -c --min-confidence <mc>        Minimum prediction confidence required to consider a label (default depends on model)
  -m --model <model>              Model to use (use 'thingscoop models list' to see all available models)
  -n --number-of-words <words>    Number of words to describe the video with (default = 5)
  -t --max-section-length <ms>    Max number of seconds to show examples of a label in the sorted video (default = 5)
  -r --min-occurrences <ocr>      Minimum number of occurrences of a label in video required for it to be shown in the sorted video (default = 2)
  -i --ignore-labels <labels>     Labels to ignore when creating the sorted video
  --title <title>                 Title to show at the beginning of the video (sort mode only)
  --gpu-mode                      Enable GPU mode
  --recreate-index                Recreate object index for file if it already exists
  --open                          Open filtered video after creating it (OS X only)

CHANGELOG

0.2 (8/16/2015)

  • Added sort option for creating a video compilation of all labels appearing in a video
  • Now using JSON for the index files

0.1 (8/5/2015)

  • Conception

License

MIT

Owner
Anastasis Germanidis