A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Overview

Duplicate Image Detection

Getting Started

  • Install dependencies pip install -r requirements.txt
  • Run service python main.py

Testing

  • Test with pytest

How it Works

alt text

This system uses a perceptual hashing function, similar to Apple's CSAM Detection. Instead of generating image hashes using NeuralHash, it uses a difference hash (dHash), which is simpler and less computationally intensive as it doesn't require neural networks. Since we don't have the same privacy constraints as Apple, we will be using nearest neighbor searches to identify duplicate images.

Difference Hash

dHash is a perceptual hashing function that produces hash values that are resilient to image scaling, as well as changes in color, brightness, and aspect ratio [1]. There are 4 main steps for creating a difference hash for an image:

alt text

  1. Convert to greyscale*
  2. Resize image to (hash_size+1, hash_size)
  3. Calculate horizontal gradient, reducing image size to (hash_size, hash_size)
  4. Assign bits based on horizontal gradient values

*We convert the image to greyscale before resizing for optimal performance

Nearest Neighbors

Image hashes that we want to check for duplicates against will be stored in a binary index for fast and efficient nearest neighbor searches. We will use Hamming distance as a metric to determine the similarity between image hashes, for dHash, distances less than 10 (96.09% similarity) likely indicate similar/duplicate images [1].

References

[1] https://www.hackerfactor.com/blog/?/archives/529-Kind-of-Like-That.html

[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach This is the repo to host the dataset TextSeg and code for TexRNe

SHI Lab 174 Dec 19, 2022
Compare neural networks by their feature similarity

PyTorch Model Compare A tiny package to compare two neural networks in PyTorch. There are many ways to compare two neural networks, but one robust and

Anand Krishnamoorthy 181 Jan 04, 2023
NeuralDiff: Segmenting 3D objects that move in egocentric videos

NeuralDiff: Segmenting 3D objects that move in egocentric videos Project Page | Paper + Supplementary | Video About This repository contains the offic

Vadim Tschernezki 14 Dec 05, 2022
Code for "Learning Graph Cellular Automata"

Learning Graph Cellular Automata This code implements the experiments from the NeurIPS 2021 paper: "Learning Graph Cellular Automata" Daniele Grattaro

Daniele Grattarola 37 Oct 26, 2022
AI pipelines for Nvidia Jetson Platform

Jetson Multicamera Pipelines Easy-to-use realtime CV/AI pipelines for Nvidia Jetson Platform. This project: Builds a typical multi-camera pipeline, i.

NVIDIA AI IOT 96 Dec 23, 2022
A Deep Learning Framework for Neural Derivative Hedging

NNHedge NNHedge is a PyTorch based framework for Neural Derivative Hedging. The following repository was implemented to ease the experiments of our pa

GUIJIN SON 17 Nov 14, 2022
Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021.

NL-CSNet-Pytorch Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021. Note: this repo only shows the strategy of

WenxueCui 7 Nov 07, 2022
Pytorch implementation of SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation

SenFormer: Efficient Self-Ensemble Framework for Semantic Segmentation Efficient Self-Ensemble Framework for Semantic Segmentation by Walid Bousselham

61 Dec 26, 2022
Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

ZLH 406 Dec 23, 2022
PyTorch implementation of the implicit Q-learning algorithm (IQL)

Implicit-Q-Learning (IQL) PyTorch implementation of the implicit Q-learning algorithm IQL (Paper) Currently only implemented for online learning. Offl

Sebastian Dittert 27 Dec 30, 2022
[CVPR 2021] Few-shot 3D Point Cloud Semantic Segmentation

Few-shot 3D Point Cloud Semantic Segmentation Created by Na Zhao from National University of Singapore Introduction This repository contains the PyTor

117 Dec 27, 2022
This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.

Pytorch Medical Segmentation Read Chinese Introduction:Here! Recent Updates 2021.1.8 The train and test codes are released. 2021.2.6 A bug in dice was

EasyCV-Ellis 618 Dec 27, 2022
Entity-Based Knowledge Conflicts in Question Answering.

Entity-Based Knowledge Conflicts in Question Answering Run Instructions | Paper | Citation | License This repository provides the Substitution Framewo

Apple 35 Oct 19, 2022
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Build Type Linux MacOS Windows Build Status OpenPose has represented the first real-time multi-person system to jointly detect human body, hand, facia

25.7k Jan 09, 2023
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 83 Dec 14, 2022
A project that uses optical flow and machine learning to detect aimhacking in video clips.

waldo-anticheat A project that aims to use optical flow and machine learning to visually detect cheating or hacking in video clips from fps games. Che

waldo.vision 542 Dec 03, 2022
A deep learning object detector framework written in Python for supporting Land Search and Rescue Missions.

AIR: Aerial Inspection RetinaNet for supporting Land Search and Rescue Missions AIR is a deep learning based object detection solution to automate the

Accenture 13 Dec 22, 2022
Official implementation of Rethinking Graph Neural Architecture Search from Message-passing (CVPR2021)

Rethinking Graph Neural Architecture Search from Message-passing Intro The GNAS can automatically learn better architecture with the optimal depth of

Shaofei Cai 48 Sep 30, 2022
This is the repository for our paper Ditch the Gold Standard: Re-evaluating Conversational Question Answering

Ditch the Gold Standard: Re-evaluating Conversational Question Answering This is the repository for our paper Ditch the Gold Standard: Re-evaluating C

Princeton Natural Language Processing 38 Dec 16, 2022
Code image classification of MNIST dataset using different architectures: simple linear NN, autoencoder, and highway network

Deep Learning for image classification pip install -r http://webia.lip6.fr/~baskiotisn/requirements-amal.txt Train an autoencoder python3 train_auto

Hector Kohler 0 Mar 30, 2022