A computer vision pipeline to identify the "icons" in Christian paintings

Overview

Christian-Iconography

Open In Colab Screenshot from 2022-01-08 18-26-30

A computer vision pipeline to identify the "icons" in Christian paintings.

A bit about iconography.

Iconography is related to identifying the subject itself in the image. So, for instance when I say Christian Iconography I would mean that I am trying to identify some objects like crucifix or mainly in this project the saints!

Inspiration

I was looking for some interesting problem to solve and I came across RedHenLab's barnyard of projects and it had some really wonderful ideas there and this particular one intrigued me. On the site they didn't have much progress on it as the datasets were not developed on this subject but after surfing around I found something and just like that I got started!

Dataset used.

The project uses the ArtDL dataset which contains 42,479 images of artworks portraying Christian saints, divided in 10 classes: Saint Dominic (iconclass 11HH(DOMINIC)), Saint Francis of Assisi (iconclass 11H(FRANCIS)), Saint Jerome (iconclass 11H(JEROME)), Saint John the Baptist (iconclass 11H(JOHN THE BAPTIST)), Saint Anthony of Padua (iconclass 11H(ANTONY OF PADUA), Saint Mary Magdalene (iconclass 11HH(MARY MAGDALENE)), Saint Paul (iconclass 11H(PAUL)), Saint Peter (iconclass 11H(PETER)), Saint Sebastian (iconclass 11H(SEBASTIAN)) and Virgin Mary (iconclass 11F). All images are associated with high-level annotations specifying which iconography classes appear in them (from a minimum of 1 class to a maximum of 7 classes).

Sources

Screenshot from 2022-01-08 18-08-56

Preprocessing steps.

All the images were first padded so that the resolution is sort of intact when the image is resized. A dash of normalization and some horizontal flips and the dataset is ready to be eaten/trained on by our model xD.

Architecture used.

As mentioned the ArtDL dataset has around 43k images and hence training it completely wouldn't make sense. Hence a ResNet50 pretrained model was used.

But there is a twist.

Instead of just having the final classifying layer trained we only freeze the initial layer as it has gotten better at recognizing patterns from a lot of images it might have trained on. And then we fine-tune the deeper layers so that it learns the art after the initial abstraction. Another deviation is to replace the final linear layer by 1x1 conv layer to make the classification.

Quantiative Results.

Training

I trained the network for 10 epochs which took around 3 hours and used Stochastic Gradient Descent with LR=0.01 and momentum 0.9. The accuracy I got was 64% on the test set which can be further improved.

Classification Report

Screenshot from 2022-01-10 22-07-52

From the classification report it is clear that Saint MARY has the most number of samples in the training set and the precision for that is high. On the other hand other samples are low in number and hence their scores are low and hence we can't infer much except the fact that we need to oversample some of these classes so that we can gain more meaningful resuls w.r.t accuracy and of course these metrics as well

Qualitative Results

We try an image of Saint Dominic and see what our classifier is really learning.

Screenshot from 2022-01-10 22-10-37

Saliency Map

Screenshot from 2022-01-10 22-12-31

We can notice that regions around are more lighter than elsewhere which could mean that our classifier at least knows where to look :p

Guided-Backpropagation

Screenshot from 2022-01-10 22-14-26

So what really guided backprop does is that it points out the positve influences while classifiying an image. From this result we can see that it is really ignoring the padding applied and focussing more on the body and interesting enough the surroundings as well

Grad-CAM!

Screenshot from 2022-01-10 22-15-27

As expected the Grad-CAM when used shows the hot regions in our images and it is around the face and interesting enough the surrounding so maybe it could be that surroundings do have a role-play in type of saint?

Possible improvements.

  • Finding more datasets
  • Or working on the architecture maybe?
  • Using GANs to generate samples and make classifier stronger

Citations

@misc{milani2020data,
title={A Data Set and a Convolutional Model for Iconography Classification in Paintings},
author={Federico Milani and Piero Fraternali},
eprint={2010.11697},
archivePrefix={arXiv},
primaryClass={cs.CV},
year={2020}
}

RedhenLab's barnyard of projects

Owner
Rishab Mudliar
AKA Start At The Beginning.
Rishab Mudliar
TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

Microsoft 1.3k Dec 30, 2022
A booklet on machine learning systems design with exercises

Machine Learning Systems Design Read this booklet here. This booklet covers four main steps of designing a machine learning system: Project setup Data

Chip Huyen 7.6k Jan 08, 2023
Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

Reproduce ResNet-v2 using MXNet Requirements Install MXNet on a machine with CUDA GPU, and it's better also installed with cuDNN v5 Please fix the ran

Wei Wu 531 Dec 04, 2022
an implementation of softmax splatting for differentiable forward warping using PyTorch

softmax-splatting This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame I

Simon Niklaus 338 Dec 28, 2022
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least com

Nikolas B Virionis 2 Aug 01, 2022
Github project for Attention-guided Temporal Coherent Video Object Matting.

Attention-guided Temporal Coherent Video Object Matting This is the Github project for our paper Attention-guided Temporal Coherent Video Object Matti

71 Dec 19, 2022
Pytorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace, SG-HMC and more

Bayesian Neural Networks Pytorch implementations for the following approximate inference methods: Bayes by Backprop Bayes by Backprop + Local Reparame

1.4k Jan 07, 2023
Repository for the COLING 2020 paper "Explainable Automated Fact-Checking: A Survey."

Explainable Fact Checking: A Survey This repository and the accompanying webpage contain resources for the paper "Explainable Fact Checking: A Survey"

Neema Kotonya 42 Nov 17, 2022
Convert ONNX model graph to Keras model format.

Convert ONNX model graph to Keras model format.

Grigory Malivenko 175 Dec 28, 2022
WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking

WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking [Paper Link] Abstract In this work, we contribute a new million-scale Un

25 Jan 01, 2023
Transfer SemanticKITTI labeles into other dataset/sensor formats.

LiDAR-Transfer Transfer SemanticKITTI labeles into other dataset/sensor formats. Content Convert datasets (NUSCENES, FORD, NCLT) to KITTI format Minim

Photogrammetry & Robotics Bonn 64 Nov 21, 2022
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments [Project website] [Paper] This project is a PyTorch

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 49 Nov 28, 2022
Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

J K Terry 32 Nov 09, 2021
A faster pytorch implementation of faster r-cnn

A Faster Pytorch Implementation of Faster R-CNN Write at the beginning [05/29/2020] This repo was initaited about two years ago, developed as the firs

Jianwei Yang 7.1k Jan 01, 2023
Source code of the paper PatchGraph: In-hand tactile tracking with learned surface normals.

PatchGraph This repository contains the source code of the paper PatchGraph: In-hand tactile tracking with learned surface normals. Installation Creat

Paloma Sodhi 11 Dec 15, 2022
PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Shape-aware Convolutional Layer (ShapeConv) PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentatio

Hanchao Leng 82 Dec 29, 2022
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo

Ankush Malaker 5 Nov 11, 2022
PyTorch Implementation of PIXOR: Real-time 3D Object Detection from Point Clouds

PIXOR: Real-time 3D Object Detection from Point Clouds This is a custom implementation of the paper from Uber ATG using PyTorch 1.0. It represents the

Philip Huang 270 Dec 14, 2022
Implements Stacked-RNN in numpy and torch with manual forward and backward functions

Recurrent Neural Networks Implements simple recurrent network and a stacked recurrent network in numpy and torch respectively. Both flavours implement

Vishal R 1 Nov 16, 2021
Collision risk estimation using stochastic motion models

collision_risk_estimation Collision risk estimation using stochastic motion models. This is a new approach, based on stochastic models, to predict the

Unmesh 7 Jun 26, 2022