Analyzing basic network responses to novel classes

Overview

novelty-detection

Analyzing how AlexNet responds to novel classes with varying degrees of similarity to pretrained classes from ImageNet.

If you find this work helpful in your research, please cite:

Eshed, N. (2020). Novelty detection and analysis in convolutional neural networks (Accession No. 27994027)[Master's thesis, Cornell University]. ProQuest Dissertations & Theses Global.

@mastersthesis{eshed_novelty_detection,
  author={Noam Eshed},
  title={Novelty detection and analysis in convolutional neural networks},
  school={Cornell University},
  year={2020},
  publisher={ProQuest Dissertations & Theses Global}
}

Data

in_out_class.csv

This is hand-annotated data from iNaturalist. The most up-to-date version can be found here The data taken directly from iNaturalist includes the biological groups and scientific names of natural things. Annotators included the common English name(s) for each creature, their relation to ImageNet, any relevant notes, and their initials. For details regarding annotation guidelines, see this link.

alexnet_inat_results/

inat_results_top_choice.json

This json file contains the results from testing a pre-trained AlexNet (trained on ImageNet) on images from iNaturalist. It only includes the top one result (i.e. the label chosen by the network) for each image in iNaturalist, and so is most efficient when looking into the distribution of labels chosen for a certain type of creature.

Biological group files

Each of these folders contains all of the results of testing a pre-trained AlexNet (trained on ImageNet) on images from iNaturalist in the given biological group. This includes all possible labels, their scores, and their confidence values for each image. Since ImageNet has 1000 classes, that means that each image in iNaturalist has 3 vectors of length 1000 to store the label, score, and confidence value information. Each of the files within these folders contains the data for a single species within the given biological group

Code

class_in_or_out.py

This script plots the distribution of the top n CNN labels for all (or part) of the image data. Looking at all species of interest, it averages the frequency of the top n labels. Note that the top n labels are not necessarily in the same order for each species, and so the labels themselves are ignored.

The species each fall under one of four annotated ImageNet relationship categories: in ImageNet, not in ImageNet, parent in ImageNet, and relative in Imagenet. These annotations are taken from in_out_class.csv. The plots may be stratified by these relationship categories.

As an example, this code can plot the frequency of the top 10 labels over all bird images, and split by the species' relationship to Imagenet. The resulting plot will show the average distribution of label frequencies. The top label frequency, for example, is the frequency of the top occuring label over all images averaged over a given species, regardless of what that top label actually was.

This plot shows the frequency of the top 20 labels over all bird species in iNaturalist:

Bird Label Frequencies

plot_result_distribution.py

This script plots the distribution of CNN labels over each species. It does so by counting the number of occurrences of each label over many images of that species and normalizing the result to get a frequency distribution rather than an occurrence count distribution. There is an option to color and label each point according to the average confidence of the label. This can help us understand what common mistakes the network makes when classifying images of a given species.

In this example plot, we can see the distribution of all labels guessed by the network in the set of African Penguin images. It shows that approximately 19% of the images are classified as magpie, 19% as goose, etc. Interestingly, the king_penguin label is only awarded to 5% of the images and is tied for the 5th most common label.

African Penguin Distribution

alexnet_novelty.py

This script tests AlexNet (pretrained on ImageNet) on all of the data from iNaturalist and saves the result into the alexnet_inat_results/ folder.

Owner
Noam Eshed
Noam Eshed
Viperdb - A tiny log-structured key-value database written in pure Python

ViperDB 🐍 ViperDB is a lightweight embedded key-value store written in pure Pyt

17 Oct 17, 2022
Patch-Diffusion Code (AAAI2022)

Patch-Diffusion This is an official PyTorch implementation of "Patch Diffusion: A General Module for Face Manipulation Detection" in AAAI2022. Require

H 7 Nov 02, 2022
Official implementation of "Articulation Aware Canonical Surface Mapping"

Articulation-Aware Canonical Surface Mapping Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani Paper Project Page Requirements Python

Nilesh Kulkarni 56 Dec 16, 2022
Rank 3 : Source code for OPPO 6G Data Generation Challenge

OPPO 6G Data Generation with an E2E Framework Homepage of OPPO 6G Data Generation Challenge Datasets H1_32T4R.mat H2_32T4R.mat Please put the original

Sen Pei 97 Jan 07, 2023
A fast and easy to use, moddable, Python based Minecraft server!

PyMine PyMine - The fastest, easiest to use, Python-based Minecraft Server! Features Note: This list is not always up to date, and doesn't contain all

PyMine 144 Dec 30, 2022
Spam your friends and famly and when you do your famly will disown you and you will have no friends.

SpamBot9000 Spam your friends and family and when you do your family will disown you and you will have no friends. Terms of Use Disclaimer: Please onl

DJ15 0 Jun 09, 2022
The hippynn python package - a modular library for atomistic machine learning with pytorch.

The hippynn python package - a modular library for atomistic machine learning with pytorch. We aim to provide a powerful library for the training of a

Los Alamos National Laboratory 37 Dec 29, 2022
Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper

by Matyáš Boháček and Marek Hrúz, University of West Bohemia Should you have any questions or inquiries, feel free to contact us here. Repository acco

Matyáš Boháček 30 Dec 30, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
MIMIC Code Repository: Code shared by the research community for the MIMIC-III database

MIMIC Code Repository The MIMIC Code Repository is intended to be a central hub for sharing, refining, and reusing code used for analysis of the MIMIC

MIT Laboratory for Computational Physiology 1.8k Dec 26, 2022
Deep Inertial Prediction (DIPr)

Deep Inertial Prediction For more information and context related to this repo, please refer to our website. Getting Started (non Docker) Note: you wi

Arcturus Industries 12 Nov 11, 2022
git《Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser》(2021) GitHub: [fig5]

Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser Abstract The success of deep denoisers on real-world colo

Yue Cao 51 Nov 22, 2022
这是一个unet-pytorch的源码,可以训练自己的模型

Unet:U-Net: Convolutional Networks for Biomedical Image Segmentation目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Downl

Bubbliiiing 567 Jan 05, 2023
Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Demetri Pananos 9 Oct 04, 2022
Title: Heart-Failure-Classification

This Notebook is based off an open source dataset available on where I have created models to classify patients who can potentially witness heart failure on the basis of various parameters. The best

Akarsh Singh 2 Sep 13, 2022
A toolkit for document-level event extraction, containing some SOTA model implementations

❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi

Tong Zhu(朱桐) 159 Dec 22, 2022
This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation.

ERFNet This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation. NEW!! New PyTorch

Edu 104 Jan 05, 2023
An implementation for `Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction`

Text2Event An implementation for Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction Please contact Yaojie Lu (@

Roger 153 Jan 07, 2023
This repository contains small projects related to Neural Networks and Deep Learning in general.

ILearnDeepLearning.py Description People say that nothing develops and teaches you like getting your hands dirty. This repository contains small proje

Piotr Skalski 1.2k Dec 22, 2022