Pytorch ImageNet1k Loader with Bounding Boxes.

Last update: Oct 15, 2022

Related tags

Overview

ImageNet 1K Bounding Boxes

For some experiments, you might wanna pass only the background of imagenet images vs passing only the foreground. Here, I've included the code to extract the meta-data for the bounding box, cleaning up the the downloaded stuff, and then changing ImageNet Loader to support only the images that have box annotations.

How to use:

from costum_imagenet import BackgroundForegroundImageNet
tr = trans.Compose([trans.Resize(224), trans.CenterCrop(224), trans.ToTensor(), ])
dataset = BackgroundForegroundImageNet(root='./data/imagenet/train', download=True, transform=tr)
x, b, f, y = dataset[0]
torchvision.utils.save_image(torch.stack([x, b, f]), 'test1.png')

Example:

If you set the value download=True, the bounding boxes and the indices of imagenet train split that have the bounding boxes will be downloaded. But if for some reason you want to create your own bounding boxes from the scratch, here's the steps for doing it:

Restarting from the scratch

Downloading: First download the data from here:

wget "https://image-net.org/data/bboxes_annotations.tar.gz"

Extract the File:

tar -xvf bboxes_annotations.tar.gz

Extract every subfolder:

cd bboxes_annotations
ls | grep .tar.gz | while read f ; do tar -xvf "${f}" ; done

Convert dataset to JS:

python read_xml.py

Clean the extra 50GB extracted files:

rm *.tar.gz
ls | grep "n.*" | while read f ; do rm -rf "${f}"  ; done

Get Indices that have bounding boxes:

python get_indices.py

Then simply pass the path to the files boxes.pt and indices.pt to your BackgroundForegroundImageNet constructor

dataset = BackgroundForegroundImageNet(root='.', download=False, boxes='boxes.pt', indices='indices.pt')

You might also like...

This code finds bounding box of a single human mouth.

This code finds bounding box of a single human mouth. In comparison to other face segmentation methods, it is relatively insusceptible to open mouth conditions, e.g., yawning, surgical robots, etc. The mouth coordinates are found in a more certified way using two independent algorithms. Therefore, the algorithm can be used in more sensitive applications.

4 Nov 27, 2022

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression YOLOv5 with alpha-IoU losses implemented in PyTorch. Example r

147 Dec 5, 2022

Fast algorithms to compute an approximation of the minimal volume oriented bounding box of a point cloud in 3D.

ApproxMVBB Status Build UnitTests Homepage Fast algorithms to compute an approximation of the minimal volume oriented bounding box of a point cloud in

390 Dec 31, 2022

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.

305 Dec 16, 2022

Improving Object Detection by Estimating Bounding Box Quality Accurately

Pytorch ImageNet1k Loader with Bounding Boxes.

Related tags

Overview

ImageNet 1K Bounding Boxes

How to use:

Example:

Restarting from the scratch

You might also like...

This code finds bounding box of a single human mouth.

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

Fast algorithms to compute an approximation of the minimal volume oriented bounding box of a point cloud in 3D.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Improving Object Detection by Estimating Bounding Box Quality Accurately

LQM - Improving Object Detection by Estimating Bounding Box Quality Accurately

An essential implementation of BYOL in PyTorch + PyTorch Lightning

RealFormer-Pytorch Implementation of RealFormer using pytorch

Generic template to bootstrap your PyTorch project with PyTorch Lightning, Hydra, W&B, and DVC.

Releases(files)

files(Jan 23, 2022)

Owner

Amin Ghiasi

Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

(EI 2022) Controllable Confidence-Based Image Denoising

How to train a CNN to 99% accuracy on MNIST in less than a second on a laptop

Technical experimentations to beat the stock market using deep learning :chart_with_upwards_trend:

A unet implementation for Image semantic segmentation

Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

Convex optimization for fun and profit.

A Dataset for Direct Quotation Extraction and Attribution in News Articles.

This was initially the repo for the project of [email protected] of Asaf Mazar, Millad Kassaie and Georgios Chochlakis named "Powered by the Will? Exploring Lay Theories of Behavior Change through Social Media"

Pytorch reimplementation of PSM-Net: "Pyramid Stereo Matching Network"

Code for the ICML 2021 paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

COVID-VIT: Classification of Covid-19 from CT chest images based on vision transformer models

implement of SwiftNet:Real-time Video Object Segmentation

Benchmarks for Object Detection in Aerial Images

Jremesh-tools - Blender addon for quad remeshing

💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

Official Repository for our ICCV2021 paper: Continual Learning on Noisy Data Streams via Self-Purified Replay