7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

Overview

kaggle-hpa-2021-7th-place-solution

Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle.

A description of the method can be found in this post in the kaggle discussion.

Dataset Preparation

Resize Images

# Resize train images to 768x768
python scripts/hap_segmenter/create_cell_mask.py resize_image \
    --input_directory data/input/hpa-single-cell-image-classification.zip/train \
    --output_directory data/input/hpa-768768.zip \
    --image_size 768
# Resize train images to 1536x1536
python scripts/hap_segmenter/create_cell_mask.py resize_image \
    --input_directory data/input/hpa-single-cell-image-classification.zip/train \
    --output_directory data/input/hpa-1536.zip \
    --image_size 1536

# Resize test images to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-768-test.zip \
    --image_size 768
# Resize test images to 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-1536-test.zip \
    --image_size 1536

You can specify a directory in a zip file in the same way as a normal directory.

Download Public HPA

Download all images in kaggle_2021.tsv in this dataset, resize them into 768x768 and 1536x1536, and archive them as data/input/hpa-public-768.zip and data/input/hpa-public-1536.zip.

Create Cell Mask

# Create cell masks for the Kaggle train set with 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory data/input/hpa-1536.zip \
    --output_directory data/input/hpa-1536-mask-v2.zip \
    --label_cell_scale_factor 1.0

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-1536-mask-v2.zip \
    --output_directory data/input/hpa-768-mask-v2-from-1536.zip \
    --image_size 768

# Create cell masks for the Public HPA dataset with 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory data/input/hpa-public-1536.zip/hpa-public-1536 \
    --output_directory data/input/hpa-public-1536-mask-v2.zip \
    --label_cell_scale_factor 1.0

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-public-1536-mask-v2.zip \
    --output_directory data/input/hpa-public-768-mask-v2-from-1536.zip \
    --image_size 768

# Create cell masks for the test set with the original resolution
# Run with `--label_cell_scale_factor = 0.5` to save inference time
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-test-mask-v2.zip \
    --label_cell_scale_factor 0.5

# Resize the masks to 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-test-mask-v2.zip \
    --output_directory data/input/hpa-test-mask-v2-1536.zip \
    --image_size 1536

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-test-mask-v2.zip \
    --output_directory data/input/hpa-test-mask-v2-768.zip \
    --image_size 768

Create Input for Cell-level Classifier

# Create cell-level inputs for the Kaggle train set using 768x768 images as fixed scale image.
python scripts/hap_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-768768.zip \
    --cell_mask_directory data/input/hpa-768-mask-v2-from-1536.zip \
    --output_directory data/input/hpa-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the Public HPA dataset using 768x768 images as fixed scale image.
python scripts/hap_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-public-768.zip \
    --cell_mask_directory data/input/hpa-public-768-mask-v2-from-1536.zip \
    --output_directory data/input/hpa-public-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the Kaggle train set using 1536x1536 images as fixed scale image.
python scripts/hap_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-1536.zip \
    --cell_mask_directory data/input/hpa-1536-mask-v2.zip \
    --output_directory data/input/hpa-cell-crop-v2-192-from-1536.zip \
    --image_size 192

# Create cell-level inputs for the Public HPA dataset using 1536x1536 images as fixed scale image.
python scripts/hap_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-public-1536.zip \
    --cell_mask_directory data/input/hpa-public-1536-mask-v2.zip \
    --output_directory data/input/hpa-public-cell-crop-v2-192-from-1536.zip \
    --image_size 192

# Create cell-level inputs for the test set using 768x768 images as fixed scale image.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-768768-test.zip \
    --cell_mask_directory data/input/hpa-test-mask-v2-768.zip \
    --output_directory data/input/hpa-test-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the test set using 1536x1536 images as fixed scale image.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-1536-test.zip \
    --cell_mask_directory data/input/hpa-test-mask-v2-1536.zip \
    --output_directory data/input/hpa-test-cell-crop-v2-192-from-1536.zip \
    --image_size 192

Training

# Train image-level classifier
python scripts/cam_consistency_training/run.py train \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Train cell-level classifier
python scripts/cell_crop/run.py train \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

If you want to train on multiple GPUs, use a launcher like torch.distributed.launch and pass --local_rank option. You can override the fields in the config by passing an argument like field_name=${value} (e.g. fold_index=1). We trained 5 folds for all models used in the final submission pipeline. The config files are located in scripts/cam_consistency_training/configs and scripts/cell_crop/configs. We trained the models in the following order.

  1. scripts/cam_consistency_training/configs/eff-b2-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  2. scripts/cam_consistency_training/configs/eff-b5-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  3. scripts/cam_consistency_training/configs/eff-b7-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  4. scripts/cam_consistency_training/configs/eff-b2-cutmix-pubhpa-768-to-1536.yaml
  5. Do predict_valid and concat_valid_predictions (described below) for each model and save the average of the output files under data/working/consistency_training/b2-1536-b2-b5-b7-768-avg/.
  6. scripts/cam_consistency_training/configs/eff-b2-focal-stage2-b2b2b5b7avg.yaml
  7. scripts/cell_crop/configs/resnest50-bce-from768-cutmix-softpl.yaml
  8. Do predict_valid and concat_valid_predictions for each model and save the average of the output files under data/working/image-level-and-cell-crop-both-5folds/.
  9. scripts/cam_consistency_training/configs/eff-b2-focal-stage3.yaml
  10. scripts/cam_consistency_training/configs/eff-b2-focal-stage3-cos.yaml
  11. scripts/cell_crop/configs/resnest50-bce-from768-stage3.yaml
  12. scripts/cell_crop/configs/resnest50-bce-from1536-stage3-cos.yaml

Inference

Validation Set

# Image-level classifier inference
python scripts/cam_consistency_training/run.py predict_valid \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Cell-level classifier inference
python scripts/cell_crop/run.py predict_valid \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

# Concatenate the predictions for each fold to obtain the OOF prediction for the entire training data
python scripts/cam_consistency_training/run.py concat_valid_predictions \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml
python scripts/cell_crop/run.py concat_valid_predictions \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

Test Set

# Image-level classifier inference
python scripts/cam_consistency_training/run.py predict_test \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Cell-level classifier inference
python scripts/cell_crop/run.py predict_test \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

# Make our final submission with post-processing
python scripts/average_predictions.py \
    --orig_size_cell_mask_directory data/input/hpa-test-mask-v2.zip \
    "data/working/consistency_training/eff-b2-focal-stage3/0" \
    "data/working/consistency_training/eff-b2-focal-stage3/1" \
    "data/working/consistency_training/eff-b2-focal-stage3/2" \
    "data/working/consistency_training/eff-b2-focal-stage3/3" \
    "data/working/consistency_training/eff-b2-focal-stage3/4" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/0" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/1" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/2" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/3" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/4" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/0" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/1" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/2" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/3" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/4" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/0" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/1" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/2" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/3" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/4" \
    --edge_area_threshold 80000 --center_area_threshold 32000

Use the code on Kaggle Notebook

Use docker to zip the source code and the wheels of the dependencies and upload them as a dataset.

docker run --rm -it -v /path/to/this/repo:/tmp/workspace -w /tmp/workspace/ gcr.io/kaggle-images/python bash ./build_zip.sh

In Kaggle Notebook, when you copy the code as shown below, you can run it the same way as your local environment.

# Make a working directory
!mkdir -p /kaggle/tmp

# Change the current directory
cd /kaggle/tmp

# Copy source code from the uploaded dataset
!cp -r /kaggle/input/<your-dataset-name>/* .

# You can use it as well as local environment
!python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask ...
Deep Learning Specialization by Andrew Ng, deeplearning.ai.

Deep Learning Specialization on Coursera Master Deep Learning, and Break into AI This is my personal projects for the course. The course covers deep l

Engen 1.5k Jan 07, 2023
PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning PyTorch code for our ACL 2020 paper "MART: Memory-Augmented Recur

Jie Lei 雷杰 151 Jan 06, 2023
Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Kai Zhang 1.2k Dec 29, 2022
MERLOT: Multimodal Neural Script Knowledge Models

merlot MERLOT: Multimodal Neural Script Knowledge Models MERLOT is a model for learning what we are calling "neural script knowledge" -- representatio

Rowan Zellers 190 Dec 22, 2022
VM3000 Microphones

VM3000-Microphones This project was completed by Ricky Leman under the supervision of Dr Ben Travaglione and Professor Melinda Hodkiewicz as part of t

UWA System Health Lab 0 Jun 04, 2021
Code to produce syntactic representations that can be used to study syntax processing in the human brain

Can fMRI reveal the representation of syntactic structure in the brain? The code base for our paper on understanding syntactic representations in the

Aniketh Janardhan Reddy 4 Dec 18, 2022
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition

🎵 MuSiQue: Multi-hop Questions via Single-hop Question Composition This is the repository for our paper "MuSiQue: Multi-hop Questions via Single-hop

21 Jan 02, 2023
Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python

deepface Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python. It is a hybrid

Kushal Shingote 2 Feb 10, 2022
A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

jedibobo 3 Dec 28, 2022
Highway networks implemented in PyTorch.

PyTorch Highway Networks Highway networks implemented in PyTorch. Just the MNIST example from PyTorch hacked to work with Highway layers. Todo Make th

Conner Vercellino 56 Dec 14, 2022
Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling

RHGN Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling Dependencies torch==1.6.0 torchvision==0.7.0 dgl==0.7.1

Big Data and Multi-modal Computing Group, CRIPAC 6 Nov 29, 2022
FairMOT for Multi-Class MOT using YOLOX as Detector

FairMOT-X Project Overview FairMOT-X is a multi-class multi object tracker, which has been tailored for training on the BDD100K MOT Dataset. It makes

Jonathan Tan 33 Dec 28, 2022
Code I use to automatically update my videos' metadata on YouTube

mCodingYouTube This repository contains the code I use to automatically update my videos' metadata on YouTube, including: titles, descriptions, tags,

James Murphy 19 Oct 07, 2022
Py4fi2nd - Jupyter Notebooks and code for Python for Finance (2nd ed., O'Reilly) by Yves Hilpisch.

Python for Finance (2nd ed., O'Reilly) This repository provides all Python codes and Jupyter Notebooks of the book Python for Finance -- Mastering Dat

Yves Hilpisch 1k Jan 05, 2023
The official repository for "Score Transformer: Generating Musical Scores from Note-level Representation" (MMAsia '21)

Score Transformer This is the official repository for "Score Transformer": Score Transformer: Generating Musical Scores from Note-level Representation

22 Dec 22, 2022
Semantic Segmentation in Pytorch

PyTorch Semantic Segmentation Introduction This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to

Hengshuang Zhao 1.2k Jan 01, 2023
Large-scale Hyperspectral Image Clustering Using Contrastive Learning, CIKM 21 Workshop

Spectral-spatial contrastive clustering (SSCC) Yaoming Cai, Yan Liu, Zijia Zhang, Zhihua Cai, and Xiaobo Liu, Large-scale Hyperspectral Image Clusteri

Yaoming Cai 4 Nov 02, 2022
STARCH compuets regional extreme storm physical characteristics and moisture balance based on spatiotemporal precipitation data from reanalysis or climate model data.

STARCH (Storm Tracking And Regional CHaracterization) STARCH computes regional extreme storm physical and moisture balance characteristics based on sp

Onosama 7 Oct 20, 2022
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

29 Oct 01, 2022
[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver

Shape As Points (SAP) Paper | Project Page | Short Video (6 min) | Long Video (12 min) This repository contains the implementation of the paper: Shape

394 Dec 30, 2022