
TestRank in PyTorch

Code for the paper TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks by Yu Li, Min Li, Qiuxia Lai, Yannan Liu, and Qiang Xu.

If you find this repository useful for your work, please consider citing it as follows:

@article{yu2021testrank,
  title={TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks},
  author={Li, Yu and Li, Min and Lai, Qiuxia and Liu, Yannan and Xu, Qiang},
  journal={NeurIPS},
  year={2021}
}

1. Setup

Install dependencies

conda env create -f environment.yml

Please run the code on a GPU.
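
A quick sanity check before running experiments (the environment name 'testrank' below is only an assumption; use the name given in the 'name' field of environment.yml):

  conda activate testrank                                      # activate the environment created above
  python -c "import torch; print(torch.cuda.is_available())"   # should print True on a GPU machine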

2. Running

There are three main steps involved:

  • Prepare the DL models to be tested
  • Prepare the unsupervised BYOL feature extractor
  • Launch a specific test input prioritization technique

We illustrate these steps below.

2.1. Download the Pre-trained DL model under test

Please download the classifiers to the corresponding folder: ./checkpoint/{dataset}/ckpt_bias/

If you want to train your own classifiers, please refer to the Training part.

2.2. Download the Feature extractor

We prepare a pretrained feature extractor for each dataset (e.g. CIFAR-10, SVHN, STL10). Please put the downloaded file in the "./ckpt_byol/" folder.

If you want to train your own feature extractor, please refer to the Training part.
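
For reference, a minimal sketch of the expected layout after downloading, assuming CIFAR-10 as the dataset (the exact checkpoint file names may differ):

  mkdir -p ./checkpoint/cifar10/ckpt_bias ./ckpt_byol   # create the target folders
  # ./checkpoint/cifar10/ckpt_bias/   <- pre-trained classifiers under test (model IDs 0, 1, 2)
  # ./ckpt_byol/                      <- pre-trained BYOL feature extractor for CIFAR-10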

2.3. Perform Test Selection

Call the 'run.sh' file with argument 'selection':

  ./run.sh selection

Configure your run.sh following the description below:

  python selection.py \
              --dataset $DATASET \                   # specify the dataset to use
              --manualSeed ${RANDOM_SEED} \          # random seed
              --model2test_arch $MODEL2TEST \        # architecture of the model under test (e.g. resnet18)
              --model2test_path $MODEL2TESTPATH \    # the path storing the model weights
              --model_number $MODEL_NO \             # which model to test: model 0, 1, or 2?
              --save_path ${save_path} \             # the results will be stored here
              --data_path ${DATA_ROOT} \             # dataset root path
              --graph_nn \                           # use the graph neural network in TestRank
              --feature_extractor_id ${feature_extractor_id} \ # type of feature extractor, 0: BYOL model, 1: the model under test
              --no_neighbors ${no_neighbors} \       # number of neighbors used to construct the graph
              --learn_mixed                          # use an MLP to combine intrinsic and contextual attributes; otherwise they are combined directly by multiplying the two scores
              --baseline_gini                        # use a specific baseline method to perform selection; otherwise leave it blank
  • The result is stored in '{save_path}/{date}/{dataset}_{model}/xxx_result.csv', where xxx stands for the selection method used (e.g. for TestRank, the file would be gnn_result.csv).

  • The TRC value is in the last column, and the fourth column shows the corresponding budget in percent.

  • To compare with baselines, please specify the corresponding baseline method (e.g. baseline_gini, baseline_uncertainty, baseline_dsa, baseline_mcp).

  • To evaluate different models, change MODEL_NO to the corresponding model ID: [0, 1, 2].
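
As a concrete illustration, a run.sh configured for TestRank on CIFAR-10 with a ResNet-18 under test might expand to the call below; all values (paths, seed, number of neighbors) are placeholders for your own setup, not prescribed settings:

  python selection.py \
              --dataset cifar10 \
              --manualSeed 0 \
              --model2test_arch resnet18 \
              --model2test_path ./checkpoint/cifar10/ckpt_bias \
              --model_number 0 \
              --save_path ./results \
              --data_path ./data \
              --graph_nn \
              --feature_extractor_id 0 \
              --no_neighbors 100 \
              --learn_mixed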

3. Training

3.1. Train classifier

If you want to train your own DL model instead of using the pretrained ones, run this command:

./run.sh trainm
  • The trained model will be stored in './checkpoint/{dataset}/ckpt_bias/'.

  • Each model is assigned a unique ID (e.g. 0, 1, 2).

  • The code used to train the models resides in the train_classifier.py file. To change the dataset or model architecture, modify 'DATASET=dataset_name' or 'MODEL=name' in the run.sh file (see the example below).
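
For example, to train ResNet-18 classifiers on CIFAR-10, the relevant variables in run.sh would look roughly like this (illustrative values only):

  # in run.sh
  DATASET=cifar10      # dataset to train on
  MODEL=resnet18       # classifier architecture

  ./run.sh trainm      # launch training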

3.2. Train BYOL Feature Extractor

Please refer to this code.

4. Contact

If you have any questions, feel free to send a message to [email protected]
