Exploiting a Zoo of Checkpoints for Unseen Tasks

Overview

This repo contains the code to reproduce all results in the above NeurIPS paper, authored by Jiaji Huang, Qiang Qiu and Kenneth Church.

Dependencies

We used Python 3.8.5, but other versions close to that should also work. Install all required packages with

pip install --upgrade pip
pip install -r requirements.txt

We used CUDA 10.2.89, but any version that meets PyTorch's requirements should also work.

Highlight of Results

We highlight some major results, so that readers do not have to read the paper to grasp the main ideas. Concisely, the paper tries to answer the question:

"Can we use a checkpoint zoo to build something that better adapts to unseen tasks?"

To answer the question, we first need to understand the geometry of the task space.

Characterize the Task Space

In the paper, we model the tasks as following a Gaussian process. Its covariance is computed by applying kernel alignment to extracted features. The features are obtained by feeding probe data into the checkpoints, each of which was trained for one task. For example, using 34 checkpoints from Hugging Face models, we can estimate the 34x34 covariance of their corresponding tasks.

To reproduce the covariance figure, refer to LMs/README.md.
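
As a rough illustration of the covariance estimate described above, the sketch below computes a pairwise kernel alignment between feature matrices. It assumes a centered linear kernel alignment, which may differ from the exact variant used in the paper, and it uses random arrays in place of real checkpoint features. The function names `linear_kernel_alignment` and `task_covariance` are hypothetical and are not part of this repo.

```python
import numpy as np


def linear_kernel_alignment(x, y):
    """Centered linear kernel alignment between two feature matrices.

    x has shape (n_probe, d1) and y has shape (n_probe, d2); rows are the
    same probe examples passed through two different checkpoints.
    """
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return cross / (norm_x * norm_y)


def task_covariance(features):
    """Build a symmetric task-by-task matrix from per-checkpoint features."""
    n = len(features)
    cov = np.eye(n)  # alignment of a checkpoint with itself is 1
    for i in range(n):
        for j in range(i + 1, n):
            cov[i, j] = cov[j, i] = linear_kernel_alignment(features[i], features[j])
    return cov


# Stand-in data (assumption): 3 "checkpoints", 50 probe examples each,
# with different feature widths.
rng = np.random.default_rng(0)
probe_features = [rng.standard_normal((50, d)) for d in (8, 16, 12)]
cov = task_covariance(probe_features)
print(cov.shape)
```

With 34 checkpoints in place of the 3 stand-ins, the same loop would produce a 34x34 matrix.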

Exploit the Task Space

We hypothesize that representative tasks generalize better to new tasks. This, of course, needs a rigorous mathematical proof. But empirically we find it to be true, as indicated by the experiments on NLP and vision tasks.

So, how do we identify representative tasks? They are supposed to convey the most information about the rest of the task space. We formulate the problem as a Max-Mutual-Information (MMI) objective. The solver takes the covariance as input and greedily picks representative tasks.
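
The greedy selection can be sketched as follows. This is a minimal illustration under the Gaussian assumption, not the repo's solver: at each step it picks the task whose variance given the already selected tasks is largest relative to its variance given all remaining tasks, a common greedy rule for mutual-information selection in Gaussian processes. The names `conditional_variance` and `greedy_mmi`, the `jitter` parameter, and the toy covariance are all hypothetical.

```python
import numpy as np


def conditional_variance(cov, idx, given):
    """Variance of task idx conditioned on the tasks in given (Gaussian model)."""
    if len(given) == 0:
        return cov[idx, idx]
    g = list(given)
    k_gg = cov[np.ix_(g, g)]
    k_ig = cov[idx, g]
    return cov[idx, idx] - k_ig @ np.linalg.solve(k_gg, k_ig)


def greedy_mmi(cov, k, jitter=1e-6):
    """Greedily pick k task indices from a task covariance matrix."""
    n = cov.shape[0]
    cov = cov + jitter * np.eye(n)  # small ridge for numerical stability
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            rest = [j for j in range(n) if j != y and j not in selected]
            score = conditional_variance(cov, y, selected) / conditional_variance(cov, y, rest)
            if score > best_score:
                best, best_score = y, score
        selected.append(best)
    return selected


# Toy covariance (assumption): task 0 is a "hub" correlated 0.6 with tasks
# 1-3, which are only related to each other through task 0.
cov = np.array([
    [1.00, 0.60, 0.60, 0.60],
    [0.60, 1.00, 0.36, 0.36],
    [0.60, 0.36, 1.00, 0.36],
    [0.60, 0.36, 0.36, 1.00],
])
picked = greedy_mmi(cov, 2)
print(picked)
```

In this toy example the hub task is selected first, because it carries the most information about the remaining tasks.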

Linguistic Tasks

Using the 34x34 covariance matrix, we find that the 5 most representative tasks are those corresponding to roberta-base, distilbert-base-uncased, t5-base, bert-base-cased and bart-large. Combining these checkpoints yields superior results on 8 new linguistic tasks, for example the chunking task.

To reproduce full results, check LMs/README.md for details.

Computer Vision Tasks

The observation holds for vision tasks too. In an experiment set up on CIFAR-100, MMI shows a steady gain over random selection and outperforms another baseline.

To reproduce all results, check vision/README.md for details.

Additional Comments

Note: This project requires running many small jobs, so a cluster managed by Slurm, which can launch jobs in parallel, is very useful. In the job-launching scripts, you will see multiple commands like

sbatch -p $partition --gres=gpu:1 --wrap "python run.py" -o $job_log_path

If you do not have such a cluster, just use

python run.py > $job_log_path

instead.
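
If you prefer to switch between the two modes from Python, a small wrapper along the following lines may help. This is only a sketch: `build_job` and `launch` are hypothetical helpers that do not exist in this repo, and the sbatch arguments simply mirror the command shown above.

```python
import shlex
import subprocess


def build_job(command, log_path, partition=None):
    """Return (argv, stdout_path) for one job.

    With a partition, the command is wrapped in sbatch and Slurm writes the
    log itself. Without one, the command runs locally and its stdout is
    redirected to log_path by the caller.
    """
    if partition is not None:
        argv = ["sbatch", "-p", partition, "--gres=gpu:1",
                "--wrap", command, "-o", log_path]
        return argv, None
    return shlex.split(command), log_path


def launch(command, log_path, partition=None):
    """Submit the job to Slurm, or run it locally when no partition is given."""
    argv, stdout_path = build_job(command, log_path, partition)
    if stdout_path is None:
        return subprocess.run(argv, check=True)
    with open(stdout_path, "w") as log:
        return subprocess.run(argv, stdout=log, check=True)
```

For example, `launch("python run.py", "job.log")` would run the job locally, while passing `partition="gpu"` would submit it through sbatch.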
