Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models

Last update: Jul 15, 2022

Overview

Downloading our datasets

https://drive.google.com/file/d/1CfomsX6qmdCLfFutptqrQnp1RlaJEpXh/view?usp=sharing
Download the archive from the link above, extract it, and place the data/ folder in the same root directory as src/.

Dataset structure

Each dataset may have several subdatasets (most of them only have one)

    |dataset/
        -|<subdataset_1>
            -|<label_1>
            -|<label_2>
        -|<subdataset_2>
        ...
    |pickled/
        -|tensor_dict.pt

The pickle file tensor_dict.pt has the following format:

{
    'subdataset_1':{
        'label_1':{
            'image_tensors':np.array((N,3,224,224)), # N: image number
            'input_ids':np.array(S), # S: token length of the filled template text
            'attention_masks':np.array(S),
            'template_input_ids':np.array(S_), # S_: token length of the un-filled template text
            'template_attention_masks':np.array(S_),
        },
        'label_2':{
            ...
        }
    },
    ...
}
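
As a quick sanity check, the nested format above can be walked to list the array shapes stored for each subdataset and label. The sketch below is only an illustration: `summarize_tensor_dict` is a hypothetical helper, not part of the repository, and it runs on a small synthetic dictionary rather than on the real tensor_dict.pt, since how that file is loaded is not shown here. It assumes only that each stored value exposes a `.shape` attribute.

```python
import numpy as np


def summarize_tensor_dict(tensor_dict):
    """Map (subdataset, label) to {field name: shape} for a dict in the format above.

    Hypothetical helper; assumes every stored value has a `.shape` attribute.
    """
    summary = {}
    for subdataset, labels in tensor_dict.items():
        for label, fields in labels.items():
            summary[(subdataset, label)] = {
                name: tuple(value.shape) for name, value in fields.items()
            }
    return summary


# Synthetic stand-in: 2 images, 5 filled-template tokens, 4 un-filled-template tokens.
toy = {
    "subdataset_1": {
        "label_1": {
            "image_tensors": np.zeros((2, 3, 224, 224), dtype=np.float32),
            "input_ids": np.zeros(5, dtype=np.int64),
            "attention_masks": np.ones(5, dtype=np.int64),
            "template_input_ids": np.zeros(4, dtype=np.int64),
            "template_attention_masks": np.ones(4, dtype=np.int64),
        }
    }
}

summary = summarize_tensor_dict(toy)
for key, shapes in summary.items():
    print(key, shapes)
```

A shape mismatch between input_ids and attention_masks for the same label would show up immediately in this listing.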

The ABO dataset contains an additional label_to_text.json file, which provides a text template for each subdataset and label.

A list of available datasets and subdatasets

| Dataset | dataset name (-i) | subdataset name (-d) |
| --- | --- | --- |
| Clevr Counting | `ClevrCounting` | `counting` |
| Amazon Berkeley Objects (ABO) | `ABO` | `material`, `color` |
| Caltech-UCSD Birds 200 (CUB) | `CUB` | `classification` |
| Fungi | `Fungi` | `classification` |
| Mini-imagenet | `mini` | `classification` |
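
When scripting several runs, it can help to validate a dataset/subdataset pair before launching a job. The sketch below copies the names from the list above into a dictionary; `AVAILABLE` and `check_names` are hypothetical names introduced for illustration and are not part of the repository.

```python
# Dataset and subdataset names copied from the list above.
# AVAILABLE and check_names are hypothetical, not part of the repository.
AVAILABLE = {
    "ClevrCounting": ["counting"],
    "ABO": ["material", "color"],
    "CUB": ["classification"],
    "Fungi": ["classification"],
    "mini": ["classification"],
}


def check_names(dataset, subdataset):
    """Return the pair unchanged, or raise ValueError if it is not in the list."""
    if dataset not in AVAILABLE:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(AVAILABLE)}"
        )
    if subdataset not in AVAILABLE[dataset]:
        raise ValueError(
            f"unknown subdataset {subdataset!r} for {dataset!r}; "
            f"expected one of {AVAILABLE[dataset]}"
        )
    return dataset, subdataset


print(check_names("ABO", "color"))
```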

Training with provided datasets

run.sh provides example commands for training and meta-testing on our datasets.

Output format

Each model checkpoint dir contains two files:

step1.ckpt: model checkpoint after training phase
dev_test_results.json: scores on each task configuration on dev and test set during meta-testing
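
A run can be checked for completeness by confirming that both files are present. This is a minimal sketch: `missing_outputs` is a hypothetical helper, and the temporary directory below only stands in for a real checkpoint directory.

```python
import tempfile
from pathlib import Path

# The two file names listed above.
EXPECTED_FILES = ("step1.ckpt", "dev_test_results.json")


def missing_outputs(checkpoint_dir):
    """Return the expected output files that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstration on a throwaway directory holding only the checkpoint file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "step1.ckpt").touch()
    print(missing_outputs(tmp))
```

An empty list means both the checkpoint and the meta-testing results were written.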

Loading checkpoint

Here is an example snippet for loading step1.ckpt from multitask-finetuning/classical-finetuning/zeroshot models:

    # load_from_checkpoint is a classmethod, so call it on the class
    model = MultitaskFinetuneCLIP.load_from_checkpoint(checkpoint_path="<checkpoint dir>/step1.ckpt")

Here is an example snippet for loading step1.ckpt from fomaml models:

    import torch
    import learn2learn as l2l

    model = LightningCLIP()
    model = l2l.algorithms.MAML(model, lr=1e-5, first_order=True)
    model.load_state_dict(torch.load("<checkpoint dir>/step1.ckpt"))

Training with custom datasets

Preprocess dataset

Put your new dataset into data/ in the same format as the provided datasets.
Specify template_function or the path to a label_to_text JSON file (an example can be found in /data/ABO/label_to_text.json) at lines 350 and 355 in data.py.
preprocess.sh provides an example of running data.py to create the pickle file for your new dataset.
Add your dataset to construct_dataset(): line 77 in train.py and line 80 in train_MAML.py.
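
The two ways of supplying label text mentioned above can be sketched as follows. Everything in this block is an assumption made for illustration: the signature that data.py expects for template_function, the wording of the template, and the nested layout of the JSON file are not shown in this README, so check data.py and /data/ABO/label_to_text.json before relying on any of it.

```python
import json

# Hypothetical un-filled template; the wording is an assumption.
UNFILLED_TEMPLATE = "a photo of a {} object."


def template_function(label):
    """Hypothetical template function: fill the template with a label name."""
    return UNFILLED_TEMPLATE.format(label)


# Assumed layout: {subdataset: {label: text}}. The real label_to_text.json
# may be organized differently.
labels_by_subdataset = {"color": ["red", "blue"]}
label_to_text = {
    subdataset: {label: template_function(label) for label in labels}
    for subdataset, labels in labels_by_subdataset.items()
}

print(json.dumps(label_to_text, indent=2))
```

A template function suits datasets whose label names read naturally inside one sentence, while a JSON file allows a hand-written text for every label.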

Train

Modify run.sh to train and meta-test on your own dataset.
Refer to train.py and train_MAML.py for the default and tunable hyperparameters of each algorithm.


Owner

Zhenhailong Wang
