Learning to compose soft prompts for compositional zero-shot learning.

Overview

Compositional Soft Prompting (CSP)

Compositional soft prompting (CSP) is a parameter-efficient learning technique that improves the zero-shot compositionality of large-scale pretrained vision-language models (VLMs) without the overhead of fine-tuning the entire model.
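
As a rough illustration of the idea, the sketch below treats every attribute and object as a learnable vector, composes a pair into a prompt of the form "a photo of [attribute] [object]", and scores an image against each candidate pair by cosine similarity. The toy vectors, the averaging `encode_prompt` function, and all names are assumptions made for illustration; the actual method passes the composed prompt through CLIP's text encoder rather than averaging.

```python
import math
from itertools import product

# Toy "soft" vocabulary: one learnable vector per attribute and per object.
# The values are made up; in CSP these would be token embeddings tuned in CLIP.
ATTR_EMB = {"old": [1.0, 0.0, 0.0], "young": [0.0, 1.0, 0.0]}
OBJ_EMB = {"cat": [0.0, 0.0, 1.0], "dog": [0.5, 0.5, 0.0]}
PREFIX_EMB = [0.1, 0.1, 0.1]  # stands in for the frozen "a photo of" tokens


def encode_prompt(attr, obj):
    """Hypothetical stand-in for a text encoder: average the token vectors."""
    tokens = [PREFIX_EMB, ATTR_EMB[attr], OBJ_EMB[obj]]
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict(image_vec):
    """Return the (attribute, object) pair whose prompt best matches the image."""
    pairs = product(ATTR_EMB, OBJ_EMB)
    return max(pairs, key=lambda pair: cosine(image_vec, encode_prompt(*pair)))


# A toy image vector that lies close to "old" + "cat".
print(predict([0.9, 0.0, 0.9]))
```

Because the vocabulary is stored per attribute and per object, pairs that never appeared together during training can still be composed at test time.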

Reference Paper: Learning to Compose Soft Prompts for Compositional Zero-Shot Learning

If you find CSP helpful, please cite our paper:

@article{csp2022,
  author = {Nayak, Nihal V. and Yu, Peilin and Bach, Stephen H.},
  title = {Learning to Compose Soft Prompts for Compositional Zero-Shot Learning},
  volume = {arXiv:2204.03574 [cs.LG]},
  year = {2022},
}

Setup

conda create --name clip python=3.7
conda activate clip
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
pip3 install ftfy regex tqdm scipy pandas
pip3 install git+https://github.com/openai/CLIP.git

Alternatively, you can use pip install -r requirements.txt to install all the dependencies.

Download Dataset

We experiment with three datasets: MIT-States, UT-Zappos, and C-GQA.

sh download_data.sh

If you have already set up the datasets, you can symlink them instead. Make sure the following paths exist: data/<dataset>, where <dataset> is one of {'mit-states', 'ut-zappos', 'cgqa'}.
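
The path check described above could be scripted along these lines. This is a minimal sketch and not part of the repository: `missing_datasets` and `link_dataset` are hypothetical helper names, and the only assumption taken from the text is the data/<dataset> layout.

```python
from pathlib import Path

DATASETS = ("mit-states", "ut-zappos", "cgqa")


def missing_datasets(data_root="data"):
    """Return the dataset names whose directory (or symlink) is absent."""
    root = Path(data_root)
    return [name for name in DATASETS if not (root / name).is_dir()]


def link_dataset(source, name, data_root="data"):
    """Symlink an existing dataset directory to <data_root>/<name>."""
    if name not in DATASETS:
        raise ValueError(f"unknown dataset: {name}")
    target = Path(data_root) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.symlink_to(Path(source).resolve(), target_is_directory=True)
    return target


if __name__ == "__main__":
    # Report which of the expected directories still need to be set up.
    print("missing:", missing_datasets())
```

For example, `link_dataset("/path/to/existing/mit-states", "mit-states")` would create data/mit-states as a symlink, where the source path is a placeholder for wherever the dataset already lives.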

Training

python -u train.py \
  --dataset mit-states \
  --model ViT-L/14 \
  --experiment_name csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-05 \
  --attr_dropout 0.3 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --gradient_accumulation_steps 2 \
  --context_length 8 \
  --save_path data/model/mit-states/sample_model \
  --save_every_n 1

You can set --dataset to any of {mit-states, ut-zappos, cgqa}. The best hyperparameters for each dataset are listed in the paper.
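
To run the same command for every dataset, the argument list can be generated programmatically. The sketch below only reuses the flags shown above. The per-dataset save path is an assumption that mirrors the mit-states example, and the hyperparameter values are the ones from that example, not tuned values for the other datasets.

```python
import shlex

DATASETS = ("mit-states", "ut-zappos", "cgqa")


def train_command(dataset, epochs=20, seed=0):
    """Build the argv list for train.py using the flags from the example."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    return [
        "python", "-u", "train.py",
        "--dataset", dataset,
        "--model", "ViT-L/14",
        "--experiment_name", "csp",
        "--seed", str(seed),
        "--epochs", str(epochs),
        "--lr", "5e-05",
        "--attr_dropout", "0.3",
        "--weight_decay", "0.00001",
        "--train_batch_size", "64",
        "--gradient_accumulation_steps", "2",
        "--context_length", "8",
        # Assumption: the save path follows the mit-states example layout.
        "--save_path", f"data/model/{dataset}/sample_model",
        "--save_every_n", "1",
    ]


for name in DATASETS:
    # Print a shell-safe command line; quoting is done per argument.
    print(" ".join(shlex.quote(arg) for arg in train_command(name)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would launch training. With a batch size of 64 and 2 gradient accumulation steps, each optimizer update covers 128 examples.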

Evaluation

We evaluate our models in two settings: closed-world and open-world.

Closed-World Evaluation

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 16 \
  --experiment_name csp

Open-World Evaluation

For our open-world evaluation, we first compute the feasibility calibration and then evaluate on the dataset.

Feasibility Calibration

We use GloVe embeddings to compute the similarities between objects and attributes. Download the GloVe embeddings in the data directory:

cd data
wget https://nlp.stanford.edu/data/glove.6B.zip

Extract glove.6B.300d.txt from the archive so that it is located at data/glove.6B.300d.txt.

To compute feasibility calibration for each dataset, run the following command:

python -u datasets/feasibility.py --dataset mit-states

The feasibility similarities are saved at data/feasibility_<dataset>.pt.
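
The embedding-similarity step can be pictured with the following sketch. It parses lines in the GloVe text format (a word followed by space-separated floats) and computes cosine similarities between attribute and object words. It is not the repository's datasets/feasibility.py, which may combine the similarities differently and may handle multi-word names; the function names and the toy 3-dimensional vectors are assumptions for illustration.

```python
import math


def load_glove(lines, vocab):
    """Parse GloVe text lines ("word v1 v2 ...") for the words in `vocab`."""
    wanted = set(vocab)
    vectors = {}
    for line in lines:
        word, *values = line.rstrip().split(" ")
        if word in wanted:
            vectors[word] = [float(v) for v in values]
    return vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarity(vectors, attrs, objs):
    """Map every (attribute, object) pair to the cosine of its word vectors."""
    return {(a, o): cosine(vectors[a], vectors[o]) for a in attrs for o in objs}


# Toy 3-dimensional stand-ins for the 300-dimensional GloVe vectors.
toy_lines = ["wet 1.0 0.0 0.0", "dog 1.0 1.0 0.0", "sunny 0.0 0.0 1.0"]
toy_vectors = load_glove(toy_lines, {"wet", "dog", "sunny"})
toy_sims = pairwise_similarity(toy_vectors, ["wet", "sunny"], ["dog"])
print(toy_sims)
```

With the real file, `load_glove` can be given an open file handle, for example `open("data/glove.6B.300d.txt", encoding="utf-8")`, since it only iterates over lines.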

Evaluation

Run the open-world evaluation with a feasibility threshold obtained from the calibration step:

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_5.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name czsl \
  --threshold <threshold> \
  --open_world

If <threshold> is None, then the model picks the best threshold on the validation set. We use the following thresholds:

Dataset      Threshold
mit-states   0.4069159426
ut-zappos    0.5299109123
cgqa         0.49937106273612186
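
The role of the threshold can be sketched as follows: candidate pairs whose feasibility score falls below it are excluded before prediction, and when no threshold is given, the value that does best on validation data is selected. The function names, the `>=` comparison, and the use of plain accuracy as the selection metric are all assumptions for illustration; evaluate.py may use a different criterion.

```python
def feasible_pairs(feasibility, threshold):
    """Keep candidate (attribute, object) pairs scoring at or above the threshold."""
    return {pair for pair, score in feasibility.items() if score >= threshold}


def predict_open_world(scores, allowed):
    """Pick the highest-scoring pair among the allowed candidates."""
    candidates = {pair: s for pair, s in scores.items() if pair in allowed}
    return max(candidates, key=candidates.get) if candidates else None


def best_threshold(val_examples, feasibility, thresholds):
    """Return the threshold with the highest validation accuracy (assumed metric)."""
    def accuracy(threshold):
        allowed = feasible_pairs(feasibility, threshold)
        hits = sum(
            predict_open_world(scores, allowed) == label
            for scores, label in val_examples
        )
        return hits / len(val_examples)

    return max(thresholds, key=accuracy)


# Toy feasibility scores and model scores; all numbers are made up.
toy_feasibility = {("wet", "dog"): 0.7, ("sunny", "dog"): 0.1, ("wet", "road"): 0.6}
toy_val = [
    ({("wet", "dog"): 0.5, ("sunny", "dog"): 0.9, ("wet", "road"): 0.2}, ("wet", "dog")),
    ({("wet", "dog"): 0.3, ("sunny", "dog"): 0.4, ("wet", "road"): 0.8}, ("wet", "road")),
]
print(best_threshold(toy_val, toy_feasibility, [0.0, 0.5, 0.65]))
```

In the toy data, a threshold of 0.0 lets the infeasible pair win on the first example, while 0.65 removes a correct pair, so the middle value gives the best accuracy.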

Note: We use 256 GB of CPU memory to evaluate cgqa.

Generalization to Higher-Order Compositions

Evaluate the trained CSP vocabulary on the new AAO-MIT-States dataset.

python aao/evaluate_att_att_obj.py \
  --experiment_name csp \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt

We thank Andrew Delworth and Elise Carman for helping us annotate this dataset.

Generalization to Mixed Pretrained and Fine-Tuned Vocabulary

This ablation trains and evaluates CSP with a reduced fine-tuned vocabulary. We run the experiment on the ut-zappos dataset.

Training

python -u mix/mix_train.py \
  --dataset ut-zappos \
  --model ViT-L/14 \
  --experiment_name mix_csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-04 \
  --attr_dropout 0.2 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --context_length 8 \
  --save_path data/model/ut-zappos/mix_train_model_0.25 \
  --save_every_n 5 \
  --attr_keep_ratio 0.25 \
  --gradient_accumulation_steps 2

We vary --attr_keep_ratio over {0.25, 0.50, 0.75}.
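
One way to picture what a keep ratio means is a split of the attribute vocabulary into a fine-tuned part and a part left at its pretrained embeddings. The sketch below is an assumption about that idea, not the logic in mix/mix_train.py: the seeded random selection, the rounding rule, the function name, and the toy attribute names are all hypothetical.

```python
import random


def split_attributes(attributes, keep_ratio, seed=0):
    """Split attributes into (fine-tuned, pretrained-only) lists."""
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be between 0 and 1")
    shuffled = sorted(attributes)  # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)
    n_keep = round(len(shuffled) * keep_ratio)
    return sorted(shuffled[:n_keep]), sorted(shuffled[n_keep:])


# Toy attribute names, chosen only for illustration.
toy_attributes = [
    "canvas", "cotton", "leather", "nylon",
    "rubber", "satin", "suede", "wool",
]
for ratio in (0.25, 0.50, 0.75):
    tuned, frozen = split_attributes(toy_attributes, ratio)
    print(ratio, len(tuned), len(frozen))
```

With eight toy attributes, the three ratios yield 2, 4, and 6 fine-tuned attributes respectively.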

Evaluation

python -u mix/evaluate_mix_train.py \
  --dataset ut-zappos \
  --soft_embeddings data/model/ut-zappos/mix_train_model_0.25/soft_embeddings.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name csp

Credits

The project uses openly available models, code, and datasets. Please see the credits.

Owner
Bats Research