Learning to compose soft prompts for compositional zero-shot learning.

Overview

Compositional Soft Prompting (CSP)

Compositional soft prompting (CSP) is a parameter-efficient learning technique that improves the zero-shot compositionality of large-scale pretrained vision-language models (VLMs) without the overhead of fine-tuning the entire model.

Reference Paper: Learning to Compose Soft Prompts for Compositional Zero-Shot Learning

If you find CSP helpful, please cite our paper:

@article{csp2022,
  author = {Nayak, Nihal V. and Yu, Peilin and Bach, Stephen H.},
  title = {Learning to Compose Soft Prompts for Compositional Zero-Shot Learning},
  volume = {arXiv:2204.03574 [cs.LG]},
  year = {2022},
}

Setup

conda create --name clip python=3.7
conda activate clip
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
pip3 install ftfy regex tqdm scipy pandas
pip3 install git+https://github.com/openai/CLIP.git

Alternatively, you can use pip install -r requirements.txt to install all the dependencies.

Download Dataset

We experiment with three datasets: MIT-States, UT-Zappos, and C-GQA.

sh download_data.sh

If you have already set up the datasets, you can symlink them instead. Ensure that the path data/<dataset> exists for each <dataset> in {'mit-states', 'ut-zappos', 'cgqa'}.
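The layout check described above can be scripted. The following is a minimal sketch, not part of the repository: the helper names `missing_datasets` and `link_dataset` are hypothetical, and the only thing taken from this README is the expected data/<dataset> layout.

```python
from pathlib import Path

# Dataset directory names as listed in this README.
DATASETS = ("mit-states", "ut-zappos", "cgqa")


def missing_datasets(data_root="data", datasets=DATASETS):
    """Return the dataset names whose directory is absent under data_root."""
    root = Path(data_root)
    # Path.is_dir() follows symlinks, so a symlinked dataset counts as present.
    return [name for name in datasets if not (root / name).is_dir()]


def link_dataset(source, name, data_root="data"):
    """Symlink an existing dataset directory to <data_root>/<name> (hypothetical helper)."""
    root = Path(data_root)
    root.mkdir(parents=True, exist_ok=True)
    target = root / name
    if not target.exists():
        target.symlink_to(Path(source).resolve(), target_is_directory=True)
    return target


if __name__ == "__main__":
    print("missing:", missing_datasets())
```

Running the script from the repository root lists any dataset directory that still needs to be downloaded or linked.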

Training

python -u train.py \
  --dataset mit-states \
  --model ViT-L/14 \
  --experiment_name csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-05 \
  --attr_dropout 0.3 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --gradient_accumulation_steps 2 \
  --context_length 8 \
  --save_path data/model/mit-states/sample_model \
  --save_every_n 1

You can set --dataset to one of {mit-states, ut-zappos, cgqa}. The best hyperparameters are included in the paper.

Evaluation

We evaluate our models in two settings: closed-world and open-world.

Closed-World Evaluation

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 16 \
  --experiment_name csp

Open-World Evaluation

For our open-world evaluation, we first compute the feasibility calibration and then evaluate on the dataset.

Feasibility Calibration

We use GloVe embeddings to compute the similarities between objects and attributes. Download the GloVe embeddings in the data directory:

cd data
wget https://nlp.stanford.edu/data/glove.6B.zip

Unzip the archive and make sure glove.6B.300d.txt is available at data/glove.6B.300d.txt.
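If you prefer not to unpack the whole archive, a single member can be extracted with Python's standard zipfile module. This is a small sketch under the assumption that the archive was downloaded to data/glove.6B.zip; the function name `extract_member` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_member(archive, member="glove.6B.300d.txt", dest_dir="data"):
    """Extract one file from a zip archive into dest_dir and return its path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        # ZipFile.extract writes the member under dest, keeping its archive name.
        return Path(zf.extract(member, path=dest))


if __name__ == "__main__":
    # Assumed download location; adjust if the archive lives elsewhere.
    print(extract_member("data/glove.6B.zip"))
```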

To compute feasibility calibration for each dataset, run the following command:

python -u datasets/feasibility.py --dataset mit-states

The feasibility similarities are saved at data/feasibility_<dataset>.pt.
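To illustrate what a word-embedding similarity looks like, here is a minimal sketch that reads vectors from the GloVe text format (one word followed by space-separated floats per line) and compares two of them. It is not the code in datasets/feasibility.py: the function names are hypothetical, and the use of cosine similarity is an assumption made for illustration.

```python
import math


def parse_glove_line(line):
    """Split one line of a GloVe text file into (word, vector)."""
    word, *values = line.rstrip().split(" ")
    return word, [float(v) for v in values]


def load_glove(path, vocabulary):
    """Load vectors only for the words in `vocabulary` to limit memory use."""
    wanted = set(vocabulary)
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            word, vector = parse_glove_line(line)
            if word in wanted:
                vectors[word] = vector
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

For example, `load_glove("data/glove.6B.300d.txt", ["ripe", "apple"])` returns a dictionary whose two vectors can be passed to `cosine_similarity`.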

Evaluation

Run the open-world evaluation with the threshold obtained from the feasibility calibration:

python -u evaluate.py \
  --dataset mit-states \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_5.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name czsl \
  --threshold <threshold> \
  --open_world

If <threshold> is None, then the model picks the best threshold on the validation set. We use the following thresholds:

Dataset       Threshold
mit-states    0.4069159426
ut-zappos     0.5299109123
cgqa          0.49937106273612186

Note: We use 256 GB of CPU memory to evaluate cgqa.
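As a rough illustration of how a feasibility threshold can prune the open-world label space, the sketch below keeps only the attribute-object pairs whose score reaches the threshold. This is an assumption about the general idea, not the logic in evaluate.py: the function name is hypothetical, the scores are made up, and whether the comparison is strict is not specified in this README.

```python
def feasible_pairs(scores, threshold):
    """Keep (attribute, object) pairs whose feasibility score reaches the threshold."""
    return [pair for pair, score in scores.items() if score >= threshold]


if __name__ == "__main__":
    # Made-up scores for illustration only; real values come from the calibration step.
    example_scores = {
        ("ripe", "apple"): 0.71,
        ("ripe", "laptop"): 0.12,
        ("broken", "laptop"): 0.65,
    }
    # 0.4069159426 is the mit-states threshold listed in the table above.
    print(feasible_pairs(example_scores, threshold=0.4069159426))
```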

Generalization to Higher-Order Compositions

Evaluate the trained CSP vocabulary on the new AAO-MIT-States dataset.

python aao/evaluate_att_att_obj.py \
  --experiment_name csp \
  --soft_embeddings data/model/mit-states/sample_model/soft_embeddings_epoch_20.pt

We thank Andrew Delworth and Elise Carman for helping us annotate this dataset.

Generalization to Mixed Pretrained and Fine-Tuned Vocabulary

This ablation trains and evaluates CSP with a reduced fine-tuned vocabulary. We run the experiment on the ut-zappos dataset.

Training

python -u mix/mix_train.py \
  --dataset ut-zappos \
  --model ViT-L/14 \
  --experiment_name mix_csp \
  --seed 0 \
  --epochs 20 \
  --lr 5e-04 \
  --attr_dropout 0.2 \
  --weight_decay 0.00001 \
  --train_batch_size 64 \
  --context_length 8 \
  --save_path data/model/ut-zappos/mix_train_model_0.25 \
  --save_every_n 5 \
  --attr_keep_ratio 0.25 \
  --gradient_accumulation_steps 2

We vary --attr_keep_ratio over {0.25, 0.50, 0.75}.

Evaluation

python -u mix/evaluate_mix_train.py \
  --dataset ut-zappos \
  --soft_embeddings data/model/ut-zappos/mix_train_model_0.25/soft_embeddings.pt \
  --context_length 16 \
  --text_encoder_batch_size 36 \
  --eval_batch_size 256 \
  --experiment_name csp

Credits

The project uses openly available models, code, and datasets. Please see the credits.
