Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

Overview

CoSMo.pytorch

Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung Han. *(denotes equal contribution)

Presented at CVPR2021

Paper | Poster | 5 min Video

fig

⚙️ Setup

Python: python3.7

📦 Install required packages

Install torch and torchvision via following command (CUDA10)

pip install torch==1.2.0 torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

Install other packages

pip install -r requirements.txt

📂 Dataset

Download the FashionIQ dataset by following the instructions on this link.

We have set the default path for FashionIQ datasets in data/fashionIQ.py as _DEFAULT_FASHION_IQ_DATASET_ROOT = '/data/image_retrieval/fashionIQ'. You can change this path to wherever you plan on storing the dataset.

📚 Vocabulary file

Open up a python console and run the following lines to download NLTK punkt:

import nltk
nltk.download('punkt')

Then, open up a Jupyter notebook and run jupyter_files/how_to_create_fashion_iq_vocab.ipynb. As with the dataset, the default path is set in data/fashionIQ.py.

We have provided a vocab file in jupyter_files/fashion_iq_vocab.pkl.

📈 Weights & Biases

We use Weights and Biases to log our experiments.

If you already have a Weights & Biases account, head over to configs/FashionIQ_trans_g2_res50_config.json and fill out your wandb_account_name. You can also change the default at options/command_line.py.

If you do not have a Weights & Biases account, you can either create one or change the code and logging functions to your liking.

🏃‍♂️ Run

You can run the code by the following command:

python main.py --config_path=configs/FashionIQ_trans_g2_res50_config.json --experiment_description=test_cosmo_fashionIQDress --device_idx=0,1,2,3

Note that you do not need to assign --device_idx if you have already specified CUDA_VISIBLE_DEVICES=0,1,2,3 in your terminal.

We run on 4 12GB GPUs, and the main gpu gpu:0 uses around 4GB of VRAM.

⚠️ Notes on Evaluation

In our paper, we mentioned that we use a slightly different evaluation method than the original FashionIQ dataset. This was done to match the evaluation method used by VAL.

By default, this code uses the proper evaluation method (as intended by the creators of the dataset). The results for this is shown in our supplementary materials. If you'd like to use the same evaluation method as our main paper (and VAL), head over to data/fashionIQ.py and uncomment the commented section.

📜 Citation

If you use our code, please cite our work:

@InProceedings{CoSMo2021_CVPR,
    author    = {Lee, Seungmin and Kim, Dongwan and Han, Bohyung},
    title     = {CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {802-812}
}
Owner
Seung Min Lee
Bring Me Giants
Seung Min Lee
[CVPR 2022] PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision (Oral)

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision Kehong Gong*, Bingbing Li*, Jianfeng Zhang*, Ta

256 Dec 28, 2022
Lightweight, Python library for fast and reproducible experimentation :microscope:

Steppy What is Steppy? Steppy is a lightweight, open-source, Python 3 library for fast and reproducible experimentation. Steppy lets data scientist fo

minerva.ml 134 Jul 10, 2022
一个多模态内容理解算法框架,其中包含数据处理、预训练模型、常见模型以及模型加速等模块。

Overview 架构设计 插件介绍 安装使用 框架简介 方便使用,支持多模态,多任务的统一训练框架 能力列表: bert + 分类任务 自定义任务训练(插件注册) 框架设计 框架采用分层的思想组织模型训练流程。 DATA 层负责读取用户数据,根据 field 管理数据。 Parser 层负责转换原

Tencent 265 Dec 22, 2022
PyTorch code for JEREX: Joint Entity-Level Relation Extractor

JEREX: "Joint Entity-Level Relation Extractor" PyTorch code for JEREX: "Joint Entity-Level Relation Extractor". For a description of the model and exp

LAVIS - NLP Working Group 50 Dec 01, 2022
FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery by Ailong Ma, Junjue Wang*, Yanfei Zhon

Kingdrone 43 Jan 05, 2023
[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021

Pedestron Pedestron is a MMdetection based repository, that focuses on the advancement of research on pedestrian detection. We provide a list of detec

Irtiza Hasan 594 Jan 05, 2023
Easy genetic ancestry predictions in Python

ezancestry Easily visualize your direct-to-consumer genetics next to 2500+ samples from the 1000 genomes project. Evaluate the performance of a custom

Kevin Arvai 38 Jan 02, 2023
Locationinfo - A script helps the user to show network information such as ip address

Description This script helps the user to show network information such as ip ad

Roxcoder 1 Dec 30, 2021
Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process

Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process, a complete algorithm library is esta

Fu Pengyou 50 Jan 07, 2023
Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21. We optimized wind turbine placement in a wind farm, subject to wake effects, using Q-learni

Manasi Sharma 2 Sep 27, 2022
Exploring the link between uncertainty estimates obtained via "exact" Bayesian inference and out-of-distribution (OOD) detection.

Uncertainty-based OOD detection Exploring the link between uncertainty estimates obtained by "exact" Bayesian inference and out-of-distribution (OOD)

Christian Henning 1 Nov 05, 2022
An evaluation toolkit for voice conversion models.

Voice-conversion-evaluation An evaluation toolkit for voice conversion models. Sample test pair Generate the metadata for evaluating models. The direc

30 Aug 29, 2022
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

Descript 240 Dec 13, 2022
PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

WuJinxuan 144 Dec 26, 2022
WRENCH: Weak supeRvision bENCHmark

🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development

Jieyu Zhang 176 Dec 28, 2022
Python Implementation of algorithms in Graph Mining, e.g., Recommendation, Collaborative Filtering, Community Detection, Spectral Clustering, Modularity Maximization, co-authorship networks.

Graph Mining Author: Jiayi Chen Time: April 2021 Implemented Algorithms: Network: Scrabing Data, Network Construbtion and Network Measurement (e.g., P

Jiayi Chen 3 Mar 03, 2022
ZEBRA: Zero Evidence Biometric Recognition Assessment

ZEBRA: Zero Evidence Biometric Recognition Assessment license: LGPLv3 - please reference our paper version: 2020-06-11 author: Andreas Nautsch (EURECO

Voice Privacy Challenge 2 Dec 12, 2021
Learning Open-World Object Proposals without Learning to Classify

Learning Open-World Object Proposals without Learning to Classify Pytorch implementation for "Learning Open-World Object Proposals without Learning to

Dahun Kim 149 Dec 22, 2022
Pytorch Implementation of paper "Noisy Natural Gradient as Variational Inference"

Noisy Natural Gradient as Variational Inference PyTorch implementation of Noisy Natural Gradient as Variational Inference. Requirements Python 3 Pytor

Tony JiHyun Kim 119 Dec 02, 2022
code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

Facebook Research 94 Oct 26, 2022