Code for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators"

Overview

Query Variation Generators

This repository contains the code and annotation data for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators".

Setup

Install the requirements using

pip install -r requirements.txt

Steps to reproduce the results

First we need to generate_weak supervsion for the desired test sets. We can do that with the scripts/generate_weak_supervision.py. In the paper we test for TREC-DL ('msmarco-passage/trec-dl-2019/judged') and ANTIQUE ('antique/train/split200-valid'), but any IR-datasets (https://ir-datasets.com/index.html) can be used here (as TASK).

python ${REPO_DIR}/examples/generate_weak_supervision.py 
    --task $TASK \
    --output_dir $OUT_DIR 

This will generate one query variation for each method for the original queries. After this, we manually annotated the query variations generated, in order to keep only valid ones for analysis. For that we use analyze_weak_supervision.py (prepares data for manual anotation) and analyze_auto_query_generation_labeling.py (combines auto labels and anotations.).

However, for reproducing the results we can directly use the annotated query set to test neural ranking models robustness (RQ1):

python ${REPO_DIR}/disentangled_information_needs/evaluation/query_rewriting.py \
        --task 'irds:msmarco-passage/trec-dl-2019/judged' \
        --output_dir $OUT_DIR/ \
        --variations_file $OUT_DIR/$VARIATIONS_FILE_TREC_DL \
        --retrieval_model_name "BM25+KNRM" \
        --train_dataset "irds:msmarco-passage/train" \
        --max_iter $MAX_ITER

by using the annotated variations file directly here "$OUT_DIR/$VARIATIONS_FILE_TREC_DL". The same can be done to run rank fusion (RQ2) by replacing query_rewriting.py with rank_fusion.py.

The scripts evaluate_weak_supervision.sh and evaluate_rank_fusion.sh run all models and datasets for both research questions . The first generates the main table of results, Table 4 in the paper, and the second generates the tables for the rank fusion experiments (only available in the Arxiv version of the paper).

Modules and Folders

  • scripts: Contain most of the analysis scripts and also commands to run entire experiments.
  • examples: Contain an example on how to generate query variations.
  • disentangled_information_needs/evaluation: Scripts to evaluate robustness of models for query variations and also to evaluate rank fusion of query variations.
  • disentangled_information_needs/transformations: Methods to generate query variations.
Owner
Gustavo Penha
Researcher - IR - RecSys - ML - NLP. https://linktr.ee/guzpenha
Gustavo Penha
A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

A rough implementation of the paper "A Steering Algorithm for Redirected Walking Using Reinforcement Learning"

Somnus `Chen 2 Jun 09, 2022
A simple implementation of Kalman filter in Multi Object Tracking

kalman Filter in Multi-object Tracking A simple implementation of Kalman filter in Multi Object Tracking 本实现是在https://github.com/liuchangji/kalman-fil

124 Dec 29, 2022
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022
Official implementation of "Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks", NeurIPS 2021.

PHDimGeneralization Official implementation of "Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks", NeurIPS 2021. Overvie

Tolga Birdal 13 Nov 08, 2022
Code of TIP2021 Paper《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet and Pytorch versions.

SFace Code of TIP2021 Paper 《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet, PyTorch and Jittor versi

Zhong Yaoyao 47 Nov 25, 2022
Real Time Object Detection and Classification using Yolo Algorithm.

Real time Object detection & Classification using YOLO algorithm. Real Time Object Detection and Classification using Yolo Algorithm. What is Object D

Ketan Chawla 1 Apr 17, 2022
Mercury: easily convert Python notebook to web app and share with others

Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad

MLJAR 2.2k Dec 27, 2022
Unofficial pytorch implementation of 'Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization'

pytorch-AdaIN This is an unofficial pytorch implementation of a paper, Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization [Hua

Naoto Inoue 873 Jan 06, 2023
TC-GNN with Pytorch integration

TC-GNN (Running Sparse GNN on Dense Tensor Core on Ampere GPU) Cite this project and paper. @inproceedings{TC-GNN, title={TC-GNN: Accelerating Spars

YUKE WANG 19 Dec 01, 2022
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Sun Yi 201 Nov 21, 2022
Easily benchmark PyTorch model FLOPs, latency, throughput, max allocated memory and energy consumption

⏱ pytorch-benchmark Easily benchmark model inference FLOPs, latency, throughput, max allocated memory and energy consumption Install pip install pytor

Lukas Hedegaard 21 Dec 22, 2022
Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition"

RandWireNN Unofficial PyTorch Implementation of: Exploring Randomly Wired Neural Networks for Image Recognition. Results Validation result on Imagenet

Seung-won Park 684 Nov 02, 2022
FwordCTF 2021 Infrastructure and Source code of Web/Bash challenges

FwordCTF 2021 You can find here the source code of the challenges I wrote (Web and Bash) in FwordCTF 2021 and the source code of the platform with our

Kahla 5 Nov 25, 2022
The project covers common metrics for super-resolution performance evaluation.

Super-Resolution Performance Evaluation Code The project covers common metrics for super-resolution performance evaluation. Metrics support The script

xmy 10 Aug 03, 2022
TrTr: Visual Tracking with Transformer

TrTr: Visual Tracking with Transformer We propose a novel tracker network based on a powerful attention mechanism called Transformer encoder-decoder a

趙 漠居(Zhao, Moju) 66 Dec 27, 2022
Implement of homography net by pytorch

HomographyNet Implement of homography net by pytorch Brief Introduction This project is based on the work Homography-Net: @article{detone2016deep, t

ronghao_CN 4 May 19, 2022
Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.

CLIP-Guided-Diffusion Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab. Original colab notebooks by Ka

Nerdy Rodent 336 Dec 09, 2022
《Dual-Resolution Correspondence Network》(NeurIPS 2020)

Dual-Resolution Correspondence Network Dual-Resolution Correspondence Network, NeurIPS 2020 Dependency All dependencies are included in asset/dualrcne

Active Vision Laboratory 45 Nov 21, 2022
PyTorch implementation of some learning rate schedulers for deep learning researcher.

pytorch-lr-scheduler PyTorch implementation of some learning rate schedulers for deep learning researcher. Usage WarmupReduceLROnPlateauScheduler Visu

Soohwan Kim 59 Dec 08, 2022
Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization This repository contains the source code for the paper (link wi

Rakuten Group, Inc. 0 Nov 19, 2021