Code for our SIGIR 2022 accepted paper : P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

Related tags

DocumentationP3Ranker
Overview

P3 Ranker

Implementation for our SIGIR2022 accepted paper:

P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

Project Structures

├── commands
│   ├── bert.sh
│   ├── p3ranker.sh
│   ├── prop_ft.sh
│   ├── roberta.sh
│   └── t5v11.sh
├── Prefinetune
│   ├── mnli_dataloader.py
│   ├── mnli_dataset.py
│   ├── mnli_model.py
│   ├── README.md
│   ├── train_mnli.sh
│   ├── train_nq.sh
│   ├── train.py
│   └── utils.py
├── src
│   ├── data
│   │    ├── datasets
│   │    │   ├── __init__.py
│   │    │   ├── bert_dataset.py
│   │    │   ├── bertmaxp_dataset.py
│   │    │   ├── dataset.py
│   │    │   ├── edrm_dataset.py
│   │    │   ├── roberta_dataset.py
│   │    │   └── t5_dataset.py
│   │    └── tokenizers
│   │        ├── __init__.py
│   │        ├── tokenizer.py
│   │        └── word_tokenizer.py
│   ├── extractors
│   │    ├── __init__.py
│   │    └── classic_extractor.py
│   ├── metrics
│   │    ├── __init__.py
│   │    └── metric.py
│   ├── models
│   │    ├── __init__.py
│   │    ├── bert_maxp.py
│   │    ├── bert_prompt_.py
│   │    ├── bert.py
│   │    ├── conv_knrm.py
│   │    ├── edrm.py
│   │    ├── knrm.py
│   │    ├── t5.py
│   │    └── tk.py
│   ├── modules
│   │    ├── attentons
│   │    │   ├── __init__.py
│   │    │   ├── multi_head_attention.py
│   │    │   └── scaled_dot_product_attention.py
│   │    ├── embedders
│   │    │   ├── __init__.py
│   │    │   └── embedder.py
│   │    ├── encoders
│   │    │   ├── __init__.py
│   │    │   ├── cnn_encoder.py
│   │    │   ├── feed_forward_encoder.py
│   │    │   ├── positional_encoder.py
│   │    │   └── transformer_encoder.py
│   │    └── matchers
│   │        ├── __init__.py
│   │        └── kernel_matcher.py
│   ├── __init__.py
│   └── utils.py
├── README.md
├── requirements.txt
├── train.py
└── utils.py 

Prerequisites

Install dependencies:

git clone https://github.com/NEUIR/P3Ranker.git
cd P3-Rankers
pip install -r requirements.txt

Data Preparation

We will release our few-shot dataset soon.

Prompt Generation

Details about the Discrete Prompt Generation can be find in https://github.com/princeton-nlp/LM-BFF and our paper

Prefinetune

cd Reproduce

And you will find how to do prefinetune.

Reproduce our results

Directly run the scripts we stored in './commands' can reproduce our results. One example is shown below:

bash commands/bert.sh 5

The above command is for reproducing results in our 5-q few-shot scenarios mentioned in our paper.

Contact

Please send email to [email protected].

The tutorial is a collection of many other resources and my own notes

Why we need CTC? --- looking back on history 1.1. About CRNN 1.2. from Cross Entropy Loss to CTC Loss Details about CTC 2.1. intuition: forward algor

手写AI 7 Sep 19, 2022
🐱‍🏍 A curated list of awesome things related to Hugo themes.

awesome-hugo-themes Automated deployment @ 2021-10-12 06:24:07 Asia/Shanghai &sorted=updated Theme Author License GitHub Stars Updated Blonde wamo MIT

13 Dec 12, 2022
An awesome Data Science repository to learn and apply for real world problems.

AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s

Academic.io 20.3k Jan 09, 2023
FireEye Related Projects

FireEye FireEye Related Projects Tor-IP-Collector Simple python script that will collect a list of TOR IPs from the SecOps Institute Github and inject

Taran Ulrich 2 Nov 12, 2022
A clean customizable documentation theme for Sphinx

A clean customizable documentation theme for Sphinx

Pradyun Gedam 1.5k Jan 06, 2023
Python code for working with NFL play by play data.

nfl_data_py nfl_data_py is a Python library for interacting with NFL data sourced from nflfastR, nfldata, dynastyprocess, and Draft Scout. Includes im

82 Jan 05, 2023
Loudchecker - Python script to check files for earrape

loudchecker python script to check files for earrape automatically installs depe

1 Jan 22, 2022
Your Project with Great Documentation.

Read Latest Documentation - Browse GitHub Code Repository The only thing worse than documentation never written, is documentation written but never di

Timothy Edmund Crosley 809 Dec 28, 2022
Compare two CSV files for differences. Colorize the differences and align the columns.

pretty-csv-diff Compare two CSV files for differences. Colorize the differences and align the columns. Command-Line Example Command-Line Usage usage:

Devon 6 Dec 29, 2022
Preview title and other information about links sent to chats.

Link Preview A small plugin for Nicotine+ to display preview information like title and description about links sent in chats. Plugin created with Nic

Nick 0 Sep 05, 2021
AiiDA plugin for the HyperQueue metascheduler.

aiida-hyperqueue WARNING: This plugin is still in heavy development. Expect bugs to pop up and the API to change. AiiDA plugin for the HyperQueue meta

AiiDA team 3 Jun 19, 2022
The source code that powers readthedocs.org

Welcome to Read the Docs Purpose Read the Docs hosts documentation for the open source community. It supports Sphinx docs written with reStructuredTex

Read the Docs 7.4k Dec 25, 2022
python wrapper for simple-icons

simpleicons Use a wide-range of icons derived from the simple-icons repo in python. Go to their website for a full list of icons. The slug version mus

Sachin Raja 14 Nov 07, 2022
Some custom tweaks to the results produced by pytkdocs.

pytkdocs_tweaks Some custom tweaks for pytkdocs. For use as part of the documentation-generation-for-Python stack that comprises mkdocs, mkdocs-materi

Patrick Kidger 4 Nov 24, 2022
level2-data-annotation_cv-level2-cv-15 created by GitHub Classroom

[AI Tech 3기 Level2 P Stage] 글자 검출 대회 팀원 소개 김규리_T3016 박정현_T3094 석진혁_T3109 손정균_T3111 이현진_T3174 임종현_T3182 Overview OCR (Optimal Character Recognition) 기술

6 Jun 10, 2022
An open-source script written in python just for fun

Owersite Owersite is an open-source script written in python just for fun. It do

大きなペニスを持つ少年 7 Sep 21, 2022
Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

time-series-kafka-demo Mock stream producer for time series data using Kafka. I walk through this tutorial and others here on GitHub and on my Medium

Maria Patterson 26 Nov 15, 2022
Second version of SQL-PYTHON-Practicas

SQLite-Python Acerca de | Autor Sobre el repositorio Segunda version de SQL-PYTHON-Practicas 💻 Tecnologias Visual Studio Code Python SQLite3 📖 Requi

1 Jan 06, 2022
ACPOA plugin creation helper

ACPOA Plugin What is ACPOA ACPOA is the acronym for "Application Core for Plugin Oriented Applications". It's a tool to create flexible and extendable

Leikt Sol'Reihin 1 Oct 20, 2021