Linear programming solver for paper-reviewer matching and mind-matching

Last update: Jul 05, 2022

Overview

Paper-Reviewer Matcher

A Python package for paper-reviewer matching based on topic modeling and linear programming. The algorithm is implemented based on this article. The package assigns papers to reviewers under constraints by solving a linear programming problem that minimizes the global distance between papers and reviewers in topic space (the topic model can be, e.g., principal component analysis or Latent Semantic Analysis (LSA)).

Here is a diagram of the problem setup and how we solve it.
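The formulation described above can be sketched in a few lines. This is a minimal illustration, not the package's own code: it uses scipy.optimize.linprog in place of or-tools, and the function names (cosine_distance, assign_reviewers) and the toy topic vectors are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def cosine_distance(A, B):
    """Pairwise cosine distance between the rows of A and the rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return 1.0 - A @ B.T


def assign_reviewers(dist, reviewers_per_paper, max_papers_per_reviewer):
    """Minimize total distance subject to per-paper and per-reviewer limits."""
    n_papers, n_reviewers = dist.shape
    # Decision variable x[i, j] (paper i, reviewer j), flattened row-major.
    cost = dist.ravel()
    # Each paper receives exactly `reviewers_per_paper` reviewers.
    A_eq = np.kron(np.eye(n_papers), np.ones((1, n_reviewers)))
    b_eq = np.full(n_papers, reviewers_per_paper)
    # Each reviewer receives at most `max_papers_per_reviewer` papers.
    A_ub = np.kron(np.ones((1, n_papers)), np.eye(n_reviewers))
    b_ub = np.full(n_reviewers, max_papers_per_reviewer)
    # Dual simplex returns a basic solution; for this transportation-type
    # constraint matrix with integer limits, that solution is already 0/1.
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs-ds")
    if not res.success:
        raise ValueError(res.message)
    return np.round(res.x).reshape(n_papers, n_reviewers).astype(int)


# Toy topic vectors (hypothetical): 3 papers and 3 reviewers in a 3-topic space.
papers = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
reviewers = np.array([[0.0, 0.1, 1.0], [1.0, 0.1, 0.0], [0.1, 1.0, 0.0]])

dist = cosine_distance(papers, reviewers)
assignment = assign_reviewers(dist, reviewers_per_paper=1,
                              max_papers_per_reviewer=1)
print(assignment)  # assignment[i, j] == 1 means reviewer j reviews paper i
```

In practice the topic vectors would come from a topic model fitted on abstracts, and further constraints such as conflicts of interest could be expressed by fixing the corresponding variables to zero.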

Mind-Match Command Line

Mind-Match is a session we run at the Cognitive Computational Neuroscience (CCN) conference. We use a combination of topic modeling and linear programming to solve the optimal matching problem. To run the example Mind-Match algorithm on a sample of 500 people, clone the repository and run the following

python mindmatch.py data/mindmatch_example.csv --n_match=6 --n_trim=50

in the root of this repo. This should produce a matching output file, output_match.csv, in the same location. However, when the number of people gets much larger, this script takes quite a long time to run. We pre-cluster people into groups before running mind-matching to make the script run faster. Below is an example script for pre-clustering and mind-matching on all data:

python mindmatch_cluster.py data/mindmatch_example.csv --n_match=6 --n_trim=50 --n_clusters=4
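The idea behind pre-clustering can be sketched as follows. This README does not say which clustering method mindmatch_cluster.py uses, so k-means (via scipy.cluster.vq.kmeans2) is an assumption here, as are the function name precluster and the toy data. The point is only that each group becomes a smaller, independent matching problem.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def precluster(topic_vectors, n_clusters):
    """Split the row indices of `topic_vectors` into groups with k-means."""
    X = np.asarray(topic_vectors, dtype=float)
    # Deterministic start: evenly spaced rows serve as the initial centroids.
    init_rows = np.linspace(0, len(X) - 1, n_clusters).astype(int)
    _, labels = kmeans2(X, X[init_rows], minit="matrix")
    return [np.flatnonzero(labels == c) for c in range(n_clusters)]


# Toy data (hypothetical): two tight groups of people in a 2-D topic space.
people = np.array([[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
                   [5.0, 5.1], [5.1, 5.0], [5.05, 5.05]])

groups = precluster(people, n_clusters=2)
for group in groups:
    # Each group of indices would be passed to the matcher on its own.
    print(group.tolist())
```

Note that every group needs more members than the requested number of matches per person, otherwise the per-group problem has no feasible solution.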

Example script for the conferences

Here, I include recent scripts for our Mind Matching session at the CCN conference.

ccn_mind_matching_2019.py contains the script for the Mind Matching session (matching scientists to scientists) at the CCN conference
ccn_paper_reviewer_matching.py contains the script for matching publications to reviewers for the CCN conference; see the example CSV files in the data folder

The code computes the topic distance matrix between incoming papers and reviewers (for ccn_paper_reviewer_matching.py) or between people and people (for ccn_mind_matching_2019.py). We trim the matrix so that the problem is not too big to solve using or-tools. It then solves a linear programming problem to assign the best matches, minimizing the global distance between papers and reviewers. After that, we produce output that can be used by the organizers of the CCN conference: pairs of papers and reviewers, or a mind-matching schedule between people at the conference. You can see how it works below.
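One possible way to trim a distance matrix is sketched below. It assumes that trimming means keeping only the nearest candidates in each row and excluding the rest from the optimization; the scripts in this repository may define their n_trim parameter differently. The function name trim_distance and the parameter n_keep are hypothetical.

```python
import numpy as np


def trim_distance(dist, n_keep):
    """Keep the `n_keep` smallest distances per row; mark the rest as excluded."""
    dist = np.asarray(dist, dtype=float)
    n_keep = min(n_keep, dist.shape[1])
    # Column indices of the n_keep smallest entries in every row.
    keep_idx = np.argpartition(dist, n_keep - 1, axis=1)[:, :n_keep]
    mask = np.zeros_like(dist, dtype=bool)
    np.put_along_axis(mask, keep_idx, True, axis=1)
    # Excluded pairs get an infinite distance so they are never selected.
    return np.where(mask, dist, np.inf), mask


# Toy distance matrix (hypothetical): 2 papers, 3 reviewers.
dist = np.array([[0.1, 0.9, 0.5],
                 [0.8, 0.2, 0.3]])
trimmed, mask = trim_distance(dist, n_keep=2)
print(trimmed)
print(int(mask.sum()), "candidate pairs kept out of", dist.size)
```

Only the pairs where mask is True would need a decision variable in the linear program, which is what keeps the problem small enough for the solver.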

Dependencies

Use pip to install dependencies

pip install -r requirements.txt

Please see Stack Overflow if you have a problem installing or-tools on macOS. You can use pip to install protobuf before installing or-tools:

pip install protobuf==3.0.0b4
pip install ortools

For Python 3.6, use

pip install --user --upgrade ortools

Citations

If you use Paper-Reviewer Matcher in your work or conference, please cite us as follows

@misc{achakulvisut2018,
    author = {Achakulvisut, Titipat and Acuna, Daniel E. and Kording, Konrad},
    title = {Paper-Reviewer Matcher},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/titipata/paper-reviewer-matcher}},
    commit = {9d346ee008e2789d34034c2b330b6ba483537674}
}

Members

Daniel Acuna (original author)
Titipat Achakulvisut (refactor)
Konrad Kording
