The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Overview

Easy-to-use toolkit for retrieval-based Chatbot

Our released data can be found at this link. Make sure the following steps are adopted to use our codes.

How to Use

  1. Init the repo

    Before using the repo, please run the following command to init:

    # create the necessay folders
    python init.py
    
    # prepare the environment
    # if some package cannot be installed, just google and install it from other ways
    pip install -r requirements.txt
  2. train the model

    ./scripts/train.sh <dataset_name> <model_name> <cuda_ids>
  3. test the model [rerank]

    ./scripts/test_rerank.sh <dataset_name> <model_name> <cuda_id>
  4. test the model [recal]

    # different recall_modes are available: q-q, q-r
    ./scripts/test_recall.sh <dataset_name> <model_name> <cuda_id>
  5. inference the responses and save into the faiss index

    Somethings inference will missing data samples, please use the 1 gpu (faiss-gpu search use 1 gpu quickly)

    It should be noted that: 1. For writer dataset, use extract_inference.py script to generate the inference.txt 2. For other datasets(douban, ecommerce, ubuntu), just cp train.txt inference.txt. The dataloader will automatically read the test.txt to supply the corpus.

    # work_mode=response, inference the response and save into faiss (for q-r matching) [dual-bert/dual-bert-fusion]
    # work_mode=context, inference the context to do q-q matching
    # work_mode=gray, inference the context; read the faiss(work_mode=response has already been done), search the topk hard negative samples; remember to set the BERTDualInferenceContextDataloader in config/base.yaml
    ./scripts/inference.sh <dataset_name> <model_name> <cuda_ids>

    If you want to generate the gray dataset for the dataset:

    # 1. set the mode as the **response**, to generate the response faiss index; corresponding dataset name: BERTDualInferenceDataset;
    ./scripts/inference.sh <dataset_name> response <cuda_ids>
    
    # 2. set the mode as the **gray**, to inference the context in the train.txt and search the top-k candidates as the gray(hard negative) samples; corresponding dataset name: BERTDualInferenceContextDataset
    ./scripts/inference.sh <dataset_name> gray <cuda_ids>
    
    # 3. set the mode as the **gray-one2many** if you want to generate the extra positive samples for each context in the train set, the needings of this mode is the same as the **gray** work mode
    ./scripts/inference.sh <dataset_name> gray-one2many <cuda_ids>

    If you want to generate the pesudo positive pairs, run the following commands:

    # make sure the dual-bert inference dataset name is BERTDualInferenceDataset
    ./scripts/inference.sh <dataset_name> unparallel <cuda_ids>
  6. deploy the rerank and recall model

    # load the model on the cuda:0(can be changed in deploy.sh script)
    ./scripts/deploy.sh <cuda_id>

    at the same time, you can test the deployed model by using:

    # test_mode: recall, rerank, pipeline
    ./scripts/test_api.sh <test_mode> <dataset>
  7. test the recall performance of the elasticsearch

    Before testing the es recall, make sure the es index has been built:

    # recall_mode: q-q/q-r
    ./scripts/build_es_index.sh <dataset_name> <recall_mode>
    # recall_mode: q-q/q-r
    ./scripts/test_es_recall.sh <dataset_name> <recall_mode> 0
  8. simcse generate the gray responses

    # train the simcse model
    ./script/train.sh <dataset_name> simcse <cuda_ids>
    # generate the faiss index, dataset name: BERTSimCSEInferenceDataset
    ./script/inference_response.sh <dataset_name> simcse <cuda_ids>
    # generate the context index
    ./script/inference_simcse_response.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_simcse_unlikelyhood_response.sh <dataset_name> simcse <cuda_ids>
    # generate the gray response
    ./script/inference_gray_simcse.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_gray_simcse_unlikelyhood.sh <dataset_name> simcse <cuda_ids>
Owner
GMFTBY
Those who are crazy enough to think they can change the world are the ones who can.
GMFTBY
FTP Anonymous Login

FTPAnon FTP Anonymous Login Install git clone https://github.com/SiThuTuntimehacker/FTPAnon cd FTPAnon bash install.sh access ftp sever " ftpaccess.tx

SiThuTun 3 Mar 23, 2022
This bot can mention members upto 10,000 in groups and can mention members upto 200 in channels !

Mention All Bot This bot can mention members upto 10,000 in groups and can mention members upto 200 in channels ! 🏷 Infomation Language: Python. Tele

Anjana Madu 52 Dec 29, 2022
quote is a python wrapper for the Goodreads Quote API, powered by gazpacho.

About quote is a python wrapper for the Goodreads Quote API, powered by gazpacho.

Max Humber 11 Nov 10, 2022
A small repository with convenience functions for working with the Notion API.

Welcome! Within this respository are a few convenience functions to assist with the pulling and pushing of data from the Notion API.

10 Jul 09, 2022
Sends messages to a Discord webhook whenever you make a new commit to your local git repository.

Git-Notif Sends messages to a Discord webhook whenever you make a new commit to your local git repository. Usage Just drop notifier.py into your git h

1 May 29, 2022
A heraldry-related bot, designed for the Heraldry Community.

Heraldtron A heraldry-related bot, designed for the Heraldry Community. Requirements Python 3.9+ discord.py aiohttp (comes installed with discord.py)

1 Mar 31, 2022
A wrapper for the Discord Python Pixels API.

DPYPX A simple wrapper around Python Discord Pixels. Requires Python 3.7+ (3.x where x = 7). Requires pillow and aiohttp from pip. Example import dpy

Artemis 3 Oct 01, 2022
A Python wrapper around the Pushbullet API to send different types of push notifications to your phone or/and computer.

pushbullet-python A Python wrapper around the Pushbullet API to send different types of push notifications to your phone or/and computer. Installation

Janu Lingeswaran 1 Jan 14, 2022
Photogrammetry Web API

OpenScanCloud Photogrammetry Web API Overview / Outline: The OpenScan Cloud is intended to be a decentralized, open and free photogrammetry web API. T

Thomas 86 Jan 05, 2023
Discord Bot written in Python that plays music in your voice channel

Discord Bot that plays music! I decided to create a simple Discord bot using Python in order to advance my coding skills. Please don't ask me for help

Eric Yeung 39 Jan 01, 2023
A discord program that will send a message to nearly every user in a discord server

Discord Mass DM Scrapes users from a discord server to promote/mass dm Report Bug · Request Feature Features Asynchronous Easy to use Free Auto scrape

dropout 56 Jan 02, 2023
A stable telegram bot to get restricted messages with custom thumbnail support

Save restricted content Bot A stable telegram bot to get restricted messages with custom thumbnail support

DEVANSH 3 Feb 09, 2022
Black-hat with python

black-hat_python Advantages - More advance tool Easy to use allows updating tool update - run bash update.sh Here -: Command to install tool main- clo

Hackers Tech 2 Feb 10, 2022
Github action for automatically determine the version for next release by using repository tags

This action will automatically determine the version for next release by using repository tags

Igor Gov 7 Oct 25, 2022
LyricsGenius: a Python client for the Genius.com API

LyricsGenius: a Python client for the Genius.com API lyricsgenius provides a simple interface to the song, artist, and lyrics data stored on Genius.co

KevinChunye 2 Jun 30, 2022
WIOpy - Walmart Affiliate API Python wrapper

WalmartIO Python Wrapper - WIOpy A python wrapper for the Walmart io API. Only s

6 Nov 14, 2022
Balanced API library in python.

Balanced Online Marketplace Payments v1.x requires Balanced API 1.1. Use v0.x for Balanced API 1.0. Installation pip install balanced Usage View Bala

Balanced 70 Oct 04, 2022
Python-based Snapchat score booster using pyautogui module

Snapchat Snapscore Botter Python-based Snapchat score booster using pyautogui module. Click here to report bugs. Usage Download ZIP here and extract t

477 Dec 31, 2022
Spotify playlist anonymizer.

Spotify heavily personalizes auto-generated playlists like Song Radio based on the music you've listened to in the past. But sometimes you want to listen to Song Radio precisely to hear some fresh so

Jakob de Maeyer 9 Nov 27, 2022