The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Overview

Easy-to-use toolkit for retrieval-based Chatbot

Our released data can be found at this link. Make sure the following steps are adopted to use our codes.

How to Use

  1. Init the repo

    Before using the repo, please run the following command to init:

    # create the necessay folders
    python init.py
    
    # prepare the environment
    # if some package cannot be installed, just google and install it from other ways
    pip install -r requirements.txt
  2. train the model

    ./scripts/train.sh <dataset_name> <model_name> <cuda_ids>
  3. test the model [rerank]

    ./scripts/test_rerank.sh <dataset_name> <model_name> <cuda_id>
  4. test the model [recal]

    # different recall_modes are available: q-q, q-r
    ./scripts/test_recall.sh <dataset_name> <model_name> <cuda_id>
  5. inference the responses and save into the faiss index

    Somethings inference will missing data samples, please use the 1 gpu (faiss-gpu search use 1 gpu quickly)

    It should be noted that: 1. For writer dataset, use extract_inference.py script to generate the inference.txt 2. For other datasets(douban, ecommerce, ubuntu), just cp train.txt inference.txt. The dataloader will automatically read the test.txt to supply the corpus.

    # work_mode=response, inference the response and save into faiss (for q-r matching) [dual-bert/dual-bert-fusion]
    # work_mode=context, inference the context to do q-q matching
    # work_mode=gray, inference the context; read the faiss(work_mode=response has already been done), search the topk hard negative samples; remember to set the BERTDualInferenceContextDataloader in config/base.yaml
    ./scripts/inference.sh <dataset_name> <model_name> <cuda_ids>

    If you want to generate the gray dataset for the dataset:

    # 1. set the mode as the **response**, to generate the response faiss index; corresponding dataset name: BERTDualInferenceDataset;
    ./scripts/inference.sh <dataset_name> response <cuda_ids>
    
    # 2. set the mode as the **gray**, to inference the context in the train.txt and search the top-k candidates as the gray(hard negative) samples; corresponding dataset name: BERTDualInferenceContextDataset
    ./scripts/inference.sh <dataset_name> gray <cuda_ids>
    
    # 3. set the mode as the **gray-one2many** if you want to generate the extra positive samples for each context in the train set, the needings of this mode is the same as the **gray** work mode
    ./scripts/inference.sh <dataset_name> gray-one2many <cuda_ids>

    If you want to generate the pesudo positive pairs, run the following commands:

    # make sure the dual-bert inference dataset name is BERTDualInferenceDataset
    ./scripts/inference.sh <dataset_name> unparallel <cuda_ids>
  6. deploy the rerank and recall model

    # load the model on the cuda:0(can be changed in deploy.sh script)
    ./scripts/deploy.sh <cuda_id>

    at the same time, you can test the deployed model by using:

    # test_mode: recall, rerank, pipeline
    ./scripts/test_api.sh <test_mode> <dataset>
  7. test the recall performance of the elasticsearch

    Before testing the es recall, make sure the es index has been built:

    # recall_mode: q-q/q-r
    ./scripts/build_es_index.sh <dataset_name> <recall_mode>
    # recall_mode: q-q/q-r
    ./scripts/test_es_recall.sh <dataset_name> <recall_mode> 0
  8. simcse generate the gray responses

    # train the simcse model
    ./script/train.sh <dataset_name> simcse <cuda_ids>
    # generate the faiss index, dataset name: BERTSimCSEInferenceDataset
    ./script/inference_response.sh <dataset_name> simcse <cuda_ids>
    # generate the context index
    ./script/inference_simcse_response.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_simcse_unlikelyhood_response.sh <dataset_name> simcse <cuda_ids>
    # generate the gray response
    ./script/inference_gray_simcse.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_gray_simcse_unlikelyhood.sh <dataset_name> simcse <cuda_ids>
Owner
GMFTBY
Those who are crazy enough to think they can change the world are the ones who can.
GMFTBY
AWS SQS event redrive Lambda

This repository contains the Lambda function to redrive sqs events from source to destination queue while controlling maxRetry per event.

1 Oct 19, 2021
Instant messaging client in tkinter

Concord_client_tk Instant messaging client in tkinter Contributors : Ilade-s [https://github.com/Ilade-s] Doku [https://github.com/D0kuhebi] Descripti

Raphaël Merlet 2 Jun 15, 2022
Telegram bot for searching videos in your PDisk account by @AbirHasan2005

PDisk-Videos-Search A Telegram bot for searching videos in your PDisk account by @AbirHasan2005. Configs API_ID - Get from @TeleORG_Bot API_HASH - Get

Abir Hasan 39 Oct 21, 2022
Телеграм бот решающий задания ЦДЗ, написанный на библиотеке libmesh.

MESHBot-Telegram Телеграм бот решающий задания ЦДЗ. Описание: Бот написан с использованием библиотеки libmesh. Для начала работы отправьте ему ссылку

2 Jun 19, 2022
Turns any script into a telegram bot

pytobot Turns any script into a telegram bot Install pip install --upgrade pytobot Usage Script: while True: message = input() if message == "

Dmitry Kotov 17 Jan 06, 2023
SOLSEA-NFT-EXPLORE - Using Streamlit to build a simple UI on top of the Solana API

SOLSEA NFT Explorer Using Streamlit to build a simple UI on top of the Solana AP

Devin Capriola 3 Mar 19, 2022
Send notification to your telegram group/channel/private whenever a new video is uploaded on a youtube channel!

YouTube Feeds Bot. Send notification to your telegram group/channel/private whenever a new video is uploaded on a youtube channel! Variables BOT_TOKEN

Aditya 30 Dec 07, 2022
Exchange indicators & Basic functions for Binance API.

binance-ema Exchange indicators & Basic functions for Binance API. This python library has been written to calculate SMA, EMA, MACD etc. functions wit

Emre MENTEŞE 24 Jan 06, 2023
Huan Xu 1.6k Jan 04, 2023
Automatic login to Microsoft Teams conferences

Automatic login to Microsoft Teams conferences

Xhos 1 Jan 24, 2022
Online Marketplace API

Online Marketplace API Table of Contents Setup Instructions Documentation Setup instructions Make sure you have python installed Clone the repository

Kanat 3 Jul 13, 2022
数字货币动态趋势网格,随着行情变动。目前实盘月化10%。目前支持币安,未来上线火币、OKEX。

数字货币动态趋势网格,随着行情变动。目前实盘月化10%。目前支持币安,未来上线火币、OKEX。

幸福村的码农 98 Dec 27, 2022
NekoRobot-2 - Neko is An Anime themed advance Telegram group management bot.

NekoRobot A modular telegram Python bot running on python3 with an sqlalchemy, mongodb database. ╒═══「 Status 」 Maintained Support Group Included Free

Lovely Boy 19 Nov 12, 2022
Python library for the eWarehousing Solutions API.

eWarehousing Solutions Python Library This library provides convenient access to the eWarehousing Solutions API from applications written in the Pytho

eWarehousing Solutions 2 Nov 09, 2022
A Flask inspired, decorator based API wrapper for Python-Slack.

A Flask inspired, decorator based API wrapper for Python-Slack. About Tangerine is a lightweight Slackbot framework that abstracts away all the boiler

Nick Ficano 149 Jun 30, 2022
A file-based quote bot written in Python

Let's Write a Python Quote Bot! This repository will get you started with building a quote bot in Python. It's meant to be used along with the Learnin

1 Feb 03, 2022
Download apps and remove icloud

Download apps and remove icloud

SaltConf21: Adding Workflow Approval to Salt

SaltConf21: Adding Workflow Approval to Salt Running To run the example, install Docker and docker-compose and run the following commands: docker-comp

SSYS Sistemas 4 Nov 24, 2021
An Async Bot/API wrapper for Twitch made in Python.

TwitchIO is an asynchronous Python wrapper around the Twitch API and IRC, with a powerful command extension for creating Twitch Chat Bots. TwitchIO co

TwitchIO 590 Jan 03, 2023
A battle-tested Django 2.1 project template with configurations for AWS, Heroku, App Engine, and Docker.

For information on how to use this project template, check out the wiki. {{ project_name }} Table of Contents Requirements Local Setup Local Developme

Lionheart Software 64 Jun 15, 2022