OpenChat: Opensource chatting framework for generative models

Overview

OpenChat: Opensource chatting framework for generative models

    ____   ____   ______ _   __   ______ __  __ ___   ______
   / __ \ / __ \ / ____// | / /  / ____// / / //   | /_  __/
  / / / // /_/ // __/  /  |/ /  / /    / /_/ // /| |  / /   
 / /_/ // ____// /___ / /|  /  / /___ / __  // ___ | / /    
 \____//_/    /_____//_/ |_/   \____//_/ /_//_/  |_|/_/     
  • OpenChat is opensource chatting framework for generative models.
  • You can talk with AI with only one line of code.



Installation

pip install openchat



Model Supports

  • DialoGPT : [small, medium, large]
  • BlenderBot : [small, medium, large, xlarge]
  • Coming Soon...



Usage

1. Terminal environment

  • Chatting on terminal.
  • Type .exit if you want to exit dialogue.
  • Type .clear if you want to clear all histories.
  • You can add additional user keywords. (Details are described below.)
>>> from openchat import OpenChat
>>> OpenChat(model="blenderbot", size="large")
Chat with facebook/blenderbot-1B-distill !
user : Hi. It is nice day.
bot : It is a nice day out. I am going to go for a walk. What are you up to?
user : I'll play computer game. Do you like playing game?
bot : I love playing video games. What kind of games do you like to play on your computer?
user : I love racing game. It is so exited. How about you?
bot : I love racing games as well. I like to play on the Playstation 4. What about you?
user : Usually I like to play on PC. I have a question. Who is CEO of Apple?
bot : The CEO of Apple is Steve Jobs. He founded the company in 1976.
user : That's right. I have an IPhone. It is my favorite phone. What is your phone?
bot : I have a Samsung Galaxy S8. I love it. It's the best phone I've ever had.
user : .exit
bot : good bye.

  • If you want to use GPU, use argument device="cuda"
>>> from openchat import OpenChat
>>> OpenChat(model="blenderbot", size="large", device="cuda")

2. Your own environment (not terminal)

  • You can use user id to manage user-specific history.
  • This can be useful when deployed on Facebook messenger or WhatsApp.
  • There is a web demo implementation in the /demo folder.


2.1. Write your own environment class

  • Make your own environment class inherited from BaseEnv
  • And implement your own run(model: BaseModel) method like below.
from typing import Dict
from flask import Flask, render_template
from flask_cors import CORS
from openchat.envs import BaseEnv
from openchat.models import BaseModel


class WebDemoEnv(BaseEnv):

    def __init__(self):
        super().__init__()
        self.app = Flask(__name__)
        CORS(self.app)

    def run(self, model: BaseModel):

        @self.app.route("/")
        def index():
            return render_template("index.html", title=model.name)

        @self.app.route('/send//', methods=['GET'])
        def send(user_id, text: str) -> Dict[str, str]:

            if text in self.keywords:
                # Format of self.keywords dictionary
                # self.keywords['/exit'] = (exit_function, 'good bye.')

                _out = self.keywords[text][1]
                # text to print when keyword triggered

                self.keywords[text][0](user_id, text)
                # function to operate when keyword triggered

            else:
                _out = model.predict(user_id, text)

            return {"output": _out}

        self.app.run(host="0.0.0.0", port=8080)

2.2. Start to run application.

from openchat import OpenChat
from demo.web_demo_env import WebDemoEnv

OpenChat(model="blenderbot", size="large", env=WebDemoEnv())



3. Additional Options

3.1. Add custom Keywords

  • You can add new manual keyword such as .exit, .clear,
  • call the self.add_keyword('.new_keyword', 'message to print', triggered_function)' method.
  • triggered_function should be form of function(user_id:str, text:str)
from openchat.envs import BaseEnv


class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        self.add_keyword(".new_keyword", "message to print", self.function)

    def function(self, user_id: str, text: str):
        """do something !"""
        



3.2. Modify generation options

  • You can modify max_context_length (number of input history tokens, default is 128).
>>> OpenChat(size="large", device="cuda", max_context_length=256)

  • You can modify generation options ['num_beams', 'top_k', 'top_p'].
>>> model.predict(
...     user_id="USER_ID",
...     text="Hello.",
...     num_beams=5,
...     top_k=20,
...     top_p=0.8,
... )



3.3. Check histories

  • You can check all dialogue history using self.histories
from openchat.envs import BaseEnv


class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        print(self.histories)
{
    user_1 : {'user': [] , 'bot': []},
    user_2 : {'user': [] , 'bot': []},
    ...more...
    user_n : {'user': [] , 'bot': []},
}

3.4. Clear histories

  • You can clear all dialogue histories
from flask import Flask
from openchat.envs import BaseEnv
from openchat.models import BaseModel

class YourOwnEnv(BaseEnv):
    
    def __init__(self):
        super().__init__()
        self.app = Flask(__name__)

    def run(self, model: BaseModel):
        
        @self.app.route('/send//', methods=['GET'])
        def send(user_id, text: str) -> Dict[str, str]:
            
            self.clear(user_id, text)
            # clear all histories ! 



License

Copyright 2021 Hyunwoong Ko.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
Hyunwoong Ko
Co-Founder and Research Engineer at @tunib-ai. previously @kakaobrain.
Hyunwoong Ko
Translates basic English sentences into the Huna language (hoo-NAH)

huna-translator The Huna Language Translates basic English sentences into the Huna language (hoo-NAH). The Huna constructed language was developed in

Miles Smith 0 Jan 20, 2022
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Jan 08, 2023
LSTC: Boosting Atomic Action Detection with Long-Short-Term Context

LSTC: Boosting Atomic Action Detection with Long-Short-Term Context This Repository contains the code on AVA of our ACM MM 2021 paper: LSTC: Boosting

Tencent YouTu Research 9 Oct 11, 2022
KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark.

KLUE Baseline Korean(한국어) KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark. See our paper fo

74 Dec 13, 2022
This is the source code of RPG (Reward-Randomized Policy Gradient)

RPG (Reward-Randomized Policy Gradient) Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu (

40 Nov 25, 2022
📝An easy-to-use package to restore punctuation of the text.

✏️ rpunct - Restore Punctuation This repo contains code for Punctuation restoration. This package is intended for direct use as a punctuation restorat

Daulet Nurmanbetov 72 Dec 30, 2022
NLP-based analysis of poor Chinese movie reviews on Douban

douban_embedding 豆瓣中文影评差评分析 1. NLP NLP(Natural Language Processing)是指自然语言处理,他的目的是让计算机可以听懂人话。 下面是我将2万条豆瓣影评训练之后,随意输入一段新影评交给神经网络,最终AI推断出的结果。 "很好,演技不错

3 Apr 15, 2022
Code for the paper "Language Models are Unsupervised Multitask Learners"

Status: Archive (code is provided as-is, no updates expected) gpt-2 Code and models from the paper "Language Models are Unsupervised Multitask Learner

OpenAI 16.1k Jan 08, 2023
Study German declensions (dER nettE Mann, ein nettER Mann, mit dEM nettEN Mann, ohne dEN nettEN Mann ...) Generate as many exercises as you want using the incredible power of SPACY!

Study German declensions (dER nettE Mann, ein nettER Mann, mit dEM nettEN Mann, ohne dEN nettEN Mann ...) Generate as many exercises as you want using the incredible power of SPACY!

Hans Alemão 4 Jul 20, 2022
Poetry PEP 517 Build Backend & Core Utilities

Poetry Core A PEP 517 build backend implementation developed for Poetry. This project is intended to be a light weight, fully compliant, self-containe

Poetry 293 Jan 02, 2023
A library for end-to-end learning of embedding index and retrieval model

Poeem Poeem is a library for efficient approximate nearest neighbor (ANN) search, which has been widely adopted in industrial recommendation, advertis

54 Dec 21, 2022
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Antlr Project 13.6k Jan 05, 2023
LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation Tasks | Datasets | LongLM | Baselines | Paper Introduction LOT is a ben

46 Dec 28, 2022
Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog This repository contains the dataset, source code and trained model for the following pa

172 Dec 13, 2022
Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper

Data Augmentation using Pre-trained Transformer Models Code associated with the Data Augmentation using Pre-trained Transformer Models paper Code cont

44 Dec 31, 2022
Sample data associated with the Aurora-BP study

The Aurora-BP Study and Dataset This repository contains sample code, sample data, and explanatory information for working with the Aurora-BP dataset

Microsoft 16 Dec 12, 2022
A simple Flask site that allows users to create, update, and delete posts in a database, as well as perform basic NLP tasks on the posts.

A simple Flask site that allows users to create, update, and delete posts in a database, as well as perform basic NLP tasks on the posts.

Ian 1 Jan 15, 2022
glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g

Rhasspy 8 Dec 25, 2022
Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products

Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products.

Leah Pathan Khan 2 Jan 12, 2022
MPNet: Masked and Permuted Pre-training for Language Understanding

MPNet MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-tr

Microsoft 228 Nov 21, 2022