Fine-tune GPT-3 with a Google Chat conversation history

Last update: Dec 10, 2022

Google Chat GPT-3

This repo will help you fine-tune GPT-3 with a Google Chat conversation history. The trained model will be able to converse as one or both sides of the conversation in the participants' style.

Download your Chat archive from Google Takeout.
Locate the messages.json file of the conversation you would like to use as a training set.
Use the script to prepare data for training:

python preparer.py --messages <MESSAGES_FILE> --output <TRAINING_FILE>
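The kind of conversion this step performs can be sketched as follows. This is a minimal illustration, not the repo's actual preparer.py: it assumes each Takeout message looks like {"creator": {"name": ...}, "text": ...}, and the separator, stop sequence, and context size are hypothetical choices. The only fixed part is the output shape, one JSON object per line with "prompt" and "completion" keys, which is what the legacy fine-tuning format expects.

```python
import json

# Assumed separator and stop sequence; the repo's preparer.py may use different ones.
PROMPT_END = "\n\n###\n\n"
COMPLETION_END = " END"


def build_examples(messages, context_size=3):
    """Build prompt/completion pairs from chat messages.

    Each message is assumed to be a dict like
    {"creator": {"name": "Alice"}, "text": "hi"}; the preceding
    `context_size` messages form the prompt for the next one.
    """
    # Keep only messages that carry text (skips attachment-only entries).
    turns = [(m["creator"]["name"], m["text"]) for m in messages if m.get("text")]
    examples = []
    for i in range(1, len(turns)):
        context = turns[max(0, i - context_size):i]
        speaker, text = turns[i]
        history = "\n".join(f"{name}: {line}" for name, line in context)
        examples.append({
            # The prompt ends with the next speaker's name and the separator.
            "prompt": f"{history}\n{speaker}:{PROMPT_END}",
            # The completion starts with a space and ends with the stop sequence.
            "completion": f" {text}{COMPLETION_END}",
        })
    return examples


def write_jsonl(examples, path):
    """Write one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example, ensure_ascii=False) + "\n")
```

Because each prompt ends with a speaker's name, a model trained this way can be asked to answer as either participant by choosing which name closes the prompt.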

Test your training data with OpenAI's tool:

openai tools fine_tunes.prepare_data -f <TRAINING_FILE>

You should see: No remediations found.
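If the tool does report remediations, a quick local check can help narrow down which lines are at fault. The sketch below is not a substitute for OpenAI's tool and does not reproduce its rules; it only checks a few conventions that are assumptions here: every line is a JSON object with string "prompt" and "completion" fields, prompts end with a shared separator (the `prompt_end` default is a hypothetical choice), and completions start with a space.

```python
import json


def check_training_file(path, prompt_end="\n\n###\n\n"):
    """Return a list of human-readable problems found in a JSONL training file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {number}: not valid JSON")
                continue
            prompt = record.get("prompt") if isinstance(record, dict) else None
            completion = record.get("completion") if isinstance(record, dict) else None
            if not isinstance(prompt, str) or not isinstance(completion, str):
                problems.append(f"line {number}: needs string 'prompt' and 'completion'")
                continue
            if not prompt.endswith(prompt_end):
                problems.append(f"line {number}: prompt does not end with the separator")
            if not completion.startswith(" "):
                problems.append(f"line {number}: completion does not start with a space")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that OpenAI's tool will report no remediations.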

Fine-tune GPT-3 with your training data:

openai api fine_tunes.create -t <TRAINING_FILE>

You should see: Job complete! Status: succeeded 🎉. Don't forget to note the name of the model.

Try out your model in the Playground or with the CLI:

openai api completions.create -m

Owner

Nate Baer

Software engineer at Procore. CS bachelor's from RPI. Software engineering master's from UCI.
