mkultra

mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.

Prompt tuning injects a string of 20-100 special tokens into the context to influence text generation. These tokens are trained on a corpus much like a finetune, but take up a fraction of the space: the Neuromancer example is only 401 KB for 100 tokens.
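
To put "a fraction of the space" into rough numbers, here is a back-of-the-envelope sketch. It assumes a soft prompt is stored as one float32 embedding vector per token and uses the published GPT-2 small dimensions (hidden size 768, roughly 124M parameters). The helper names are hypothetical and are not part of mkultra, and the size of the file on disk also depends on how it is serialized.

```python
# Back-of-the-envelope size comparison between a soft prompt and a full model.
# Assumptions (not taken from mkultra): GPT-2 small, float32 values.

HIDDEN_SIZE = 768            # embedding width of GPT-2 small
MODEL_PARAMS = 124_000_000   # approximate parameter count of GPT-2 small
BYTES_PER_FLOAT32 = 4


def soft_prompt_params(n_tokens: int, hidden_size: int = HIDDEN_SIZE) -> int:
    """Number of trainable values: one embedding vector per soft token."""
    return n_tokens * hidden_size


def raw_size_bytes(n_params: int) -> int:
    """Size of the raw float32 values, ignoring any file-format overhead."""
    return n_params * BYTES_PER_FLOAT32


if __name__ == "__main__":
    for n_tokens in (20, 100):
        params = soft_prompt_params(n_tokens)
        share = params / MODEL_PARAMS
        print(f"{n_tokens} tokens: {params} values, "
              f"{raw_size_bytes(params) / 1024:.0f} KiB raw, "
              f"{share:.4%} of the model's parameters")
```

Under these assumptions, the trained values of a 100-token prompt come to well under a tenth of a percent of the model's parameter count.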

Read the original paper: https://arxiv.org/abs/2104.08691

Text Generation

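from transformers import pipeline
# GPT2SoftPromptLM, GPT2SPTokenizerFast and SoftPrompt are provided by mkultra;
# import them from the installed package before running this snippet.

# Load the soft-prompt-aware model and tokenizer on top of the stock GPT-2 weights.
model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to ordinary text.
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"
output = generator(prompt)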

SoftPrompts can be concatenated at any point into your context as if they were strings. When the context is printed, SoftPrompts show up as human-readable tags for debugging. They also tokenize to the underlying number of tokens for easy budgeting.
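
The behaviour described above can be pictured with a toy stand-in. This is a minimal sketch, not mkultra's implementation: `ToySoftPrompt`, its tag format, and the whitespace-based `toy_token_count` are all hypothetical names invented for illustration. It only shows how an object can concatenate like a string, print as a readable tag, and still count as its full token length.

```python
# Illustrative only: a toy stand-in, NOT mkultra's SoftPrompt implementation.
import re


class ToySoftPrompt:
    """Hypothetical soft prompt that behaves like a string tag."""

    def __init__(self, name: str, n_tokens: int):
        self.name = name
        self.n_tokens = n_tokens

    def tag(self) -> str:
        # Human-readable placeholder that appears when the context is printed.
        return f"<{self.name}:{self.n_tokens}>"

    def __add__(self, other: str) -> str:
        # Supports `sp + "text"`.
        return self.tag() + other

    def __radd__(self, other: str) -> str:
        # Supports `"text" + sp`.
        return other + self.tag()


# Matches tags of the (made-up) form <name:count> and captures the count.
TAG_PATTERN = re.compile(r"<[^<>:]+:(\d+)>")


def toy_token_count(context: str) -> int:
    """Each tag costs its declared length; plain text is split on whitespace."""
    soft = sum(int(n) for n in TAG_PATTERN.findall(context))
    text = TAG_PATTERN.sub(" ", context)
    return soft + len(text.split())


sp = ToySoftPrompt("neuromancer", 100)
context = "Chapter 1. " + sp + "The sky over the port"
print(context)
print(toy_token_count(context))
```

A real tokenizer counts subword tokens rather than whitespace-separated words, so only the budgeting idea carries over: the tag is short on screen but is charged at the soft prompt's full length.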

See the text generation notebook for pointers on adding mkultra to your generator.

Training

For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.

For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.

Limitations (for now)

The Huggingface Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer, but it will still save full model checkpoints.
mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
