A high-level yet extensible library for fast language model tuning via automatic prompt search

Last update: Dec 07, 2022

Overview

ruPrompts

ruPrompts is a high-level yet extensible library for fast language model tuning via automatic prompt search. It features integration with HuggingFace Hub, a configuration system powered by Hydra, and a command-line interface.

A prompt is a text instruction for a language model, such as:

Translate English to French:
cat =>

For some tasks the right prompt is obvious; for others it is not. With ruPrompts you define only the prompt format, such as {text}, and the prompt is then trained automatically for any task for which you have a training dataset.
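To make the idea of a prompt format concrete, here is a minimal sketch in plain Python. It does not use the ruPrompts API: it only fills a hand-written format with an input, which is the manual step that ruPrompts replaces with a trained prompt. The names MANUAL_FORMAT and build_prompt are hypothetical and chosen for illustration.

```python
# Hypothetical illustration, not part of ruPrompts:
# a hand-written prompt format with a {text} slot for the input.
MANUAL_FORMAT = "Translate English to French:\n{text} =>"


def build_prompt(prompt_format: str, text: str) -> str:
    """Substitute the input into the {text} slot of a prompt format."""
    return prompt_format.format(text=text)


prompt = build_prompt(MANUAL_FORMAT, "cat")
print(prompt)
```

With the bare format {text}, the same function returns the input unchanged, which shows why the surrounding prompt has to come from somewhere: either written by hand or, as in ruPrompts, learned from data.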

You can currently use ruPrompts for text-to-text tasks, such as summarization, detoxification, style transfer, etc., and for styled text generation, as a special case of text-to-text.

Features

Modular structure for convenient extensibility
Integration with HF Transformers, support for all models with LM head
Integration with HF Hub for sharing and loading pretrained prompts
CLI and configuration system powered by Hydra
Pretrained prompts for ruGPT-3

Installation

ruPrompts can be installed with pip:

pip install ruprompts[hydra]

See Installation for other installation options.
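After installing, one quick way to check that the package is visible to the current interpreter is a standard-library lookup. This is a generic Python check rather than anything ruPrompts documents, and is_installed is a hypothetical helper name.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package) is not None


# Prints True only if the pip install above succeeded in this environment.
print("ruprompts available:", is_installed("ruprompts"))
```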

Usage

Loading a pretrained prompt for styled text generation:

>>> import ruprompts
>>> from transformers import pipeline

>>> ppln_joke = pipeline("text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_joke")
>>> ppln_joke("Говорит кружка ложке")
[{"generated_text": 'Говорит кружка ложке: "Не бойся, не утонешь!".'}]

For text2text tasks:

>>> ppln_detox = pipeline("text2text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_detox_russe")
>>> ppln_detox("Опять эти тупые дятлы все испортили, чтоб их черти взяли")
[{"generated_text": 'Опять эти люди все испортили'}]
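Both pipelines return a list of dictionaries with a generated_text key, as the outputs above show. A small helper for pulling out the strings could look like the sketch below. extract_texts is a hypothetical name, not part of ruPrompts or Transformers, and the sample list simply mirrors the detoxification output above so that the snippet runs without loading a model.

```python
from typing import Dict, List


def extract_texts(outputs: List[Dict[str, str]]) -> List[str]:
    """Return the generated strings from a pipeline-style result list."""
    return [item["generated_text"] for item in outputs]


# Sample data copied from the detoxification output above (no model is loaded here).
sample_outputs = [{"generated_text": "Опять эти люди все испортили"}]
texts = extract_texts(sample_outputs)
print(texts[0])
```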

Proceed to Quick Start for a more detailed introduction or start using ruPrompts right now with our Colab Tutorials.

License

ruPrompts is Apache 2.0 licensed. See the LICENSE file for details.

Owner

Sber AI
