Chinese Generative Pre-trained Model

Last update: Jan 03, 2023

Overview

T5 PEGASUS

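A Chinese generative pre-trained model that uses mT5 as its base architecture and initial weights, and is pre-trained in a PEGASUS-like manner.

For details, see: https://kexue.fm/archives/8209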

Tokenizer

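We replaced the tokenizer of T5 PEGASUS with BERT's tokenizer, which is friendlier to Chinese. We also rebuilt the vocabulary so that its characters and words are more complete. The current vocab.txt contains 50,000 tokens and covers the commonly used Chinese characters and words.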
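To illustrate why a vocabulary that holds both characters and words helps with Chinese, here is a minimal sketch of a word-first lookup with character fallback. It is not the released tokenizer: the function name `tokenize_words`, the toy vocabulary, and the assumption that the input has already been segmented into words by some external segmenter are all hypothetical, and the WordPiece handling that a BERT tokenizer applies to Latin text is left out.

```python
from typing import Iterable, List, Set


def tokenize_words(words: Iterable[str], vocab: Set[str], unk: str = "[UNK]") -> List[str]:
    """Map pre-segmented words to tokens (hypothetical helper, for illustration only).

    A word found in the vocabulary is kept as a single token. Otherwise it is
    split into characters, and characters missing from the vocabulary become `unk`.
    """
    tokens: List[str] = []
    for word in words:
        if word in vocab:
            tokens.append(word)
        else:
            tokens.extend(ch if ch in vocab else unk for ch in word)
    return tokens


# Toy vocabulary mixing words and single characters (assumed, not taken from vocab.txt).
toy_vocab = {"中文", "模型", "预", "训", "练"}
toy_tokens = tokenize_words(["中文", "预训练", "模型", "好"], toy_vocab)
```

In this toy run the in-vocabulary words stay whole, the out-of-vocabulary word is broken into characters, and the unknown character maps to `[UNK]`, so the sequence is shorter than a purely character-level split would be.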

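Pre-training Task

The pre-training task imitates the summarization-style pre-training of PEGASUS. Specifically, given a document with n sentences, we pick roughly n/4 of them (not necessarily contiguous) so that the longest common subsequence between the text formed by concatenating these n/4 sentences and the text formed by concatenating the remaining 3n/4 sentences is as long as possible. We then treat the concatenated 3n/4 sentences as the source text and the concatenated n/4 sentences as the summary, which yields a pseudo-summary pair of the form "(source, summary)".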
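The construction above can be sketched in a few lines. This is a simplified illustration, not the project's training code: the text does not say how the sentence subset is searched for, so the greedy one-sentence-at-a-time selection, the raw LCS length used as the score, and the names `lcs_length` and `pseudo_summary` are assumptions made for the example.

```python
from typing import List, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def pseudo_summary(sentences: List[str], ratio: float = 0.25) -> Tuple[str, str]:
    """Greedily pick about ratio * n sentences as the summary (assumed search strategy).

    At each step, add the sentence that maximizes the LCS length between the
    concatenated selected sentences and the concatenated remaining sentences.
    Returns (source, summary), both in original sentence order.
    """
    n = len(sentences)
    k = max(1, round(n * ratio))
    selected: List[int] = []
    remaining = list(range(n))
    while len(selected) < k and len(remaining) > 1:
        best_idx, best_score = remaining[0], -1
        for i in remaining:
            cand = set(selected + [i])
            summary = "".join(sentences[j] for j in range(n) if j in cand)
            source = "".join(sentences[j] for j in range(n) if j not in cand)
            score = lcs_length(summary, source)
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    chosen = set(selected)
    source = "".join(sentences[j] for j in range(n) if j not in chosen)
    summary = "".join(sentences[j] for j in range(n) if j in chosen)
    return source, summary


# Toy document with 4 "sentences", so about 1 sentence becomes the summary.
toy_source, toy_summary = pseudo_summary(["abc", "xyz", "abd", "qrs"])
```

Here the first sentence is chosen because it shares the subsequence "ab" with the rest of the document, while "xyz" and "qrs" share nothing. The greedy loop costs one LCS computation per candidate per step, so a real pipeline over long documents would likely need something faster.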

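Model Download

The currently open-sourced T5 PEGASUS is the base version, with 275 million parameters in total. It was trained with a maximum length of 512, a batch_size of 96, and a learning rate of 10^-4, for 1 million steps on six 3090 GPUs, which took about 13 days. The data is over 30 GB of carefully processed general-domain corpus. Training accuracy is about 47% and training loss is about 2.97. The model was written, trained, and tested with bert4keras.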
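As a rough sanity check on these figures, the following back-of-the-envelope calculation derives the implied training volume and throughput. It assumes that batch_size 96 is the global batch across all six GPUs and that every sequence is counted at the maximum length of 512, so the token count is an upper bound. The "about 13 days" figure is itself approximate, and the function name `training_stats` is made up for this example.

```python
def training_stats(steps: int, batch_size: int, max_len: int, num_gpus: int, days: float) -> dict:
    """Derive rough volume and throughput numbers from the reported training setup."""
    sequences_seen = steps * batch_size
    seconds = days * 24 * 3600
    steps_per_second = steps / seconds
    return {
        "sequences_seen": sequences_seen,
        # Upper bound: assumes every sequence is exactly max_len tokens long.
        "max_tokens_seen": sequences_seen * max_len,
        "steps_per_second": steps_per_second,
        # Assumes batch_size is the global batch shared by all GPUs.
        "sequences_per_second_per_gpu": steps_per_second * batch_size / num_gpus,
    }


stats = training_stats(steps=1_000_000, batch_size=96, max_len=512, num_gpus=6, days=13)
```

Under these assumptions the run processed 96 million sequences (at most about 49 billion tokens) at a little under 0.9 steps per second.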

Runtime environment: tensorflow 1.15 + keras 2.3.1 + bert4keras 0.10.0

Link: https://pan.baidu.com/s/1lQ9Dt9wZDO3IgiCL9tP-Ug Extraction code: 3sfn

Selected Evaluations

Summary generation results:

Few-shot learning:

How to Cite

BibTeX:

@techreport{zhuiyit5pegasus,
  title={T5 PEGASUS - ZhuiyiAI},
  author={Jianlin Su},
  year={2021},
  url="https://github.com/ZhuiyiTechnology/t5-pegasus",
}

Contact Us

Email: [email protected]
Zhuiyi Technology: https://zhuiyi.ai
