An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.)

Related tags

Text Data & NLPvizseq
Overview

PyPI CircleCI PyPI - License PyPI - Python Version

VizSeq

VizSeq is a Python toolkit for visual analysis on text generation tasks like machine translation, summarization, image captioning, speech translation and video description. It takes multi-modal sources, text references as well as text predictions as inputs, and analyzes them visually in Jupyter Notebook or a built-in Web App (the former has Fairseq integration). VizSeq also provides a collection of multi-process scorers as a normal Python package.

[Paper] [Documentation] [Blog]

VizSeq Overview VizSeq Teaser

Task Coverage

Source Example Tasks
Text Machine translation, text summarization, dialog generation, grammatical error correction, open-domain question answering
Image Image captioning, image question answering, optical character recognition
Audio Speech recognition, speech translation
Video Video description
Multimodal Multimodal machine translation

Metric Coverage

Accelerated with multi-processing/multi-threading.

Type Metrics
N-gram-based BLEU (Papineni et al., 2002), NIST (Doddington, 2002), METEOR (Banerjee et al., 2005), TER (Snover et al., 2006), RIBES (Isozaki et al., 2010), chrF (Popović et al., 2015), GLEU (Wu et al., 2016), ROUGE (Lin, 2004), CIDEr (Vedantam et al., 2015), WER
Embedding-based LASER (Artetxe and Schwenk, 2018), BERTScore (Zhang et al., 2019)

Getting Started

Installation

VizSeq requires Python 3.6+ and currently runs on Unix/Linux and macOS/OS X. It will support Windows as well in the future.

You can install VizSeq from PyPI repository:

$ pip install vizseq

Or install it from source:

$ git clone https://github.com/facebookresearch/vizseq
$ cd vizseq
$ pip install -e .

Documentation

Jupyter Notebook Examples

Fairseq integration

Web App Example

Download example data:

$ git clone https://github.com/facebookresearch/vizseq
$ cd vizseq
$ bash get_example_data.sh

Launch the web server:

$ python -m vizseq.server --port 9001 --data-root ./examples/data

And then, navigate to the following URL in your web browser:

http://localhost:9001

License

VizSeq is licensed under MIT. See the LICENSE file for details.

Citation

Please cite as

@inproceedings{wang2019vizseq,
  title = {VizSeq: A Visual Analysis Toolkit for Text Generation Tasks},
  author = {Changhan Wang, Anirudh Jain, Danlu Chen, Jiatao Gu},
  booktitle = {In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
  year = {2019},
}

Contact

Changhan Wang ([email protected]), Jiatao Gu ([email protected])

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Texar-PyTorch is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar

ASYML 726 Dec 30, 2022
Some embedding layer implementation using ivy library

ivy-manual-embeddings Some embedding layer implementation using ivy library. Just for fun. It is based on NYCTaxiFare dataset from kaggle (cut down to

Ishtiaq Hussain 2 Feb 10, 2022
TruthfulQA: Measuring How Models Imitate Human Falsehoods

TruthfulQA: Measuring How Models Imitate Human Falsehoods

69 Dec 25, 2022
Yodatranslator is a simple translator English to Yoda-language

yodatranslator Overview yodatranslator is a simple translator English to Yoda-language. Project is created for educational purposes. It is intended to

1 Nov 11, 2021
This repository contains the code for "Generating Datasets with Pretrained Language Models".

Datasets from Instructions (DINO 🦕 ) This repository contains the code for Generating Datasets with Pretrained Language Models. The paper introduces

Timo Schick 154 Jan 01, 2023
Just a Basic like Language for Zeno INC

zeno-basic-language Just a Basic like Language for Zeno INC This is written in 100% python. this is basic language like language. so its not for big p

Voidy Devleoper 1 Dec 18, 2021
A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.

A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to ach

Keon Lee 237 Jan 02, 2023
Script to download some free japanese lessons in portuguse from NHK

Nihongo_nhk This is a script to download some free japanese lessons in portuguese from NHK. It can be executed by installing the packages with: pip in

Matheus Alves 2 Jan 06, 2022
构建一个多源(公众号、RSS)、干净、个性化的阅读环境

2C 构建一个多源(公众号、RSS)、干净、个性化的阅读环境 作为一名微信公众号的重度用户,公众号一直被我设为汲取知识的地方。随着使用程度的增加,相信大家或多或少会有一个比较头疼的问题——广告问题。 假设你关注的公众号有十来个,若一个公众号两周接一次广告,理论上你会面临二十多次广告,实际上会更多,运

howie.hu 678 Dec 28, 2022
DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.

DeepAmandine This is an artificial intelligence based on GPT-3 that you can chat with, it is very nice and makes a lot of jokes. We wish you a good ex

BuyWithCrypto 3 Apr 19, 2022
A Telegram bot to add notes to Flomo.

flomo bot 使用 Telegram 机器人发送笔记到你的 Flomo. 你需要有一台可访问 Telegram 的服务器。 Steps @BotFather 新建机器人,获取 token Flomo 官网获取 API,链接 https://flomoapp.com/mine?source=in

Zhen 44 Dec 30, 2022
BiQE: Code and dataset for the BiQE paper

BiQE: Bidirectional Query Embedding This repository includes code for BiQE and the datasets introduced in Answering Complex Queries in Knowledge Graph

Bhushan Kotnis 1 Oct 20, 2021
RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

RoNER RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2. It is meant to be an easy to use, hi

Stefan Dumitrescu 9 Nov 07, 2022
✔👉A Centralized WebApp to Ensure Road Safety by checking on with the activities of the driver and activating label generator using NLP.

AI-For-Road-Safety Challenge hosted by Omdena Hyderabad Chapter Original Repo Link : https://github.com/OmdenaAI/omdena-india-roadsafety Final Present

Prathima Kadari 7 Nov 29, 2022
StarGAN - Official PyTorch Implementation

StarGAN - Official PyTorch Implementation ***** New: StarGAN v2 is available at https://github.com/clovaai/stargan-v2 ***** This repository provides t

Yunjey Choi 5.1k Dec 30, 2022
BiNE: Bipartite Network Embedding

BiNE: Bipartite Network Embedding This repository contains the demo code of the paper: BiNE: Bipartite Network Embedding. Ming Gao, Leihui Chen, Xiang

leihuichen 214 Nov 24, 2022
Training open neural machine translation models

Train Opus-MT models This package includes scripts for training NMT models using MarianNMT and OPUS data for OPUS-MT. More details are given in the Ma

Language Technology at the University of Helsinki 167 Jan 03, 2023
Python wrapper for Stanford CoreNLP tools v3.4.1

Python interface to Stanford Core NLP tools v3.4.1 This is a Python wrapper for Stanford University's NLP group's Java-based CoreNLP tools. It can eit

Dustin Smith 610 Sep 07, 2022
NLP-SentimentAnalysis - Coursera Course ( Duration : 5 weeks ) offered by DeepLearning.AI

Coursera Natural Language Processing Specialization This repository contains material related to Coursera Natural Language Processing Specialization.

Nishant Sharma 1 Jun 05, 2022
NLP Overview

NLP-Overview Introduction The field of NPL encompasses a variety of topics which involve the computational processing and understanding of human langu

PeterPham 1 Jan 13, 2022