Honors thesis project analyzing whether the GPT-2 model can more effectively generate free-verse or structured poetry.

Last update: Jan 09, 2022

gpt2-poetry

The following code is for my senior honors thesis project, under the guidance of Dr. Keith Holyoak at the University of California, Los Angeles.

I am currently analyzing whether the GPT-2 model generates free-verse or structured poetry more effectively. The project uses the GPT-2 architecture (code originating from "Language Models are Unsupervised Multitask Learners" by Radford et al.; paper at this link: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) to generate poetry after training on two different corpora: a corpus of sonnets (fourteen-line rhymed poems) and a corpus of free-verse poems of ten to eighteen lines, selected from Poetry Magazine issues published between January 2012 and December 2021. I plan to compare the quality of the generated poems with randomly selected human-written poems from each training set through a participant survey on different characteristics of poetry.

To run: install Python 3.9.8, as well as the following modules: Fire 0.1.3, Regex 2017.4.5, Requests 2.21.0, tqdm 4.31.1, and toposort 1.5.
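
As a rough aid for the setup step above, the listed versions can be turned into pip-style pins and compared against the current environment. This is only a sketch: the PyPI distribution names (`fire`, `regex`, `requests`, `tqdm`, `toposort`) are assumed from the module names, the helpers `requirement_lines` and `find_mismatches` are hypothetical and not part of this repository, and the comparison is a plain string match, so a package that reports its version in a different form would be flagged.

```python
from importlib import metadata

# Versions as listed in this README; the PyPI distribution names are assumed.
PINNED = {
    "fire": "0.1.3",
    "regex": "2017.4.5",
    "requests": "2.21.0",
    "tqdm": "4.31.1",
    "toposort": "1.5",
}


def requirement_lines(pinned=PINNED):
    """Return pip-style 'name==version' lines, one per listed module."""
    return [f"{name}=={version}" for name, version in pinned.items()]


def find_mismatches(pinned=PINNED):
    """Map each package to (wanted, installed) where the two differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Print the pins, then report anything missing or at a different version.
    print("\n".join(requirement_lines()))
    for name, (wanted, installed) in find_mismatches().items():
        print(f"{name}: want {wanted}, found {installed}")
```

The printed pins could be saved to a requirements file and installed with pip; the mismatch report only reads the environment and changes nothing.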

This project is in progress, and only the free-verse portion of the data is currently uploaded to GitHub. The sonnets generated by the GPT-2 model will be uploaded soon!

Last updated: 1/5/2021

Owner

Ashley Kim
