I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive

Last update: Jan 13, 2022

Overview

Sentiment-of-movie-reviews

I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this task very challenging.

This project uses datasets available on Kaggle for training and testing.
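As a rough illustration of how such data could be read, here is a minimal sketch. It assumes a tab-separated file with a `Phrase` column and an integer `Sentiment` column running from 0 (negative) to 4 (positive); the column names, the integer order, and the sample rows are assumptions made for this example, not details taken from the project.

```python
import csv
import io

# The five-value scale used in this project, indexed by an assumed 0-4 label id.
LABELS = ["negative", "somewhat negative", "neutral", "somewhat positive", "positive"]

# Invented sample rows standing in for a real training file (assumption).
SAMPLE_TSV = (
    "Phrase\tSentiment\n"
    "not a bad film at all\t3\n"
    "a dull , lifeless mess\t0\n"
    "the film\t2\n"
)


def read_examples(handle):
    """Return (phrase, label_id) pairs from a tab-separated file object."""
    # QUOTE_NONE keeps stray quote characters in phrases from confusing the parser.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    examples = []
    for row in reader:
        label_id = int(row["Sentiment"])
        if not 0 <= label_id < len(LABELS):
            raise ValueError(f"unexpected label id: {label_id}")
        examples.append((row["Phrase"], label_id))
    return examples


examples = read_examples(io.StringIO(SAMPLE_TSV))
for phrase, label_id in examples:
    print(f"{LABELS[label_id]}: {phrase}")
```

With a real file, `io.StringIO(SAMPLE_TSV)` would be replaced by something like `open(path, encoding="utf-8", newline="")`.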

The Hugging Face Transformers library brings many pretrained models together and makes each one usable with only a few lines of code. It even provides tools such as pipelines and live demos that classify text without any training or lengthy coding. As you can guess, these ready-to-use models have weaknesses. For example, you can't choose the number of labels, because each model was pretrained on data with a fixed label set. Also, not every default model is as strong and accurate as we would like (for example, the default model for sentiment analysis is an uncased DistilBERT, which is not the best model available). With all this in mind, we want to train Transformers models on our own data, using the models we prefer.
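One way to get a classifier with exactly five labels is to load a pretrained checkpoint with a fresh classification head sized for our label set. The sketch below is an assumption about how that setup might look, not the project's actual training code: the helper name `build_model`, the example checkpoint, and the 0-4 label order are all hypothetical choices made for illustration.

```python
# Assumed mapping between integer ids and the five sentiment labels.
ID2LABEL = {
    0: "negative",
    1: "somewhat negative",
    2: "neutral",
    3: "somewhat positive",
    4: "positive",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_model(checkpoint):
    """Load a tokenizer and a five-label classifier for the given checkpoint.

    Hypothetical helper: needs the `transformers` package plus a backend such
    as PyTorch, and downloads the checkpoint on first use.
    """
    # Imported lazily so the label mapping above works without the library.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),  # replaces the checkpoint's original head size
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Calling `build_model("roberta-base")`, for instance, would return a model whose classification head is randomly initialized, so it still has to be fine-tuned on the labeled phrases before its predictions mean anything.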

