A number of methods for performing Natural Language Processing on live data derived from Twitter

Last update: Nov 24, 2021

Twitter_NLP

Link to Project: https://twitoff-amadou.herokuapp.com/

==Description==

This project combines several methods to perform Natural Language Processing (NLP) on live data from Twitter. Its goal is to demonstrate, at a basic level, how NLP can be used to classify a piece of text by which Twitter user is most likely to 'tweet' (or post) it. Twitter API access was granted for this project and is used through the Tweepy wrapper for Python.

The web app is built with the Flask framework and deployed on Heroku. Data is extracted from Twitter through its API using the Tweepy library and stored in SQLAlchemy tables. These tables, which hold the information we're concerned with, such as usernames and past tweets, are stored in our PostgreSQL database. The spaCy library is then responsible for vectorizing the tweets into numeric vectors our models can operate on. Finally, a random forest classifier is trained on these vectors.
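The vectorize-then-classify step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `embed` is a toy hashed bag-of-words stand-in for a spaCy document vector (with spaCy, something like `nlp(text).vector` would take its place), and the usernames, tweets, and function names are all hypothetical.

```python
import hashlib
from typing import Dict, List

import numpy as np
from sklearn.ensemble import RandomForestClassifier

DIM = 32  # toy vector size chosen for this sketch


def embed(text: str) -> np.ndarray:
    """Stand-in for a spaCy document vector: normalized hashed word counts."""
    vec = np.zeros(DIM)
    words = text.lower().split()
    for word in words:
        # Stable hash so the same word always lands in the same slot.
        slot = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[slot] += 1.0
    return vec / max(len(words), 1)


def train(tweets_by_user: Dict[str, List[str]]) -> RandomForestClassifier:
    """Fit a random forest that maps a tweet vector to the user who wrote it."""
    X = np.array([embed(t) for tweets in tweets_by_user.values() for t in tweets])
    y = [user for user, tweets in tweets_by_user.items() for _ in tweets]
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X, y)
    return model


def predict_user(model: RandomForestClassifier, text: str) -> str:
    """Return the username the model considers most likely for `text`."""
    return str(model.predict(embed(text).reshape(1, -1))[0])


# Hypothetical sample data standing in for tweets pulled from the database.
tweets = {
    "astro_fan": [
        "rocket launch tonight",
        "new telescope images of mars",
        "the rocket reached orbit",
    ],
    "chef_life": [
        "fresh pasta recipe for dinner",
        "baking bread this morning",
        "best sauce for pasta",
    ],
}
model = train(tweets)
print(predict_user(model, "watching the rocket launch"))
```

Fixing `random_state` keeps the sketch reproducible; a live app that retrains whenever a user is added may not need that.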

The interface of the app is simple. There are two text boxes, one labeled "User to add" and the other "Tweet text to predict". The user types a Twitter username into the 'add' box so that Tweepy can add that Twitter user and their tweets to our PostgreSQL database. Our random forest then trains live on the newly added data. Once at least two Twitter users are in the database, one can enter text into the 'predict' box, select the two users to compare, and let the model predict which of them is more likely to have tweeted it.
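The two-box flow described above could be wired up in Flask along these lines. This is only a sketch under assumptions: the route paths, form field names (`user_name`, `user1`, `user2`, `tweet_text`), the in-memory `USERS` dict (in place of PostgreSQL), `fetch_tweets` (in place of the Tweepy call), and the word-overlap `pick_user` (in place of the trained random forest) are all hypothetical stand-ins.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-in for the PostgreSQL tables: username -> list of tweet texts.
USERS = {}


def fetch_tweets(username):
    """Hypothetical placeholder for the Tweepy call that downloads a user's tweets."""
    return [f"placeholder tweet {i} from {username}" for i in range(3)]


def pick_user(user1, user2, text):
    """Stand-in for the classifier: prefer the user whose tweets share more words with `text`."""
    words = set(text.lower().split())

    def overlap(user):
        return sum(len(words & set(t.lower().split())) for t in USERS[user])

    return user1 if overlap(user1) >= overlap(user2) else user2


@app.route("/user", methods=["POST"])
def add_user():
    # Handles the "User to add" box.
    username = request.form.get("user_name", "").strip()
    if not username:
        return jsonify(error="user_name is required"), 400
    USERS[username] = fetch_tweets(username)
    return jsonify(added=username, tweets=len(USERS[username]))


@app.route("/compare", methods=["POST"])
def compare():
    # Handles the "Tweet text to predict" box plus the two selected users.
    user1 = request.form.get("user1", "")
    user2 = request.form.get("user2", "")
    text = request.form.get("tweet_text", "")
    if user1 == user2:
        return jsonify(error="pick two different users"), 400
    missing = [u for u in (user1, user2) if u not in USERS]
    if missing:
        return jsonify(error="unknown users", users=missing), 404
    return jsonify(prediction=pick_user(user1, user2, text))
```

The routes can be exercised without starting a server by using `app.test_client()`, for example posting `{"user_name": "alice"}` to `/user` and then posting two stored usernames plus `tweet_text` to `/compare`.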

NLPretext packages in a unique library all the text preprocessing functions you need to ease your NLP project.

A BERT-based reverse dictionary of Korean proverbs

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

CCKS-Title-based-large-scale-commodity-entity-retrieval-top1

Spacy-ginza-ner-webapi - Named Entity Recognition API with spaCy and GiNZA

Harvis is designed to automate your C2 Infrastructure.

Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022).

Stuff related to Ben Eater's 8bit breadboard computer

STonKGs is a Sophisticated Transformer that can be jointly trained on biomedical text and knowledge graphs

Creating an Audiobook (mp3 file) from an Ebook (epub) using BeautifulSoup and Google Text to Speech

Protein Language Model

ByT5: Towards a token-free future with pre-trained byte-to-byte models

Text editor in Python to convert English text to Malayalam (Romanization/Transliteration).

Extract room types, doors, neighbouring rooms, room corners and bounding boxes, and generate a graph from the rplan dataset

Espial is an engine for automated organization and discovery of personal knowledge

List of GSoC organisations with number of times they have been selected.

Conversational-AI-ChatBot - Intelligent ChatBot built with Microsoft's DialoGPT transformer to make conversations with human users!

Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Python wrapper for Stanford CoreNLP tools v3.4.1

Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch