Community and sentiment analysis based on tweets

Overview

Social Media Analytics project

The project aims to analyze the opinions and interactions of Italian users through the posts they published on Twitter on the day the new Super Green Pass measures entered into force. In particular, we want to identify the reference hubs in the network, as well as people's sentiment and emotions regarding the new restrictions.

Motivation

One of the hottest topics in Italy in the last months of 2021 was the introduction of the Super Green Pass, required to access indoor venues, events, gyms, etc. This measure entered into force on 6 December 2021 and effectively denies access to various services to those who have not completed the vaccination cycle. For this reason, we decided to analyze the impressions of the Italian Twitter community regarding the Super Green Pass, with the aim of understanding which users write and interact on the platform and whether specific communities exist among the users who commented on the introduction of this extension. We also want to identify the potentially influential nodes of the network and examine the sentiment around them.

Data

The data was collected from Twitter using its API and the Tweepy Python package. All tweets were written on December 6th, 2021, in Italian.
In the data folder you can find the .csv file with all the collected tweets (here), along with two extra files that contain the sentiment extracted for each tweet (here) and the aggregated sentiment per cluster (here).
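
As an illustration of how a per-tweet sentiment file relates to a per-cluster one, here is a minimal sketch using only the Python standard library. The column names (tweet_id, cluster, sentiment) and the sample rows are assumptions made up for this example. They are not taken from the project's actual files, and the notebook may compute the aggregation differently.

```python
import csv
import io
from collections import Counter, defaultdict


def aggregate_sentiment(rows):
    """Count how many tweets carry each sentiment label within each cluster."""
    per_cluster = defaultdict(Counter)
    for row in rows:
        per_cluster[row["cluster"]][row["sentiment"]] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {cluster: dict(counts) for cluster, counts in per_cluster.items()}


# Tiny in-memory stand-in for the per-tweet sentiment file.
# Hypothetical columns and made-up rows, for illustration only.
SAMPLE_CSV = """tweet_id,cluster,sentiment
1,0,negative
2,0,neutral
3,1,negative
4,0,negative
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = aggregate_sentiment(rows)
print(summary)
```

To try this on a real file, replace the in-memory sample with an opened .csv file and adjust the column names to match its header.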

Files

All the code is in the file Code.ipynb. You can also find the report and the presentation prepared for the exam, both in Italian.

How to run code?

We recommend running all the code on the Google Colaboratory platform. All notebooks are already set up to import the necessary packages. If you have any doubts, feel free to contact me!

Graph visualization

In the Pyvis_export folder you can find two exported interactive visualizations of the network graph. Static versions are also available as .jpg files if you want a quick look (the HTML versions are quite slow to open).

Results

We found that the hubs are not famous people. This may be an expected result given the particular context of the no-vax discussion, where ideas and content matter more than the celebrity of the person.
Focusing on sentiment analysis, we noticed that the vast majority of tweets are neutral or negative. This is a far cry from reality, where most people have been vaccinated and are not that disappointed with the new rules.
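
One simple way to look for hubs in an interaction network is to rank users by how many interactions they receive (in-degree). The sketch below assumes this criterion and uses made-up user names and edges. The project may have identified hubs with a different measure, so treat this as an illustration rather than the method used in the notebook.

```python
from collections import Counter


def top_hubs(edges, k=3):
    """Rank users by in-degree, i.e. the number of interactions they receive.

    Each edge is a (source, target) pair meaning that source interacted
    with target, for example through a mention, reply, or retweet.
    """
    in_degree = Counter(target for _source, target in edges)
    return in_degree.most_common(k)


# Hypothetical interaction edges, for illustration only.
edges = [
    ("user_a", "user_c"),
    ("user_b", "user_c"),
    ("user_d", "user_c"),
    ("user_a", "user_b"),
    ("user_d", "user_b"),
    ("user_c", "user_a"),
]

hubs = top_hubs(edges, k=2)
print(hubs)
```

Once the top accounts are listed, they can be checked by hand to see whether they belong to well-known public figures.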

About us

Riccardo Confalonieri - Data Science Student @ University of Milano-Bicocca

Justin Armanini - Data Science Student @ University of Milano-Bicocca

Chiara Cormio - Data Science Student @ University of Milano-Bicocca
