Fake news detection

A fake news detection program built with classifiers for the Data Mining course at UoA.

Description

The project classifies the text of news articles, specifically to detect fake news. The dataset consists of two CSV files (Fake.csv, True.csv).

Data Preprocessing

After dropping every row that contains a null value, we removed punctuation and converted all letters to a uniform case.
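
The preprocessing step can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: the column name `text`, the choice of lowercase as the uniform case, and the helper name `clean_rows` are all assumptions.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_rows(rows):
    """Drop rows containing a null value, then strip punctuation and lowercase.

    `rows` is a list of dicts; the "text" key is an assumed column name.
    """
    cleaned = []
    for row in rows:
        # Skip any row in which at least one field is missing.
        if any(value is None for value in row.values()):
            continue
        text = row["text"].translate(_PUNCT_TABLE).lower()
        cleaned.append({**row, "text": text})
    return cleaned


rows = [
    {"title": "A", "text": "Breaking: Markets RALLY, again!"},
    {"title": "B", "text": None},  # null row, dropped
]
cleaned = clean_rows(rows)
print(cleaned)
```

Dropping null rows first avoids calling string methods on missing values.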

Feature Extraction

To analyse the preprocessed data, it first has to be represented in numeric form. We use:

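Bag of Words - one of the simplest ways to represent text numerically, counting how often each word occurs in a document.
TF-IDF - a bag-of-words variant that reweights word counts by inverse document frequency, so that words common to many documents count for less.
Word2Vec - word vectors from a Word2Vec model, used to create a vector representation for a sentence.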
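
To make the first two representations concrete, here is a small pure-Python sketch. It assumes whitespace tokenisation and the textbook weighting idf = log(N / df); libraries typically apply smoothed variants, so the numbers will differ from theirs. The function names are illustrative only.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenised documents."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Reweight count vectors by the unsmoothed idf = log(N / df)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: number of documents in which each term appears.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]


docs = ["fake news spreads fast", "real news spreads slowly"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
print(vocab)
print(counts)
print(weights)
```

Words that occur in every document ("news", "spreads") receive weight zero, which is the down-weighting effect TF-IDF is meant to provide.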

Classifiers

A detailed analysis of each of the following classifiers is given in the pytorch file:

Logistic Regression
Naive Bayes
Support Vector Machine
Random Forests
Voting Classifier
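
A voting classifier combines the predictions of several base classifiers. The sketch below shows hard (majority) voting in plain Python. Whether the project uses hard or soft voting is not stated, so treat this as an assumption; the prediction lists are made-up illustrative values, not project results.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted."""
    voted = []
    # zip(*...) regroups the per-model lists into per-sample tuples.
    for sample_preds in zip(*predictions_per_model):
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three base classifiers on four articles.
model_predictions = [
    ["fake", "true", "fake", "true"],
    ["fake", "fake", "fake", "true"],
    ["true", "true", "fake", "fake"],
]
ensemble = majority_vote(model_predictions)
print(ensemble)
```

With an odd number of models and two classes, ties cannot occur; other setups would need an explicit tie-breaking rule.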

Metrics

We evaluate the performance of each method on the test data using the following evaluation metrics:

Accuracy score
F1 score, the harmonic mean of precision and recall, which is especially useful for problems with an uneven class distribution.
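
Both metrics can be computed by hand, as in the sketch below, which calculates F1 as the harmonic mean of precision and recall. It assumes a binary task with "fake" as the positive class; the labels are made-up illustrative values, and the function names are not taken from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive="fake"):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = ["fake", "fake", "true", "true", "fake"]
y_pred = ["fake", "true", "true", "fake", "fake"]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(acc, f1)
```

Unlike accuracy, F1 ignores true negatives, which is why it is more informative when one class is much rarer than the other.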

Contributors

Apostolos Karvelas

Ioannis Papadimitriou
