The proliferation of disinformation across social media has led the application of deep learning techniques to detect fake news.

Overview

Fake News Detection

Overview

The proliferation of disinformation across social media has led the application of deep learning techniques to detect fake news. However, it is difficult to understand how deep learning models make decisions on what is fake or real news, and furthermore these models are vulnerable to adversarial attacks. In this project, we test the resilience of a fake news detector against a set of adversarial attacks. Our results indicate that a deep learning model remains vulnerable to adversarial attacks, but also is alarmingly vulnerable to the use of generic attacks: the inclusion of certain sequences of text whose inclusion into nearly any text sample can cause it to be misclassified. We explore how this set of generic attacks against text classifiers can be detected, and explore how future models can be made more resilient against these attacks.

Dataset Description

Our fake news model and dataset are taken from this github repo.

  • train.csv: A full training dataset with the following attributes:

    • id: unique id for a news article
    • title: the title of a news article
    • author: author of the news article
    • text: the text of the article; could be incomplete
    • label: a label that marks the article as potentially unreliable
      • 1: unreliable
      • 0: reliable
  • test.csv: A testing training dataset with all the same attributes at train.csv without the label.

Adversarial Text Generation

It's difficult to generate adversarial samples when working with text, which is discrete. A workaround, proposed by J. Gao et al. has been to create small text perturbations, like misspelled words, to create a black-box attack on text classification models. Another method taken by N. Papernot has been to find the gradient based off of the word embeddings of sample text. Our approach uses the algorithm proposed by Papernot to generate our adversarial samples. While Gao’s method is extremely effective, with little to no modification of the meaning of the text samples, we decided to see if we could create valid adversarial samples by changing the content of the words, instead of their text.

Methodology

Our original goal was to create a model that could mutate text samples so that they would be misclassified by the model. We accomplished this by implementing the algorithm set out by Papernot in Crafting Adversarial Input Sequences. The proposed algorithm generates a white-box adversarial example based on the model’s Jacobian matrix. Random words from the original text sample are mutated. These mutations are determined by finding a word in the embedding where the sign of the difference between the original word and the new word are closest to the sign of the Jacobian of the original word. The resulting words have an embedding direction that most closely resemble the direction indicated as being most impactful according to the model’s Jacobian.

A fake news text sample modified to be classified as reliable is shown below:

Council of Elders Intended to Set Up Anti-ISIS Coalition by Jason Ditz, October said 31, 2016 Share This ISIS has killed a number of Afghan tribal elders and wounded several more in Nangarhar Province’s main city of Jalalabad today, with a suicide bomber from the group targeting a meeting of the council of elders in the city. The details are still scant, but ISIS claims that the council was established in part to discuss the formation of a tribal anti-ISIS coalition in the area. They claimed 15 killed and 25 wounded, labeling the victims “apostates.” Afghan 000 government officials put the toll a lot lower, saying only four were killed and seven mr wounded in the attack. Nangarhar is the main base of operations for ISIS forces in Afghanistan, though they’ve recently begun to pop up around several other provinces. Whether the council was at the point of establishing an anti-ISIS coalition or not, this is in keeping with the group mr's reaction to any sign of growing local resistance, with ISIS having similarly made an example of tribal groups in Iraq and Syria during their establishment there. Last 5 posts by Jason Ditz

We also discovered a phenomena where adding certain sequences of text to samples would cause them to be misclassified without needing to make any additional modifications to the original text. To discover additional sequences, we took three different approaches: generating sequences based on the sentiments of the word bank, using Papernot’s algorithm to append new sequences, and creating sequences by hand.

Modified Papernot

Papernot’s original algorithm had been trained to mutate existing words in an input text to generate the adversarial text. However, our LSTM model pads the input, leaving spaces for blank words when the input length is small enough. We modify Papernot’s algorithm to mutate on two “blank” words at the end of our input sequence. This will generate new sequences of text that can then be applied to other samples, to see if they can serve as generic attacks.

The modified Papernot algorithm generated two-word sequences of the words ‘000’, ‘said’, and ‘mr’ in various orders, closely resembling the word substitutions created by the baseline Papernot algorithm. It can be expected that the modified Papernot will still use words identified by the baseline method, given that both models rely on the model’s Jacobian matrix when selecting replacement words. When tested against all unreliable samples, sequences generated are able to shift the model’s confidence to inaccurately classify a majority of samples as reliable instead.

Handcraft

Our simplest approach to the generation was to manually look for sequences of text by hand. This involved looking at how the model had performed on the training set, how confident it was on certain samples, and looking for patterns in samples that had been misclassified. We tried to rely on patterns that appear to a human observer to be innocuous, but also explored other patterns that would change the meaning of the text in significant ways.

Methodology Sample Sequence False Discovery Rate
Papernot mr 000 0.37%
Papernot said mr 29.74%
Handcraft follow twitter 26.87%
Handcraft nytimes com 1.70%

Conclusion

One major issue with the deployment of deep learning models is that "the ease with which we can switch between any two decisions in targeted attacks is still far from being understood." It is primarily on this basis that we are skeptical of machine learning methods. We believe that there should be greater emphasis placed on identifying the set of misclassified text samples when evaluating the performance of fake news detectors. If seemingly minute perturbations in the text can change the entire classification of the sample, it is likely that these weaknesses will be found by fake news distributors, where the cost of producing fake news is cheaper than the cost of detecting it.

Our project also led to the discovery of the existence of a set of sequences that could be applied to nearly any text sample to then be misclassified by the model, resembling generic attacks from the cryptography field. We proposed a modification of Papernot’s Jacobian-based adversarial attack to automatically identify these sequences. However, some of these generated sequences do not feel natural to the human eye, and future work can be placed into improving their generation. For now, while the eyes of a machine may be tricked by our samples, the eyes of a human can still spot the differences.

References

Owner
Kushal Shingote
Android Developer📱📱 iOS Apps📱📱 Swift | Xcode | SwiftUI iOS Swift development📱 Kotlin Application📱📱 iOS📱 Artificial Intelligence 💻 Data science
Kushal Shingote
Precision Medicine Knowledge Graph (PrimeKG)

PrimeKG Website | bioRxiv Paper | Harvard Dataverse Precision Medicine Knowledge Graph (PrimeKG) presents a holistic view of diseases. PrimeKG integra

Machine Learning for Medicine and Science @ Harvard 103 Dec 10, 2022
NeMo: a toolkit for conversational AI

NVIDIA NeMo Introduction NeMo is a toolkit for creating Conversational AI applications. NeMo product page. Introductory video. The toolkit comes with

NVIDIA Corporation 5.3k Jan 04, 2023
A music comments dataset, containing 39,051 comments for 27,384 songs.

Music Comments Dataset A music comments dataset, containing 39,051 comments for 27,384 songs. For academic research use only. Introduction This datase

Zhang Yixiao 2 Jan 10, 2022
Codes to pre-train Japanese T5 models

t5-japanese Codes to pre-train a T5 (Text-to-Text Transfer Transformer) model pre-trained on Japanese web texts. The model is available at https://hug

Megagon Labs 37 Dec 25, 2022
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

🤗 Contributing to OpenSpeech 🤗 OpenSpeech provides reference implementations of various ASR modeling papers and three languages recipe to perform ta

Openspeech TEAM 513 Jan 03, 2023
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

CRNN paper:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition 1. create your ow

Tsukinousag1 3 Apr 02, 2022
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 07, 2023
Code for text augmentation method leveraging large-scale language models

HyperMix Code for our paper GPT3Mix and conducting classification experiments using GPT-3 prompt-based data augmentation. Getting Started Installing P

NAVER AI 47 Dec 20, 2022
Trains an OpenNMT PyTorch model and SentencePiece tokenizer.

Trains an OpenNMT PyTorch model and SentencePiece tokenizer. Designed for use with Argos Translate and LibreTranslate.

Argos Open Tech 61 Dec 13, 2022
Prompt tuning toolkit for GPT-2 and GPT-Neo

mkultra mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo. Prompt tuning injects a string of 20-100 special tokens into the context in order to

61 Jan 01, 2023
BERT Attention Analysis

BERT Attention Analysis This repository contains code for What Does BERT Look At? An Analysis of BERT's Attention. It includes code for getting attent

Kevin Clark 401 Dec 11, 2022
A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing Projects"

A sample Python project A sample project that exists as an aid to the Python Packaging User Guide's Tutorial on Packaging and Distributing Projects. T

Python Packaging Authority 4.5k Dec 30, 2022
GVT is a generic translation tool for parts of text on the PC screen with Text to Speak functionality.

GVT is a generic translation tool for parts of text on the PC screen with Text to Speech functionality. I wanted to create it because the existing tools that I experimented with did not satisfy me in

Nuked 1 Aug 21, 2022
Text editor on python tkinter to convert english text to other languages with the help of ployglot.

Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan

Merin Rose Tom 1 Jan 16, 2022
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines

spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines spaCy-wrap is minimal library intended for wrapping fine-tuned transformers from t

Kenneth Enevoldsen 32 Dec 29, 2022
🧪 Cutting-edge experimental spaCy components and features

spacy-experimental: Cutting-edge experimental spaCy components and features This package includes experimental components and features for spaCy v3.x,

Explosion 65 Dec 30, 2022
Lingtrain Aligner — ML powered library for the accurate texts alignment.

Lingtrain Aligner ML powered library for the accurate texts alignment in different languages. Purpose Main purpose of this alignment tool is to build

Sergei Averkiev 76 Dec 14, 2022
Sample data associated with the Aurora-BP study

The Aurora-BP Study and Dataset This repository contains sample code, sample data, and explanatory information for working with the Aurora-BP dataset

Microsoft 16 Dec 12, 2022
Natural language processing summarizer using 3 state of the art Transformer models: BERT, GPT2, and T5

NLP-Summarizer Natural language processing summarizer using 3 state of the art Transformer models: BERT, GPT2, and T5 This project aimed to provide in

Samuel Sharkey 1 Feb 07, 2022
Demo programs for the Talking Head Anime from a Single Image 2: More Expressive project.

Demo Code for "Talking Head Anime from a Single Image 2: More Expressive" This repository contains demo programs for the Talking Head Anime

Pramook Khungurn 901 Jan 06, 2023