Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

Last update: Aug 04, 2022

Overview

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HASOC 2021 English Subtask 1A.

Publication

Installation (requires Python >= 3.6)

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Download the 'resources.zip' file from: https://drive.google.com/file/d/1X88cMrLVpAcJd5Z4Gg6MfTLclIuGF-d6/view?usp=sharing
Extract the contents of 'resources.zip'.
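
The extraction step can also be scripted. The following is a minimal sketch using Python's standard zipfile module. It assumes the archive has been downloaded to the repository root and should be unpacked there, so that paths such as "resources/hatebert" in config.json resolve. The function name extract_resources is illustrative and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_resources(zip_path="resources.zip", dest="."):
    """Extract the archive into dest and return the names of its members."""
    dest = Path(dest)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Calling extract_resources() from the repository root unpacks the archive in place. This README does not state the internal layout of the archive, so it is worth checking the returned names to confirm that a 'resources' folder ends up next to main.py.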

Training and Evaluation on HASOC datasets (2019, 2020, 2021)

Execute the following command to train and evaluate the model. The evaluation results are saved under the folder 'results'.

python main.py -c config.json

Optimizing Hyperparameters

The "config.json" file contains hyperparameters that can be changed to train different variants of the model.

{
  "base_dir": "",
  "batch_size": 64,
  "epochs": 20,
  "epoch_patience": 5,
  "bert_model_dir": "resources/hatebert",
  "monitor": "loss",
  "tweet_text_seq_len": 80,
  "tweet_text_char_len": 128,
  "char_size": 29,
  "max_learning_rate": 0.001,
  "end_learning_rate": 0.0000001,
  "rnn_type": "lstm",
  "rnn_layer_size": 200,
  "text_models": ["char_emb", "bert", "hate_words"],
  "normalize_text": true,
  "dataset_year": "2021",
  "optimizer": "adam",
  "text_use_attention": false,
  "oversample": true,
  "feature_normalization_layer_size": 512,
  "min_feature_normalization_layer_size": 64
}
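
Instead of editing config.json by hand for every run, a variant can be derived from it programmatically. The sketch below is not part of the repository. The helper name make_variant and the output file name are assumptions, and it relies only on config.json being a flat JSON object as shown above.

```python
import json
from pathlib import Path


def make_variant(base_path, out_path, **overrides):
    """Copy a config file, replacing only keys that already exist in it."""
    config = json.loads(Path(base_path).read_text(encoding="utf-8"))
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        # Reject misspelled keys instead of silently adding new ones.
        raise KeyError(f"unknown hyperparameter(s): {unknown}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, make_variant("config.json", "config_gru.json", rnn_type="gru", epochs=10) writes a new file that can then be passed to the training command as python main.py -c config_gru.json.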

bert_model_dir

"bert_model_dir": "resources/hatebert"
	OR
"bert_model_dir": "resources/bert-base"

dataset_year

"dataset_year": "2019"
	OR
"dataset_year": "2020"
	OR
"dataset_year": "2021"

text_models

"text_models": ["hate_words"]
	OR
"text_models": ["bert"]
	OR
"text_models": ["char_emb"]
	OR
"text_models": ["char_emb", "bert", "hate_words"]

rnn_type

"rnn_type": "lstm"
	OR
"rnn_type": "gru"
	OR
"rnn_type": "bi-gru"
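
To compare several variants systematically, the values listed above can be enumerated as a grid. This is only a sketch. The OPTIONS table copies the values shown in this README, which may not be exhaustive, and the name iter_variants is hypothetical. Whether "bert_model_dir" has any effect when "bert" is absent from "text_models" is not stated here, so some combinations may be redundant.

```python
from itertools import product

# Values copied from the options listed in this README.
OPTIONS = {
    "bert_model_dir": ["resources/hatebert", "resources/bert-base"],
    "dataset_year": ["2019", "2020", "2021"],
    "text_models": [
        ["hate_words"],
        ["bert"],
        ["char_emb"],
        ["char_emb", "bert", "hate_words"],
    ],
    "rnn_type": ["lstm", "gru", "bi-gru"],
}


def iter_variants(options=OPTIONS):
    """Yield one override dict per combination of the listed values."""
    keys = list(options)
    for values in product(*(options[key] for key in keys)):
        yield dict(zip(keys, values))


variants = list(iter_variants())
print(len(variants))  # 2 * 3 * 4 * 3 = 72 combinations
```

Each dict can be merged into a copy of config.json and saved under its own file name before running python main.py -c with that file.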

Owner

Sherzod Hakimov
