Title: QutNocturnal@HASOC'19: CNN for Hate Speech and Offensive Content Identification in Hindi Language

Aug 28, 2020

Md Abul Bashar, Richi Nayak

Abstract: We describe our top-ranked solution to Task 1 for Hindi in the HASOC contest organised by FIRE 2019. The task is to identify hate speech and offensive language in Hindi. More specifically, it is a binary classification problem in which a system must classify tweets into two classes: (a) Hate and Offensive (HOF) and (b) Not Hate or Offensive (NOT). In contrast to the popular approach of pretraining word vectors (also known as word embeddings) on a large general-domain corpus such as Wikipedia, we pretrained on a relatively small collection of relevant tweets (i.e. random and sarcasm tweets in Hindi and Hinglish). We then trained a Convolutional Neural Network (CNN) on top of the pretrained word vectors. This approach ranked first among all teams for this task. It could easily be adapted to other applications where the goal is to predict the class of a text when the provided context is limited.
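As a rough illustration of the pipeline the abstract outlines (pretrained word vectors feeding a CNN that outputs HOF versus NOT), the sketch below runs the forward pass of a generic text CNN in plain Python. It is not the authors' implementation: the abstract does not give the filter widths, number of filters, activation, or pooling, so ReLU with max-over-time pooling and a sigmoid output are assumptions, and the function names, toy tokens, and hand-set weights are hypothetical placeholders standing in for trained parameters.

```python
import math


def conv_max_pool(embedded, filt, bias):
    """Slide one filter over the token vectors, apply ReLU, keep the maximum.

    embedded: list of word vectors, one per token.
    filt: list of weight vectors, one per position in the filter window.
    """
    width = len(filt)
    best = 0.0  # ReLU floor; also the result when the text is shorter than the window
    for start in range(len(embedded) - width + 1):
        activation = bias
        for k in range(width):
            activation += sum(w * x for w, x in zip(filt[k], embedded[start + k]))
        best = max(best, activation)
    return best


def predict_hof_probability(tokens, vectors, filters, biases, out_weights, out_bias):
    """Return an (assumed) probability that the tokenised tweet is HOF."""
    dim = len(next(iter(vectors.values())))
    unknown = [0.0] * dim  # assumption: out-of-vocabulary tokens map to a zero vector
    embedded = [vectors.get(tok, unknown) for tok in tokens]
    features = [conv_max_pool(embedded, f, b) for f, b in zip(filters, biases)]
    logit = out_bias + sum(w * f for w, f in zip(out_weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy stand-ins for word vectors pretrained on a small tweet corpus (hypothetical values).
toy_vectors = {"tok_a": [1.0, 0.0], "tok_b": [0.0, 1.0]}
# One filter spanning two tokens; real weights would come from training.
toy_filters = [[[1.0, 0.0], [0.0, 1.0]]]
toy_biases = [0.0]
toy_out_weights = [1.0]
toy_out_bias = -2.0

prob = predict_hof_probability(
    ["tok_a", "tok_b"], toy_vectors, toy_filters, toy_biases, toy_out_weights, toy_out_bias
)
label = "HOF" if prob >= 0.5 else "NOT"
```

In practice the word vectors would be learned from the tweet collection and the filter and output weights fitted on the labelled HASOC data; only the shape of the computation is shown here.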

* CEUR Workshop Proceedings. Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation. Vol. 2517. Sun SITE Central Europe, Germany, pp. 237-245
