NeurIPS Poster [Re] Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification

Poster

[Re] Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification

Priyanka Bose · Chandra Shekhar Pandey · Fraida Fund

Great Hall & Hall B1+B2 (level 1) #2006

[ Abstract ]

[ Poster] [ OpenReview]

Abstract:

Reproducibility SummaryScope of Reproducibility — We aim to reproduce a result from the paper “Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification” [1]. Our study is restricted specifically to the claim that the use of swear words impacts hate speech classification of AAE text. We were able to broadly validate the claim of the paper, however, the magnitude of the effect was dependent on the word replacement strategy, which was somewhat ambiguous in the original paper.Methodology — The authors did not publish source code. Therefore, we reproduce the experiments by following the methodology described in the paper. We train BERT models from TensorFlow Hub [2] to classify hate speech using the DWMW17[3] and FDCL18[4]Twitter datasets. Then, we compile a dictionary of swear words and replacement words with comparable meaning, and we use this to create “censored” versions of samples in Blodgett et al.’s[5] AAE Twitter dataset. Using the BERT models, we evaluate the hate speech classification of the original data and the censored data. Our experiments are conducted on an open‐access research testbed, Chameleon [6], and we make available both our code and instructions for reproducing the result on the shared facility.Results — Our results are consistent with the claim that the censored text (without swear words) is less often classified as hate speech, offensive, or abusive than the same text with swear words. However, we find the classification is very sensitive to the word replacement dictionary being used.What was easy — The authors used three well known datasets which were easy to obtain. They also used well‐known widely available models, BERT and Word2Vec.What was difficult — Some of the details of model training were not fully specified. Also, we were not able to exactly re‐create a comparable dictionary of swear words and their replacement terms of similar meanings, following their methodology.Communication with original authors — We reached out to the authors, but were not able to communicate with them before the submission of this report.

Chat is not available.