Affinity Workshop: LatinX in AI (LXAI) Research @ NeurIPS 2021

Curating the Twitter Election Integrity Datasets for Better Online Troll Characterization

Albert Orozco Camacho · Reihaneh Rabbany


Social media platforms now provide accessible channels for interaction and for immediate reflection on the most important events happening around the world. In this paper, we first present a curated set of datasets that originate from Twitter's Information Operations efforts. Notably, the accounts in these datasets, which have already been suspended, provide a notion of how state-backed human trolls operate. Second, we present detailed analyses of how these behaviours vary over time, and motivate their use and abstraction in the context of deep representation learning: for instance, to learn and potentially track troll behaviour. We present baselines for such tasks and highlight the differences that may exist within the literature. Finally, we use the representations learned for behaviour prediction to distinguish trolls from "real" users, using a sample of non-suspended active accounts.