Affinity Workshop: LatinX in AI (LXAI) Research @ NeurIPS 2021

A Pharmacovigilance Application of Social Media Mining: An Ensemble Approach for Automated Classification and Extraction of Drug Mentions in Tweets

Luis Alberto Robles Hernandez · Juan Banda


Researchers have extensively used social media platforms like Twitter for knowledge discovery, as tweets are considered a wealth of information that provides unique insights. Recent developments have further enabled social media mining for various biomedical tasks such as pharmacovigilance. A first step towards using Twitter in the pharmacovigilance domain is to extract the medication/drug terms mentioned in tweets, which is challenging for several reasons. For example, drug mentions in tweets may be written incorrectly, making them more difficult to identify. In this work, we propose a two-step approach. First, we classify tweets that contain drug mentions using an ensemble model that includes transformer models. Second, we extract drug mentions, along with their span positions, using both a text-tagging/dictionary-based approach and a Named Entity Recognition (NER) approach. By comparing these two entity identification approaches, we demonstrate that a dictionary-based approach alone is not enough.
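
To make the dictionary-based extraction step concrete, the following is a minimal sketch of exact-match tagging with character spans. It is an illustration under stated assumptions, not the authors' implementation: the lexicon `DRUG_LEXICON`, the helper names `build_pattern` and `extract_drug_mentions`, and the sample tweet are all hypothetical.

```python
import re

# Hypothetical drug lexicon; a real system would load a curated dictionary.
DRUG_LEXICON = ["ibuprofen", "acetaminophen", "metformin"]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern for all lexicon terms."""
    # Longer terms first so they take precedence over shorter overlapping ones.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def extract_drug_mentions(text, pattern):
    """Return (mention, start, end) tuples; end is exclusive."""
    return [(m.group(0), m.start(), m.end()) for m in pattern.finditer(text)]


PATTERN = build_pattern(DRUG_LEXICON)

# Illustrative tweet: one correctly spelled mention and one misspelled mention.
tweet = "Took Ibuprofen and some ibuprofin for my headache"
mentions = extract_drug_mentions(tweet, PATTERN)
print(mentions)  # [('Ibuprofen', 5, 14)]
```

In this sketch the misspelled "ibuprofin" is not found, because exact lexicon matching has no tolerance for spelling variation. This is the kind of miss that motivates comparing a dictionary-based approach with an NER approach.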