

Poster

OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset

Allen Roush · Yusuf Shabazz · Arvind Balaji · Peter Zhang · Stefano Mezza · Markus Zhang · Sanjay Basu · Sriram Vishwanath · Ravid Shwartz-Ziv


Abstract:

We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence. OpenDebateEvidence captures the complexity of arguments in high school and college debates, providing valuable resources for training and evaluation. By incorporating regular season evidence, it offers a larger, more representative, and more diverse set of argumentative texts than existing datasets. We conducted extensive evaluations and fine-tuning experiments on popular language models using this dataset, revealing significant insights into their capabilities and limitations in handling argumentative text. Our results show that models fine-tuned on OpenDebateEvidence achieve substantial performance improvements on other argumentative datasets, underscoring the dataset's superiority. OpenDebateEvidence is publicly available to support further research and innovation in computational argumentation. Access it here: https://huggingface.co/datasets/Yusuf5/OpenCaselist
