Expo Talk Panel
GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection
Melissa Kazemi Rad
Upper Level Room 30A-E
The rapid integration of LLMs for synthetic data generation offers a powerful solution for data scarcity, a critical issue for training machine learning models. However, the resulting data quality is often questionable, requiring costly and time-consuming manual review. This oversight is especially challenging in the crucial domain of trustworthy and safe AI.
A significant need here is the automated identification of adversarial and harmful inputs, a process known as red-teaming, to improve guardrailing systems before deployment. Developing an effective, fully automated red-teaming approach capable of generating diverse, out-of-domain harmful content has been a long-standing challenge.
To address data scarcity in harmful text detection & the challenge of automated red-teaming, we introduce GRAID (Geometric & Reflective AI-Driven Data Augmentation), a novel, versatile, & dynamic multi-agent framework.
GRAID operates in two stages:
1) Geometric Generation: A constrained, fine-tuned LLM generates geometrically controlled examples to diversify content within the original embedding space.x000D
2) Reflective Augmentation: A multi-agentic reflective process promotes stylistic diversity & uncovers difficult edge cases.
This combination ensures both reliable coverage of the input space & nuanced exploration of harmful content. We demonstrate that augmenting a harmful text classification dataset with GRAID significantly improves the performance of downstream real-world guardrail models. Furthermore, GRAID captures data variability in new geometric domains while preserving data relationships.
While initially focused on harmful text detection, GRAID’s modular design makes it an inherently domain-agnostic framework adaptable to various applications beyond classification. In this talk, we will detail how GRAID distinguishes itself from existing solutions, discuss its building blocks, & share insights on its easy adaptation for diverse synthetic data generation needs.
Live content is unavailable. Log in and register to view live content