14 Results
| Type | Time | Title | Authors |
|---|---|---|---|
| Affinity Event | | Benchmark on Peer Review Toxic Detection: A Challenging Task with a New Dataset | Man Luo · Bradley Peterson · Rafael Gan · Hari Ramalingame · Navya Gangrade · Ariadne Dimarogona · Imon Banerjee · Phillip Howard |
| Poster | Wed 16:30 | BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text | Siyan Wang · Bradford Levy |
| Poster | Fri 11:00 | UniTox: Leveraging LLMs to Curate a Unified Dataset of Drug-Induced Toxicity from FDA Labels | Jacob Silberg · Kyle Swanson · Elana Simon · Angela Zhang · Zaniar Ghazizadeh · Scott Ogden · Hisham Hamadeh · James Zou |
| Poster | Thu 16:30 | Toxicity Detection for Free | Zhanhao Hu · Julien Piet · Geng Zhao · Jiantao Jiao · David Wagner |
| Poster | Fri 16:30 | Soft-Label Integration for Robust Toxicity Classification | Zelei Cheng · Xian Wu · Jiahao Yu · Shuo Han · Xin-Qiang Cai · Xinyu Xing |
| Workshop | | The effect of fine-tuning on language model toxicity | Will Hawkins · Brent Mittelstadt · Chris Russell |
| Workshop | | The effect of fine-tuning on language model toxicity | Will Hawkins · Brent Mittelstadt · Chris Russell |
| Workshop | | Ablation is Not Enough to Emulate DPO: A Mechanistic Analysis of Toxicity Reduction | Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi |
| Workshop | | Ablation is Not Enough to Emulate DPO: Attributing Toxicity Reduction to Neurons | Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi |
| Workshop | | Ablation is Not Enough to Emulate DPO: A Mechanistic Analysis of Toxicity Reduction | Yushi Yang · Filip Sondej · Harry Mayne · Adam Mahdi |
| Workshop | Sat 15:30 | Keynote 4: TextAttack for Improving Toxicity Detectors’ Adversarial Robustness | Yanjun Qi |
| Workshop | Sun 14:10 | The effect of fine-tuning on language model toxicity | |