NeurIPS 2023

Workshop

Prompt Risk Control: A Rigorous Framework for Responsible Deployment of Large Language Models
Thomas Zollo · Todd Morrill · Zhun Deng · Jake Snell · Toniann Pitassi · Richard Zemel

Workshop

AVIS: Autonomous Visual Information Seeking with Large Language Model Agent
Ziniu Hu

Workshop

Disclosing the Biases in Large Language Models via Reward Based Questioning
Ezgi Korkmaz

Workshop

Self-Select: Optimizing Instruction Selection for Large Language Models
Keshav Ramji · Alexander Kyimpopkin

Workshop

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
Alexander Robey · Eric Wong · Hamed Hassani · George J. Pappas

Workshop

Automatic Construction of a Korean Toxic Query Dataset for Ethical Tuning of Large Language Models
SungJoo Byun · Dongjun Jang · Hyemi Jo · Hyopil Shin

Workshop

Investigating Hiring Bias in Large Language Models
Akshaj Kumar Veldanda · Fabian Grob · Shailja Thakur · Hammond Pearce · Benjamin Tan · Ramesh Karri · Siddharth Garg

Workshop

Citation: A Key to Building Responsible and Accountable Large Language Models
Jie Huang · Kevin Chang

Workshop

Sat 12:01

DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models
Albert Garde · Esben Kran · Fazl Barez

Workshop

Jailbreaking Black Box Large Language Models in Twenty Queries
Patrick Chao · Alexander Robey · Edgar Dobriban · Hamed Hassani · George J. Pappas · Eric Wong

778 Results