Spotlight in Workshop: Instruction Tuning and Instruction Following

Oral Presentations


Abstract:
  1. Understanding Hidden Context in Preference Learning: Consequences for RLHF
  2. Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
  3. Understanding the Effects of RLHF on LLM Generalisation and Diversity
  4. Learning Interactive Real-World Simulators
  5. Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks
  6. Self-RAG: Self-reflective Retrieval Augmented Generation
  7. Delve into PPO: Implementation Matters for Stable RLHF
  8. FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
