2 Results
Poster | Wed 16:30 | Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers | Gavia Gray · aman tiwari · Shane Bergsma · Joel Hestness
Workshop | | Beyond the Binary: Capturing Diverse Preferences With Reward Regularization | Vishakh Padmakumar · Chuanyang Jin · Hannah Rose Kirk · He He