In typical experimentation paradigms, reallocating measurement effort incurs high operational costs due to delayed feedback as well as infrastructural and organizational difficulties. These challenges lead practitioners to use only a few reallocation epochs, in which outcomes are measured in large batches. Standard adaptive experimentation methods, however, do not scale to these regimes, as they are tailored to perform well as the number of reallocation epochs grows. We develop a new adaptive experimentation framework that can flexibly handle any batch size and learns near-optimal designs when reallocation opportunities are few. By deriving an asymptotic sequential experiment based on normal approximations, we formulate a Bayesian dynamic program that can leverage prior information from previous experiments. We propose policy gradient-based lookahead policies and find that, despite relying on approximations, our methods greatly improve statistical power over uniform allocation and standard adaptive policies.
Ethan Che (Columbia Business School)
Hongseok Namkoong (Columbia University)
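To make the abstract's idea concrete, here is a minimal sketch of a batched, normal-approximation-based lookahead allocation. It is an illustration, not the paper's implementation: the conjugate Gaussian update, the one-step lookahead objective (expected maximum posterior mean), and the finite-difference gradient over softmax logits (a crude stand-in for the paper's policy gradient methods) are all simplifying assumptions, as are the function names and parameter choices.

```python
import numpy as np

def posterior_update(mu, var, p, n, sigma2, ybar):
    """Conjugate Gaussian update after observing batch means ybar,
    where arm k receives a fraction p[k] of the batch of size n
    (normal approximation to the aggregate outcome)."""
    obs_prec = p * n / sigma2            # precision of each arm's batch mean
    prior_prec = 1.0 / var
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mu = post_var * (prior_prec * mu + obs_prec * ybar)
    return post_mu, post_var

def expected_best(mu, var, p, n, sigma2, rng, draws=2000):
    """Monte Carlo estimate of E[max_k posterior mean] when batch means
    are drawn from the prior predictive (a one-step lookahead value)."""
    p = np.maximum(p, 1e-6)
    pred_sd = np.sqrt(var + sigma2 / (p * n))
    ybar = mu + pred_sd * rng.standard_normal((draws, len(mu)))
    post_mu, _ = posterior_update(mu, var, p, n, sigma2, ybar)
    return post_mu.max(axis=1).mean()

def lookahead_allocation(mu, var, n, sigma2, steps=150, lr=0.5, seed=0):
    """Gradient ascent on allocation logits using common-random-number
    finite differences (a simple surrogate for pathwise policy gradients)."""
    mu, var = np.asarray(mu, float), np.asarray(var, float)
    rng = np.random.default_rng(seed)
    logits, eps = np.zeros(len(mu)), 1e-3
    for _ in range(steps):
        seed_t = int(rng.integers(1 << 30))   # fix draws across perturbations
        grad = np.zeros_like(logits)
        for k in range(len(mu)):
            for sgn in (+1.0, -1.0):
                l = logits.copy()
                l[k] += sgn * eps
                p = np.exp(l) / np.exp(l).sum()
                v = expected_best(mu, var, p, n, sigma2,
                                  np.random.default_rng(seed_t))
                grad[k] += sgn * v / (2 * eps)
        logits += lr * grad
    return np.exp(logits) / np.exp(logits).sum()
```

As a sanity check, when one arm's mean is already known almost exactly (near-zero prior variance), sampling it yields little information, so the lookahead allocation shifts the batch toward the uncertain arms.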