No-Regret Contextual Bandits for Cost-Sensitive Decision-Making
Rohini Das
Abstract
Public agencies like city governments face sequential, cost-sensitive choices under partial feedback, for example, deciding whether to \emph{inspect} or \emph{not inspect} a construction permit given categorical descriptors, spatial coordinates, and stage metadata. There are operational costs of doing inspections and thus doing all possible inspections is excessively costly. We frame this as a contextual bandit problem and ask: Can regret-minimizing online policies reduce cumulative cost without much hyperparameter tuning when conditions drift or become strategic?
Successful Page Load