The world is not static: external, potentially disruptive events such as macroeconomic cycles or the COVID-19 pandemic change the underlying factors that influence real-world time series, so these series change over time. Once such a data distribution shift happens, it becomes part of the time series history and affects future forecasting attempts. We present an adaptive sampling strategy that selects the part of the history that is relevant to the recent data distribution. We achieve this by using Bayesian optimization to learn a discrete distribution over relevant time steps. We instantiate this idea as a two-step, model-agnostic method that first pre-trains with uniform sampling and then trains a lightweight adaptive architecture with adaptive sampling. Experiments on synthetic and real-world data show that this method adapts to distribution shift and reduces the forecasting error of the base model by 8.4%.
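
To make the sampling idea concrete, the following minimal sketch contrasts uniform sampling of training positions with sampling from a discrete distribution over time steps. It is an illustration under stated assumptions, not the method itself: the weights are set by hand rather than learned by Bayesian optimization, and the function name `sample_time_steps`, the shift position, and the weight values are all hypothetical.

```python
import numpy as np


def sample_time_steps(weights, num_samples, rng):
    """Draw time-step indices from a discrete distribution given by unnormalized weights."""
    probs = np.asarray(weights, dtype=float)
    probs = probs / probs.sum()  # normalize so the weights form a probability vector
    return rng.choice(len(probs), size=num_samples, p=probs)


rng = np.random.default_rng(0)

# Synthetic series with a distribution shift at t = 300 (assumed for illustration).
T, shift_at = 400, 300
series = np.concatenate([
    rng.normal(0.0, 1.0, shift_at),      # regime before the shift
    rng.normal(5.0, 1.0, T - shift_at),  # regime after the shift
])

# Uniform sampling: every time step is equally likely to be drawn.
uniform_weights = np.ones(T)

# Hand-set stand-in for a learned distribution: most mass on post-shift steps.
adaptive_weights = np.where(np.arange(T) >= shift_at, 1.0, 0.01)

uniform_idx = sample_time_steps(uniform_weights, 1000, rng)
adaptive_idx = sample_time_steps(adaptive_weights, 1000, rng)

# Fraction of sampled time steps that come from the recent (post-shift) regime.
uniform_recent = float(np.mean(uniform_idx >= shift_at))
adaptive_recent = float(np.mean(adaptive_idx >= shift_at))
print(f"uniform:  {uniform_recent:.2f} of samples are post-shift")
print(f"adaptive: {adaptive_recent:.2f} of samples are post-shift")
```

In this toy setting, uniform sampling draws most training positions from the outdated regime, while the reweighted distribution concentrates on the part of the history that matches the recent data. In the approach described above, such weights would be found by an optimizer rather than fixed by hand.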