Large-scale deployed learning systems are often evaluated along multiple objectives or criteria. But how can we learn or optimize such complex systems, with potentially conflicting or even incompatible objectives? How can we improve the system when user feedback becomes available, feedback possibly alerting us to issues the system was not previously optimized for? We present a new theoretical model for learning and optimizing such complex systems. Rather than committing to a static or pre-defined tradeoff among the multiple objectives, our model is guided by the feedback received, which is used to update its internal state. Our model supports multiple objectives of very general form and takes into account their potential incompatibilities. We consider both a stochastic and an adversarial setting. In the stochastic setting, we show that our framework can be naturally cast as a Markov Decision Process with stochastic losses, for which we give efficient algorithmic solutions with vanishing regret. In the adversarial setting, we design efficient algorithms with competitive ratio guarantees. We also report the results of experiments with our stochastic algorithms validating their effectiveness.
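As background for the vanishing-regret guarantees mentioned above, the following is a minimal sketch of a standard regret-minimization scheme (multiplicative weights, also known as Hedge), not the paper's specific algorithm. The loss construction, the learning rate `eta`, and the fixed tradeoff `alpha` are illustrative assumptions: the paper's model instead adapts the tradeoff from feedback.

```python
import math

def hedge(loss_stream, n_actions, eta=0.1, horizon=1000):
    """Multiplicative-weights (Hedge) update: a classical vanishing-regret
    scheme for learning from a stream of per-action losses in [0, 1]."""
    weights = [1.0] * n_actions
    total_loss = 0.0
    for t in range(horizon):
        z = sum(weights)
        probs = [w / z for w in weights]        # current randomized policy
        losses = loss_stream(t)                  # loss of each action at round t
        total_loss += sum(p * l for p, l in zip(probs, losses))
        # Exponentially down-weight actions that incurred large loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return probs, total_loss

# Purely synthetic multi-objective losses: each round scalarizes two
# objectives with a fixed (here, not feedback-driven) tradeoff alpha.
def losses(t):
    obj1 = [0.2, 0.8, 0.5]   # e.g., a latency-type objective per action
    obj2 = [0.9, 0.1, 0.5]   # e.g., an accuracy-type objective per action
    alpha = 0.5
    return [alpha * a + (1 - alpha) * b for a, b in zip(obj1, obj2)]

probs, avg_loss = hedge(losses, n_actions=3)
```

In this toy run the policy concentrates on the action with the smallest scalarized loss, illustrating how average loss approaches that of the best fixed action, i.e., the regret vanishes over time.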