Statistical risk assessments inform consequential decisions, such as pretrial release in criminal justice and loan approval in consumer finance. Such risk assessments make counterfactual predictions: they predict the likelihood of an outcome under a proposed decision (e.g., what would happen if we approved this loan?). A central challenge is that unobserved confounders may have jointly affected past decisions and outcomes in the historical data. We propose a tractable mean outcome sensitivity model that bounds the extent to which unmeasured confounders could affect outcomes on average. Under this model, the conditional likelihood of the outcome under the proposed decision is partially identified, as are popular predictive performance metrics (e.g., accuracy, calibration, TPR, and FPR) and commonly used measures of predictive disparity, and we derive their sharp identified sets. We then solve three tasks that are essential to deploying statistical risk assessments in high-stakes settings. First, we propose a learning procedure based on doubly robust pseudo-outcomes that estimates bounds on the conditional likelihood of the outcome under the proposed decision, and we derive a bound on its integrated mean squared error. Second, we show how these estimated bounds can be translated into a robust, plug-in decision-making policy, and we derive bounds on its worst-case regret relative to the max-min optimal decision rule. Third, we develop estimators of the bounds on the predictive performance metrics of an existing risk assessment that are based on efficient influence functions and cross-fitting, and that require only black-box access to the risk assessment. These estimators allow us to use the historical data to robustly audit the predictive fairness properties of an existing risk assessment under the mean outcome sensitivity model.
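To make the partial-identification idea concrete, the following is a minimal numerical sketch, not the paper's actual estimator. It assumes hypothetical notation: `mu_obs` is the observed outcome mean among units that received the decision, E[Y | D=1, X=x]; `propensity` is P(D=1 | X=x); and `delta` is an assumed sensitivity parameter bounding how far the unobserved counterfactual mean among the other units, E[Y(1) | D=0, X=x], can drift from `mu_obs`. Decomposing E[Y(1) | X=x] by the decision then gives an interval of width 2(1-p)·delta around `mu_obs`, clipped to [0, 1] for a binary outcome.

```python
import numpy as np


def mean_outcome_bounds(mu_obs, propensity, delta):
    """Illustrative bounds on E[Y(1) | X=x] under a mean-outcome-style
    sensitivity assumption (hypothetical notation, not the paper's code).

    E[Y(1) | X] = p * E[Y | D=1, X] + (1 - p) * E[Y(1) | D=0, X],
    and the sensitivity assumption |E[Y(1) | D=0, X] - mu_obs| <= delta
    implies E[Y(1) | X] lies in mu_obs +/- (1 - p) * delta.
    """
    lower = np.clip(mu_obs - (1.0 - propensity) * delta, 0.0, 1.0)
    upper = np.clip(mu_obs + (1.0 - propensity) * delta, 0.0, 1.0)
    return lower, upper


# Example: observed mean 0.3, propensity 0.8, sensitivity delta = 0.5
# yields the identified interval [0.2, 0.4] for the counterfactual mean.
lo, hi = mean_outcome_bounds(0.3, 0.8, 0.5)
```

When delta = 0 (no unmeasured confounding on average), the interval collapses to the observed conditional mean, recovering the usual point-identified prediction.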