Audio signals are widely recognised as powerful indicators of overall health status, and there has been increasing interest in leveraging sound for affordable COVID-19 screening through machine learning. However, there has also been scepticism regarding the initial efforts, due in part to the lack of reproducibility, large datasets, and transparency, which are unfortunately common issues in machine learning for health. To facilitate the advancement and openness of audio-based machine learning for respiratory health, we release a dataset consisting of 53,449 audio samples (over 552 hours in total) crowd-sourced from 36,116 participants through our COVID-19 Sounds app. Given its scale, this dataset is comprehensive in terms of demographics and the spectrum of health conditions represented. It also provides participants' self-reported COVID-19 testing status, with 2,106 samples from participants who tested positive. To the best of our knowledge, COVID-19 Sounds is the largest multi-modal dataset of COVID-19 respiratory sounds, comprising three modalities: breathing, cough, and voice recordings. Additionally, in this paper, we report several benchmarks for two principal research tasks: respiratory symptom prediction and COVID-19 prediction. For these tasks we demonstrate performance with a ROC-AUC above 0.7, confirming both the promise of machine learning approaches based on such datasets and the usability of our data for these tasks. We describe a realistic experimental setting that we hope paves the way to fair performance evaluation of future models. In addition, we reflect on how the released dataset can help scale existing studies and enable new research directions, which we hope will inspire and benefit a wide range of future work.