Modeling fire spread is critical in fire risk management. Creating data-driven models to forecast spread remains challenging due to the lack of comprehensive data sources that relate fires with relevant covariates. We present the first comprehensive and open-source dataset that relates historical fire data with relevant covariates such as weather, vegetation, and topography. Our dataset, named WildfireDB, contains over 17 million data points that capture how fires spread in continental USA in the last decade. In this paper, we describe the algorithmic approach used to process and integrate the data, describe the dataset, and present benchmark results regarding data-driven models that can be learned to forecast the spread of wildfires.