Surgical phase recognition from endoscopic video could enable numerous context-aware technologies that improve the efficiency and performance of surgeons and minimally invasive care teams. Surgical phase durations vary greatly (from seconds to minutes) with patient factors, surgeon workflow, and many other variables. However, how activity recognition models perform under such varying phase-duration statistics is poorly understood and insufficiently tested. To address this problem, we ensemble neural networks and other machine learning models with different architectures and temporal parameters to recognize surgical phases. The ensemble's per-second probability estimates over an entire case then serve as the observation sequence for forward-backward smoothing, which yields posterior beliefs over surgical phases across time. We demonstrate this modeling process on three data sets: 1) robot-assisted inguinal hernia repair (five phases), 2) robot-assisted training in a porcine model (seven phases), and 3) laparoscopic cholecystectomy (Cholec80, seven phases). The results suggest that this novel method for handling phases of varying duration across procedures holds promise for the future of surgical phase recognition.
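The smoothing step described above can be illustrated with a minimal sketch of the standard forward-backward algorithm applied to per-second classifier outputs. The transition matrix, prior, and function names below are hypothetical illustrations, not the paper's actual parameters; the ensemble's probability estimates are simply treated as emission likelihoods at each time step.

```python
import numpy as np

def forward_backward(obs_probs, trans, prior):
    """Smooth per-second phase probability estimates with forward-backward.

    obs_probs: (T, K) per-second ensemble probability estimates,
               used as emission likelihoods for K phases.
    trans:     (K, K) phase transition matrix, trans[i, j] = P(j at t+1 | i at t).
    prior:     (K,) initial phase distribution.
    Returns:   (T, K) posterior beliefs P(phase at t | all T observations).
    """
    T, K = obs_probs.shape
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))

    # Forward pass, renormalized each step for numerical stability.
    alpha[0] = prior * obs_probs[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * obs_probs[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass, also renormalized (scaling cancels in the posterior).
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (obs_probs[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # Posterior belief over phases at each second.
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma

# Toy usage with K = 3 phases: a "sticky" left-to-right transition matrix
# (an assumption for illustration) that mostly stays in the current phase.
K = 3
trans = 0.95 * np.eye(K) + 0.05 * np.eye(K, k=1)
trans[-1, -1] = 1.0  # final phase absorbs
prior = np.array([0.8, 0.1, 0.1])

# Noisy classifier outputs: phase 0 for 10 s, then phase 1, with one
# outlier frame at t = 5 that the smoother should correct.
obs = np.tile([0.8, 0.1, 0.1], (10, 1))
obs = np.vstack([obs, np.tile([0.1, 0.8, 0.1], (10, 1))])
obs[5] = [0.1, 0.1, 0.8]

posterior = forward_backward(obs, trans, prior)
```

Because the posterior at each second conditions on the entire case, isolated misclassifications that conflict with the plausible phase ordering are suppressed, which is the intended benefit of smoothing over the raw per-second estimates.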