This demonstration showcases our recent advances in self-supervised monocular depth estimation. First, we enhance self-supervised training with semantic information, which reduces depth error by 12% and achieves state-of-the-art performance. Second, we improve the backbone architecture using a scalable neural architecture search method that optimizes directly for inference latency on the target device, enabling operation at over 30 FPS. We demonstrate both techniques on a smartphone powered by a Snapdragon® Mobile Platform.
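For background, the training signal in self-supervised monocular depth estimation typically comes from a photometric reconstruction loss: a source frame is warped into the target view using the predicted depth and camera motion, and the warped frame is compared against the target frame. The sketch below illustrates only the loss computation in its simplest (L1) form; the function name and shapes are illustrative assumptions, not the authors' implementation, which additionally incorporates semantic information.

```python
import numpy as np

def photometric_l1_loss(target, warped):
    """Mean absolute photometric error between the target frame and a
    source frame warped into the target view (illustrative sketch only).

    target, warped: float arrays of shape (H, W, C) with values in [0, 1].
    """
    return np.mean(np.abs(target - warped))

# Toy usage: a perfectly warped frame yields zero loss.
target = np.random.rand(4, 4, 3)
loss = photometric_l1_loss(target, target.copy())
```

In practice this term is usually combined with a structural similarity (SSIM) term and a depth smoothness regularizer, and the warping itself requires differentiable bilinear sampling; those parts are omitted here for brevity.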