Heated debates continue over the best solution for autonomous driving. The classic modular pipeline is widely adopted in the industry owing to its great interpretability and stability, whereas the fully end-to-end paradigm has demonstrated considerable simplicity and learnability along with the rise of deep learning. As a way of marrying the advantages of both approaches, learning a semantically meaningful representation and then using it in the downstream driving policy learning tasks provides a viable and attractive solution. However, several key challenges remain to be addressed, including identifying the most effective representation, alleviating the sim-to-real generalization issue as well as balancing model training cost. In this study, we propose a versatile and efficient reinforcement learning approach and build a fully functional autonomous vehicle for real-world validation. Our method shows great generalizability to various complicated real-world scenarios and superior training efficiency against the competing baselines.