We present LoopStack, a domain-specific compiler stack for tensor operations, composed of a front-end (LoopTool) and an efficient optimizing code generator (LoopNest). LoopStack is designed to produce code that is both highly efficient and predictable. This design allows both experts and, more importantly, ML-based approaches to find good schedules (algorithms). LoopStack is extensible and supports various processors and accelerators, while incorporating HPC optimizations that are often missing from other machine learning compiler back-ends. To show the quality of the generated code, we designed a rudimentary AI to search for schedules and compared the speed of the generated code with that of the most optimized, hand-tuned libraries. Further, we show that for a large collection of schedules, LoopNest's compilation is orders of magnitude faster than LLVM's, while resulting in equal or improved run-time performance.