Poster in Workshop: Machine Learning for Systems

Predicting LLM Inference Latency: A Roofline-Driven ML Method

Saki Imai ⋅ Rina Nakazawa ⋅ Marcelo Amaral ⋅ Sunyanan Choochotkaew ⋅ Tatsuhiro Chiba

Abstract
