Reasoning Models Reason Inefficiently
Dipika Khullar · Ashwinee Panda
Abstract
Large language models (LLMs) produce long, structured reasoning traces that can inflate latency and cost. Our results suggest that while backtracking can help models arrive at the correct answer, reasoning traces are not a faithful picture of the minimal computation required to solve a task: they can be compressed or restructured. In this paper, we show how to build more efficient and interpretable reasoning processes by identifying and targeting internal directions associated with inefficiency.
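As a rough illustration of what "targeting internal directions" could look like, the sketch below finds a direction via a difference of means between activations from verbose and concise traces, then projects it out. This is a hypothetical setup for intuition only: the abstract does not specify the paper's actual method, and the function names, shapes, and synthetic data here are all assumptions.

```python
import numpy as np

def inefficiency_direction(verbose_acts, concise_acts):
    """Unit difference-of-means direction separating verbose from concise traces.

    Hypothetical setup: each array is (n_samples, d_model) of residual-stream
    activations collected from long vs. short reasoning traces.
    """
    d = verbose_acts.mean(axis=0) - concise_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(acts, direction):
    """Remove the component of each activation along a unit `direction`."""
    return acts - np.outer(acts @ direction, direction)

# Toy demonstration on synthetic activations.
rng = np.random.default_rng(0)
d_model = 16
verbose = rng.normal(size=(100, d_model)) + 2.0  # "verbose" cluster, shifted
concise = rng.normal(size=(100, d_model))        # "concise" cluster

u = inefficiency_direction(verbose, concise)
steered = ablate(verbose, u)

# After ablation, activations carry no component along u.
assert np.allclose(steered @ u, 0.0, atol=1e-8)
```

In this toy version, ablation zeroes the projection onto the learned direction while leaving the orthogonal components of each activation untouched.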