Optimal Self-Consistency for Efficient Reasoning with Large Language Models
Austin Feng · Marius Alonso · Ambroise Odonnat · Vasilii Feofanov · Ievgen Redko
Abstract
Self-consistency (SC) is one of the most popular test-time inference techniques for improving performance in chain-of-thought reasoning. It consists of generating multiple responses, or "samples", from a large language model (LLM) and selecting the most frequent answer, which can be viewed as an application of majority vote and mode estimation. Despite its effectiveness, self-consistency is prohibitively expensive at scale when naively applied to datasets. By leveraging mode estimation and voting theory, we design Blend-ASC, a novel variant of self-consistency that dynamically allocates samples across questions, achieving state-of-the-art sample efficiency. We show that our approach uses $6.8\times$ fewer samples on average than adaptive and fixed-allocation self-consistency baselines, demonstrating its superior efficiency. Blend-ASC is not only lightweight but also hyperparameter-free, ensuring it can be easily applied to any self-consistency application. Finally, we derive novel scaling laws, offering a way to predict sample efficiency for a given target error.
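The majority-vote step described above can be sketched in a few lines. This is a minimal illustration of plain (fixed-sample) self-consistency, not the paper's Blend-ASC allocation scheme; the function name and the example answers are hypothetical.

```python
from collections import Counter

def self_consistency(answers):
    """Select the most frequent answer among sampled responses
    (majority vote, i.e. an empirical mode estimate)."""
    counts = Counter(answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical example: five sampled chain-of-thought answers to one question
samples = ["42", "41", "42", "42", "40"]
print(self_consistency(samples))  # → 42
```

Adaptive variants such as the one studied here spend fewer samples on questions where one answer quickly dominates, rather than drawing a fixed number per question.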