
Poster - Recorded Presentation
Workshop: Machine Learning for Systems

Multi-objective Reinforcement Learning with Adaptive Pareto Reset for Prefix Adder Design

Jialin Song · Rajarshi Roy · Jonathan Raiman · Robert Kirby · Neel Kant · Saad Godil · Bryan Catanzaro


Many hardware design problems require navigating a combinatorial search space to find solutions that balance multiple conflicting objectives, e.g., area and delay. While traditional approaches rely on hand-tuned heuristics to cope with the large search space, reinforcement learning (RL) has recently achieved promising results, reducing the need for human expertise. However, the existing RL method has prohibitively high sample complexity. In this paper, we present a novel multi-objective reinforcement learning algorithm for combinatorial optimization and apply it to automating the design of prefix adder circuits, which are fundamental to high-performance digital components. We propose tracking the evolving Pareto frontier to adaptively select reset states for an episodic RL agent. Our reset algorithm balances exploiting the best states discovered so far with exploring nearby states to escape local optima. Through empirical evaluations with a real-world physical synthesis workflow on two different design tasks, we demonstrate that our algorithm trains agents to expand the Pareto frontier faster than the baselines. In particular, it achieves comparable result quality with only 20% of the samples required by the scalarized baseline. Additional ablation studies confirm that the exploration and exploitation components work together to accelerate Pareto frontier expansion.
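To make the reset idea concrete, the following is a minimal sketch of Pareto frontier tracking with frontier-based reset selection. It is not the authors' implementation: the abstract does not specify how states are represented or how exploration is performed, so the names `dominates`, `update_frontier`, `select_reset`, and the `explore_prob` parameter are hypothetical. The sketch assumes two objectives (area and delay) that are both minimized, uses the objective pair itself as a stand-in for a design state, and approximates "exploring nearby states" by occasionally resetting to a visited state that is not on the frontier.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in for a design state: its (area, delay) pair, both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_frontier(frontier: List[Point], candidate: Point) -> List[Point]:
    """Return a new frontier that accounts for the candidate point."""
    # Reject the candidate if it is dominated by, or duplicates, a frontier point.
    if any(dominates(p, candidate) or p == candidate for p in frontier):
        return list(frontier)
    # Otherwise drop every point the candidate dominates and add the candidate.
    return [p for p in frontier if not dominates(candidate, p)] + [candidate]


def select_reset(frontier: List[Point], visited: List[Point],
                 explore_prob: float, rng: random.Random) -> Point:
    """Pick the state an episode starts from.

    Assumption: with probability explore_prob, reset to a visited state that is
    off the frontier (exploration); otherwise reset to a frontier state
    (exploitation). The paper's actual exploration rule may differ.
    """
    off_frontier = [p for p in visited if p not in frontier]
    if off_frontier and rng.random() < explore_prob:
        return rng.choice(off_frontier)
    return rng.choice(frontier)


# Usage: feed observed (area, delay) pairs into the tracker, then pick a reset state.
visited = [(10.0, 5.0), (8.0, 7.0), (9.0, 6.0), (12.0, 4.0), (9.0, 9.0)]
frontier: List[Point] = []
for point in visited:
    frontier = update_frontier(frontier, point)

reset_state = select_reset(frontier, visited, explore_prob=0.2, rng=random.Random(0))
```

In this sketch, `explore_prob` is the single knob trading exploitation against exploration; an adaptive scheme would adjust it, or the choice of frontier point, as the frontier evolves.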
