Hierarchical Deep Research with Local–Web RAG: Toward Automated System-Level Materials Discovery
Abstract
We present a long-horizon, hierarchical deep research (DR) agent designed for complex materials and device discovery problems that exceed the scope of existing machine-learning (ML) surrogates or closed-source commercial agents. Our framework instantiates a locally deployable DR instance that integrates local retrieval-augmented generation (RAG) with large language model (LLM) reasoners, enhanced by a Deep Tree of Research (DToR) mechanism that adaptively expands and prunes research branches to maximize coverage, depth, and coherence. We evaluate the agent on 21 nanomaterials and device topics using an LLM-as-judge rubric, with five web-enabled state-of-the-art models serving as jurors. In addition, we conduct dry-lab validations on five representative tasks, in which human experts use domain simulations (e.g., DFT) to verify whether the DR agent's proposals are actionable. Results show that our DR agent produces reports of quality comparable to those of commercial systems (ChatGPT-o3/o4-mini-high, Gemini Deep Research) at substantially lower cost, while enabling on-premises integration with local data and tools.
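To make the expand-and-prune idea behind DToR concrete, the following minimal Python sketch grows a tree of research questions level by level and keeps only the highest-scoring branches under each node. It is an illustration under stated assumptions, not the DToR implementation: the names `ResearchNode`, `expand_and_prune`, `toy_propose`, and `toy_score`, the fixed `beam_width` and `max_depth` parameters, and the simple top-k pruning rule are all hypothetical. In a real agent, the proposer and scorer would presumably be backed by an LLM and RAG retrieval rather than the toy functions used here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchNode:
    """One research branch: a question plus a (hypothetical) relevance score."""
    question: str
    depth: int = 0
    score: float = 0.0
    children: List["ResearchNode"] = field(default_factory=list)


def expand_and_prune(
    root: ResearchNode,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    max_depth: int = 2,
    beam_width: int = 2,
) -> List[ResearchNode]:
    """Grow the tree level by level, keeping only the best branches per node.

    Returns every node that survived pruning, in breadth-first order.
    """
    kept = [root]
    frontier = [root]
    while frontier:
        next_frontier: List[ResearchNode] = []
        for node in frontier:
            if node.depth >= max_depth:
                continue  # depth budget reached: do not expand further
            # Expand: ask the proposer for candidate sub-questions.
            candidates = [
                ResearchNode(q, node.depth + 1, score(q))
                for q in propose(node.question)
            ]
            # Prune: keep only the top-scoring candidates under this node.
            candidates.sort(key=lambda n: n.score, reverse=True)
            node.children = candidates[:beam_width]
            next_frontier.extend(node.children)
        kept.extend(next_frontier)
        frontier = next_frontier
    return kept


# Toy stand-ins (assumptions) for an LLM proposer and a relevance scorer.
def toy_propose(question: str) -> List[str]:
    return [f"{question} / {s}" for s in ("synthesis", "stability", "cost")]


def toy_score(question: str) -> float:
    weights = {"synthesis": 0.9, "stability": 0.7, "cost": 0.2}
    return weights.get(question.rsplit(" / ", 1)[-1], 0.0)


if __name__ == "__main__":
    for n in expand_and_prune(ResearchNode("perovskite LED"), toy_propose, toy_score):
        print("  " * n.depth + n.question)
```

With these toy inputs, the lowest-scoring "cost" branch is pruned at every level, so the surviving tree contains the root, two children, and four grandchildren. An adaptive variant could replace the fixed `beam_width` with a score threshold or a per-branch budget.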