Utilizing MCP and Shared Introspection for Targeted Agent-to-Agent Communication in Hierarchical Clinical Multi-Agent Systems
Abstract
In high-stakes clinical environments, physicians must make rapid and complex decisions while contending with uncertainty, fragmented data, and cognitive overload. Traditional decision support systems and AI models often fall short because of limited explainability, limited adaptability, and poor collaboration between decision-making components. To address these limitations, we present a hierarchical multi-agent system, built with CrewAI, that enhances reasoning through structured introspection and deliberation. Each agent in the system generates introspection logs consisting of its proposed output, supporting reasoning, confidence score, and self-reflection. These logs are stored and shared in real time via a centralized Model Context Protocol (MCP) server. The architecture includes one high-level meta-agent modeled after an attending physician and five domain-specialist peer agents focused on internal medicine, pharmacology, pathology, surgery, and psychiatry. Each agent runs a local instance of the Qwen 2.5 7-billion-parameter model. The meta-agent aggregates the introspection logs and resolves conflicting responses using a novel hybrid scoring mechanism that accounts for both confidence and explanation depth. In cases of disagreement, the system invokes the novel High-Level Agent Communication Protocol to conduct structured, one-on-one deliberations between agents. We evaluate our framework on a subset of USMLE Step 1 and Step 2 clinical questions. Results show that shared introspection improves diagnostic accuracy from 20% to 80% and reduces average response time by approximately 28%. These findings highlight the promise of introspective multi-agent systems for developing more transparent and trustworthy AI-driven clinical decision-support tools.
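To make the log structure and the conflict-resolution idea concrete, the following is a minimal sketch, not the implementation described in this paper. The four log fields come from the abstract; everything else is an assumption made for illustration: the names `IntrospectionLog`, `explanation_depth`, `hybrid_score`, and `resolve`, the use of a capped word count as a stand-in for explanation depth, and the 0.7 weighting between confidence and depth.

```python
from dataclasses import dataclass


@dataclass
class IntrospectionLog:
    """One agent's introspection log (fields as listed in the abstract)."""
    agent: str             # hypothetical label, e.g. "pharmacology"
    proposed_output: str   # the agent's proposed answer
    reasoning: str         # supporting reasoning
    confidence: float      # assumed to lie in [0, 1]
    self_reflection: str   # the agent's self-critique


def explanation_depth(log: IntrospectionLog, cap: int = 50) -> float:
    """Assumed proxy for explanation depth: word count of the reasoning,
    capped at `cap` words and scaled to [0, 1]."""
    words = len(log.reasoning.split())
    return min(words, cap) / cap


def hybrid_score(log: IntrospectionLog, weight: float = 0.7) -> float:
    """Weighted mix of confidence and explanation depth.
    The weight is an illustrative assumption, not a value from the paper."""
    return weight * log.confidence + (1 - weight) * explanation_depth(log)


def resolve(logs: list[IntrospectionLog]) -> IntrospectionLog:
    """Pick the log with the highest hybrid score among conflicting responses."""
    return max(logs, key=hybrid_score)
```

Under these assumptions, a terse but confident answer can lose to a slightly less confident answer that is backed by a fuller explanation, which is the kind of trade-off a scoring rule combining confidence and explanation depth is meant to capture.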