🧑🏽⚕️ [PUBLISHED] AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow
🎉 We present AIPatient, a novel framework that enhances medical education and research through an advanced simulated patient system built upon Electronic Health Records (EHRs) and powered by Large Language Models (LLMs).
🔍 Overview
We developed AIPatient, an advanced simulated patient system with the AIPatient Knowledge Graph (AIPatient KG) as the input and the Reasoning Retrieval-Augmented Generation (Reasoning RAG) agentic workflow as the generation backbone. AIPatient KG samples data from Electronic Health Records (EHRs) in the Medical Information Mart for Intensive Care (MIMIC)-III database, producing a clinically diverse and relevant cohort of 1,495 patients with high knowledge base validity (F1 0.89). Reasoning RAG uses six LLM-powered agents that handle retrieval, KG query generation, abstraction, checking, rewriting, and summarization. This agentic framework reaches an overall accuracy of 94.15% in EHR-based medical Question Answering (QA), outperforming benchmarks that use either no agent or only partial agent integration. The system also shows high readability (median Flesch Reading Ease 77.23; median Flesch Kincaid Grade 5.6), robustness (ANOVA F-value 0.6126, p>0.1), and stability (ANOVA F-value 0.782, p>0.1). This performance highlights the system's potential to support a wide range of applications, including medical education, model evaluation, and system integration.
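For readers unfamiliar with the two readability metrics quoted above, the sketch below applies the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It assumes the word, sentence, and syllable counts are already available; how those counts were obtained for AIPatient responses is not stated here, and the function names are illustrative rather than taken from the project.

```python
def _check_counts(words: int, sentences: int) -> None:
    # Both formulas divide by these counts, so they must be positive.
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    _check_counts(words, sentences)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level: approximate US school grade."""
    _check_counts(words, sentences)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short reply: 100 words, 10 sentences, 130 syllables.
ease = flesch_reading_ease(100, 10, 130)
grade = flesch_kincaid_grade(100, 10, 130)
```

Short sentences and short words raise the Reading Ease score and lower the grade level, which is the direction the reported medians (77.23 and 5.6) point in.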
⚙️ Core Methodology
The methodology of the AIPatient system is primarily structured around two key components:
1. AIPatient Knowledge Graph (AIPatient KG)
This graph serves as the foundational knowledge base, containing patient data extracted from the Medical Information Mart for Intensive Care (MIMIC-III) database. Key features include:
- Diverse Patient Cohort: 1,495 patient profiles offering clinically diverse scenarios.
- High Accuracy: F1 score of 0.89, indicating substantial accuracy in capturing medical data through Named Entity Recognition (NER).
- Structured Organization: Medical entities such as symptoms, medical histories, allergies, and vital signs are organized into nodes, interlinked by edges depicting relationships (e.g., HAS_SYMPTOM).
- LLM-based NER: Utilizes advanced NER techniques to transform unstructured clinical notes into structured data.
- Graph Database: Efficiently stored and queried using Neo4j for complex relationship analysis.
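The node-and-edge organization described above can be illustrated with a small in-memory graph. This is a minimal sketch, not the project's Neo4j schema: the `TinyKG` class, the node identifiers, the `Patient`/`Symptom`/`Allergy` labels, and the `HAS_ALLERGY` relationship are all assumptions made for illustration, and the data is invented rather than drawn from MIMIC-III. Only `HAS_SYMPTOM` appears in the description above.

```python
from typing import Dict, List, Tuple


class TinyKG:
    """A toy labeled property graph: nodes carry a label, edges a relation type."""

    def __init__(self) -> None:
        self.labels: Dict[str, str] = {}                 # node id -> label
        self.edges: Dict[str, List[Tuple[str, str]]] = {}  # source -> [(relation, target)]

    def add_node(self, node_id: str, label: str) -> None:
        self.labels[node_id] = label

    def add_edge(self, source: str, relation: str, target: str) -> None:
        # Refuse edges that point at nodes the graph does not know about.
        if source not in self.labels or target not in self.labels:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.setdefault(source, []).append((relation, target))

    def neighbors(self, source: str, relation: str) -> List[str]:
        """Return targets reachable from `source` via edges of type `relation`."""
        return [t for (r, t) in self.edges.get(source, []) if r == relation]


# Invented example data for one simulated patient.
kg = TinyKG()
kg.add_node("patient:1", "Patient")
kg.add_node("symptom:chest_pain", "Symptom")
kg.add_node("allergy:penicillin", "Allergy")
kg.add_edge("patient:1", "HAS_SYMPTOM", "symptom:chest_pain")
kg.add_edge("patient:1", "HAS_ALLERGY", "allergy:penicillin")

symptoms = kg.neighbors("patient:1", "HAS_SYMPTOM")
```

A graph database such as Neo4j plays the same role at scale, with the added ability to express multi-hop patterns declaratively instead of walking adjacency lists by hand.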
2. Reasoning Retrieval-Augmented Generation Workflow
This multi-agent system provides a dynamic method for processing natural language queries and generating context-appropriate responses across three primary stages:
Retrieval Stage
- Retrieval Agent: Selects relevant nodes and relationships from the AIPatient KG based on queries.
- KG Query Generation Agent: Formulates Cypher queries to extract pertinent information.
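To make the output of the retrieval stage concrete, the sketch below builds the kind of Cypher query the KG Query Generation Agent might produce once a relationship type and target node label have been selected. In AIPatient the query is written by an LLM; this deterministic helper only illustrates the target shape. The function name `build_cypher`, the `Patient` and `Symptom` labels, and the `id` and `name` properties are assumptions, not the project's actual schema.

```python
import re

# Relationship types and labels are spliced into the query text here rather
# than passed as ordinary parameters, so they are restricted to plain identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_cypher(relation: str, target_label: str) -> str:
    """Return a Cypher query that follows one relationship from a patient node.

    The patient id is left as the `$patient_id` parameter so the value is
    supplied separately when the query is executed.
    """
    for name in (relation, target_label):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a safe identifier: {name!r}")
    return (
        f"MATCH (p:Patient {{id: $patient_id}})-[:{relation}]->(n:{target_label}) "
        "RETURN n.name AS name"
    )


query = build_cypher("HAS_SYMPTOM", "Symptom")
```

Validating the identifiers before interpolation matters when the relationship type comes from model output, since a malformed value would otherwise change the structure of the query.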
Reasoning Stage
- Abstraction Agent: Simplifies and rephrases user queries into more general forms.
- Checker Agent: Verifies alignment between retrieved information and user inquiries.
Generation Stage
- Rewrite Agent: Transforms KG results into natural language with simulated patient personality traits.
- Summarization Agent: Updates and maintains context for multi-turn interactions.
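One way the six agents could be wired together for a single conversational turn is sketched below. The control flow is an assumption: the description above names the agents and their stages but not the exact ordering, retry policy, or interfaces, so the `Agents` container, the `answer_turn` function, the `run_query` callable (standing in for the graph database), and the `max_attempts` parameter are all hypothetical. Each agent is a plain callable so that LLM-backed implementations or simple stubs can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Agents:
    abstract: Callable[[str], str]                    # Abstraction Agent
    retrieve: Callable[[str], List[str]]              # Retrieval Agent
    generate_query: Callable[[str, List[str]], str]   # KG Query Generation Agent
    run_query: Callable[[str], List[str]]             # hypothetical database call
    check: Callable[[str, List[str]], bool]           # Checker Agent
    rewrite: Callable[[str, List[str]], str]          # Rewrite Agent
    summarize: Callable[[str, str, str], str]         # Summarization Agent


def answer_turn(
    question: str, summary: str, agents: Agents, max_attempts: int = 2
) -> Tuple[str, str]:
    """Answer one question and return (reply, updated conversation summary)."""
    general = agents.abstract(question)
    results: List[str] = []
    for _ in range(max_attempts):
        schema = agents.retrieve(general)
        cypher = agents.generate_query(general, schema)
        results = agents.run_query(cypher)
        if agents.check(question, results):
            break
    else:
        # No attempt passed the checker: answer without unsupported facts.
        results = []
    reply = agents.rewrite(question, results)
    return reply, agents.summarize(summary, question, reply)
```

Discarding results that fail the check, rather than passing them to the Rewrite Agent, is a design choice made in this sketch to keep the simulated patient from stating information the knowledge graph does not support.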
💻 Technical Specifications
🌟 Key Contributions
- Advanced Knowledge Graph: AIPatient KG provides a robust foundation with high validity (F1 0.89) derived from real EHR data.
- Multi-Agent Architecture: Six specialized LLM agents working in concert for comprehensive patient simulation.
- Superior Performance: 94.15% accuracy, outperforming existing benchmarks.
- High Readability: Ensures accessible communication for educational purposes.
- Robustness and Stability: Consistent performance across varied scenarios and question formulations.
🚀 Applications and Impact
The AIPatient framework offers transformative potential for medical education, model evaluation, and system integration.
🔮 Future Directions
Broader Data Scope
Incorporating a wider variety of healthcare scenarios to build more extensive knowledge bases.
Multimodal Integration
Adding medical imaging and other data modalities to simulate full clinical inputs.
Real-time Clinical
Adapting environments for live clinical settings and interactive deployment.
Enhanced Training
Further enriching educational landscapes and outcomes for healthcare professionals.
📝 Conclusion
The AIPatient framework takes a structured approach to patient simulation, combining EHR-derived data with LLM-powered interaction and emphasizing structured retrieval to support medical training and decision-making. Embedding such systems in medical curricula and research protocols could improve how interactions with simulated patients contribute to educational outcomes and clinical preparedness.
📖 Welcome to Cite Our Article
If you find our work helpful, please consider citing it: