An enterprise-grade AI chatbot that lets Health and Life Sciences (HLS) professionals query synthetic patient data using plain English. It converts natural-language questions into SQL, executes them against a Microsoft Fabric Lakehouse, and returns actionable insights — no SQL knowledge required.
| Challenge | How This App Helps |
|---|---|
| Clinical analysts need data but don't write SQL | Natural language → SQL via Fabric Data Agent |
| Healthcare data is sensitive and siloed | Azure AD authentication, no data leaves Fabric |
| Reporting cycles are slow | Real-time answers in seconds from a chat interface |
| Onboarding new analysts takes weeks | Self-service: just type a question and get an answer |
┌─────────────────┐      HTTPS / OAuth 2.0      ┌───────────────────────┐
│ Streamlit UI    │ ◄──────────────────────────►│ Microsoft Fabric      │
│ (Azure          │      OpenAI-compatible      │ Data Agent            │
│ Container       │      Assistants API         │ (NL → SQL engine)     │
│ Apps)           │                             │                       │
└─────────────────┘                             │  ┌─────────────────┐  │
                                                │  │ Fabric Lakehouse│  │
Azure AD Service Principal                      │  │ (Synthea data)  │  │
or Managed Identity                             │  └─────────────────┘  │
                                                └───────────────────────┘
API flow: Create assistant → Create thread → Post question → Poll for run completion → Retrieve answer → Clean up thread
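The flow above can be sketched in Python as follows. This is a minimal illustration, not the app's actual code: it assumes `client` is an `openai.OpenAI` instance already pointed at the Data Agent endpoint with a valid bearer token, and the function name `ask_data_agent`, the `model` placeholder, the polling interval, and the timeout are all hypothetical choices.

```python
import time

# Run states that mean "keep polling"; any other state ends the loop.
PENDING = {"queued", "in_progress"}


def ask_data_agent(client, question, poll_interval=2.0, timeout=120.0):
    """Send one question through the Assistants-style flow and return the answer text.

    `client` is assumed to expose the OpenAI SDK's beta Assistants interface.
    """
    # 1. Create assistant (the model value is a placeholder assumption).
    assistant = client.beta.assistants.create(model="not used")
    # 2. Create thread.
    thread = client.beta.threads.create()
    try:
        # 3. Post question.
        client.beta.threads.messages.create(
            thread_id=thread.id, role="user", content=question
        )
        run = client.beta.threads.runs.create(
            thread_id=thread.id, assistant_id=assistant.id
        )
        # 4. Poll for run completion, giving up after `timeout` seconds.
        deadline = time.monotonic() + timeout
        while run.status in PENDING:
            if time.monotonic() > deadline:
                raise TimeoutError(f"Run {run.id} did not finish in {timeout}s")
            time.sleep(poll_interval)
            run = client.beta.threads.runs.retrieve(
                thread_id=thread.id, run_id=run.id
            )
        if run.status != "completed":
            raise RuntimeError(f"Run ended with status {run.status!r}")
        # 5. Retrieve answer: the last assistant message in the thread.
        messages = client.beta.threads.messages.list(thread_id=thread.id, order="asc")
        answers = [m for m in messages.data if m.role == "assistant"]
        return answers[-1].content[0].text.value if answers else ""
    finally:
        # 6. Clean up thread, even when polling failed.
        client.beta.threads.delete(thread_id=thread.id)
```

Putting the cleanup in `finally` keeps threads from piling up when a run fails or times out.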
The Lakehouse contains ~150 000 records across 19 tables generated by Synthea, a widely used open-source synthetic patient generator. No real PHI is present.
| Category | Tables | Example Columns |
|---|---|---|
| Patient | patients, allergies, careplans, immunizations | Id, Gender, Race, BirthDate, Healthcare_Expenses |
| Clinical | conditions, medications, procedures, observations | Description, Code, Base_Cost, TotalCost, Value, Units |
| Administrative | encounters, organizations, providers, payers | EncounterClass, Total_Claim_Cost, Revenue, Speciality |
├── src/ # Application source
│ ├── Home.py # Landing page
│ ├── pages/
│ │ └── 01-Healthcare_Agent.py # Chat interface
│ ├── services/
│ │ ├── agent_provider.py # Azure AI Foundry integration
│ │ ├── tool_provider.py # Fabric & Genie tool init
│ │ └── genie_functions.py # Databricks Genie integration
│ ├── config.json # Agent namespace config
│ ├── requirements.txt # Python deps
│ └── env.example # Environment variable template
├── tests/
│ ├── stress_test_healthcare_agent.py # 50+ queries, 8 categories
│ └── quick_test.py # Connectivity smoke test
├── docs/
│ └── STRESS_TEST_SUMMARY.md # Test results & methodology
├── Dockerfile # Container build
├── deploy-azure.ps1 # One-command Azure deployment
└── README.md
| Requirement | Details |
|---|---|
| Python | 3.11 or later |
| Azure CLI | Installed and authenticated (az login) |
| Microsoft Fabric | Workspace with a Data Agent configured (F64+ capacity) |
| Azure AD App Registration | With Fabric API permissions granted |
git clone https://github.com/anuraagr/enterprise-data-agents-app-fabric-api.git
cd enterprise-data-agents-app-fabric-api
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # macOS / Linux
pip install -r src/requirements.txt
cp src/env.example src/.env
Open src/.env and fill in the required values:
| Variable | Description |
|---|---|
| FABRIC_WORKSPACE_ID | GUID of your Fabric workspace |
| FABRIC_ARTIFACT_ID | GUID of your Data Agent artifact |
| FABRIC_CLIENT_ID | Azure AD App Registration client ID |
| FABRIC_CLIENT_SECRET | Azure AD App Registration client secret |
| FABRIC_TENANT_ID | Azure AD tenant ID |
Where to find these: Open Microsoft Fabric → your workspace → Data Agent → Settings. The workspace ID and artifact ID are in the URL. The App Registration values come from Azure Portal → App registrations.
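A quick way to catch an incomplete configuration before launching the app is to check that every variable in the table has a non-blank value. The sketch below is illustrative only: the helper name `missing_settings` is hypothetical, it treats all five variables as required (as the table does), and it inspects the process environment or a plain mapping rather than parsing `src/.env` itself.

```python
import os

# Variable names taken from the table above.
REQUIRED_SETTINGS = (
    "FABRIC_WORKSPACE_ID",
    "FABRIC_ARTIFACT_ID",
    "FABRIC_CLIENT_ID",
    "FABRIC_CLIENT_SECRET",
    "FABRIC_TENANT_ID",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


# Example with a partially filled mapping: a whitespace-only value counts as missing.
example = {"FABRIC_WORKSPACE_ID": "<workspace-guid>", "FABRIC_TENANT_ID": "  "}
print(missing_settings(example))
# → ['FABRIC_ARTIFACT_ID', 'FABRIC_CLIENT_ID', 'FABRIC_CLIENT_SECRET', 'FABRIC_TENANT_ID']
```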
cd src
streamlit run Home.py
The app opens at http://localhost:8501. Navigate to the Healthcare Agent page and start asking questions.
.\deploy-azure.ps1
This creates a resource group, an Azure Container Registry (ACR), and a Container App with managed identity, all in one command.
Once the agent is running, try these:
| Question | What it does |
|---|---|
| "What tables are available?" | Schema discovery |
| "Show the top 10 conditions by patient count" | Condition prevalence analysis |
| "What is the average medication cost by drug?" | Pharmacy cost analytics |
| "How many encounters per encounter class?" | Utilization breakdown |
| "Which providers have the most patients?" | Provider workload distribution |
| "Show healthcare spending by gender and race" | Health equity / demographic analysis |
| "List patients with diabetes and their medications" | Comorbidity + treatment join |
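To make the natural-language-to-SQL step concrete, here is roughly the kind of query the second question might translate to, run against a tiny in-memory SQLite table instead of the Lakehouse. Everything here is an assumption for illustration: the `Patient` column name, the sample rows, the placeholder codes, and the SQL itself, which the Data Agent may phrase differently.

```python
import sqlite3

# Toy stand-in for the `conditions` table; the real schema lives in the Lakehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conditions (Patient TEXT, Code TEXT, Description TEXT)")
conn.executemany(
    "INSERT INTO conditions VALUES (?, ?, ?)",
    [
        ("p1", "C1", "Diabetes"),
        ("p2", "C1", "Diabetes"),
        ("p2", "C2", "Hypertension"),
        ("p1", "C1", "Diabetes"),  # repeat diagnosis for the same patient
    ],
)

# "Show the top 10 conditions by patient count"
TOP_CONDITIONS_SQL = """
SELECT Description, COUNT(DISTINCT Patient) AS patient_count
FROM conditions
GROUP BY Description
ORDER BY patient_count DESC, Description
LIMIT 10
"""

rows = conn.execute(TOP_CONDITIONS_SQL).fetchall()
print(rows)  # → [('Diabetes', 2), ('Hypertension', 1)]
```

Counting distinct patients rather than rows keeps a repeated diagnosis from inflating a condition's rank.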
# Quick connectivity check
python tests/quick_test.py
# Run all 50+ stress-test queries (with 3-second delay between queries)
python tests/stress_test_healthcare_agent.py --all --delay 3
# Run a single category
python tests/stress_test_healthcare_agent.py --category patient_queries
See docs/STRESS_TEST_SUMMARY.md for detailed results.
The app tries credentials in this order:
- Client Secret: production deployments with an App Registration
- Managed Identity: Azure Container Apps with system-assigned identity
- Azure CLI: local development (az login)
For local development, having a valid az login session is the simplest path.
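The try-in-order behavior described above can be illustrated with a small, library-free sketch. The provider functions below are stand-ins that simulate a local machine where only the Azure CLI session works. How the app actually implements the chain is not stated here; the `azure-identity` package offers the same pattern through `ChainedTokenCredential`, but its use in this project is an assumption.

```python
def first_available_token(providers):
    """Try each (name, get_token) pair in order and return the first success."""
    errors = []
    for name, get_token in providers:
        try:
            return name, get_token()
        except Exception as exc:  # a failed provider falls through to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No credential succeeded: " + "; ".join(errors))


# Stand-in providers (hypothetical) simulating a developer laptop.
def client_secret():
    raise ValueError("FABRIC_CLIENT_SECRET is not set")


def managed_identity():
    raise ConnectionError("no managed identity endpoint on this machine")


def azure_cli():
    return "token-from-az-login"


source, token = first_available_token(
    [
        ("client_secret", client_secret),
        ("managed_identity", managed_identity),
        ("azure_cli", azure_cli),
    ]
)
print(source)  # → azure_cli
```

Collecting each provider's error message makes the final failure easier to diagnose when none of the three credentials works.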
| Symptom | Fix |
|---|---|
| "Fabric capacity not active" | Resume the capacity in the Fabric Admin Portal → Capacity settings → Resume |
| Authentication errors | Run az login, verify .env values, check App Registration API permissions |
| Query timeouts | Complex joins can take 60–120 s. The agent retries automatically with exponential backoff. |
| Empty FABRIC_WORKSPACE_ID | Make sure src/.env exists and is populated; the app won't fall back to hardcoded IDs. |
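For readers unfamiliar with the retry strategy mentioned in the timeouts row, exponential backoff doubles the wait after each failed attempt. The sketch below shows the general technique only; the function name, attempt count, and delays are hypothetical and are not taken from the app's code.

```python
import time


def retry_with_backoff(call, attempts=4, base_delay=2.0, max_delay=60.0, sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each timeout."""
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last timeout
            # Waits grow as base_delay * 2**attempt (2 s, 4 s, 8 s, ...), capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

Passing `sleep` as a parameter lets tests swap in a recorder instead of actually waiting.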
| File | Purpose |
|---|---|
| src/Home.py | Landing page with stats, schema overview, and navigation |
| src/pages/01-Healthcare_Agent.py | Chat UI: handles auth, API calls, retries, upload, export |
| src/services/agent_provider.py | Azure AI Foundry async agent lifecycle |
| src/services/tool_provider.py | Initialises Fabric + Genie toolset |
| src/services/genie_functions.py | Databricks Genie NL-to-SQL bridge |
| deploy-azure.ps1 | Automated Azure Container Apps deployment |
| Dockerfile | Python 3.11-slim container with health check |
- Synthea — synthetichealth/synthea for realistic synthetic patient data
- Microsoft Fabric — Data Agent API for natural-language SQL
- Azure AI Foundry — Agent management SDK
- Original repo — mcaps-microsoft/enterprise-data-agents
Built for Health and Life Sciences teams exploring AI-powered data access with Microsoft Fabric.