Healthcare Data Agent — Microsoft Fabric + Azure AI

An enterprise-grade AI chatbot that lets Health and Life Sciences (HLS) professionals query synthetic patient data using plain English. It converts natural-language questions into SQL, executes them against a Microsoft Fabric Lakehouse, and returns actionable insights — no SQL knowledge required.

Why This Matters for Health & Life Sciences

Challenge	How This App Helps
Clinical analysts need data but don't write SQL	Natural language → SQL via Fabric Data Agent
Healthcare data is sensitive and siloed	Azure AD authentication, no data leaves Fabric
Reporting cycles are slow	Real-time answers in seconds from a chat interface
Onboarding new analysts takes weeks	Self-service: just type a question and get an answer

Architecture

┌─────────────────┐     HTTPS / OAuth 2.0      ┌───────────────────────┐
│  Streamlit UI   │ ◄──────────────────────────►│  Microsoft Fabric     │
│  (Azure         │     OpenAI-compatible       │  Data Agent           │
│   Container     │     Assistants API          │  (NL → SQL engine)    │
│   Apps)         │                             │                       │
└─────────────────┘                             │  ┌─────────────────┐  │
                                                │  │ Fabric Lakehouse│  │
        Azure AD Service Principal              │  │ (Synthea data)  │  │
        or Managed Identity                     │  └─────────────────┘  │
                                                └───────────────────────┘

API flow: Create assistant → Create thread → Post question → Poll for run completion → Retrieve answer → Clean up thread
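
The flow above can be sketched as a single function. This is a minimal illustration, not the app's actual code: it assumes `client` exposes the Assistants (beta) methods of the OpenAI Python SDK, and `ask_data_agent`, the placeholder model name, and the timeout values are hypothetical.

```python
import time


def ask_data_agent(client, question, poll_seconds=2.0, timeout_seconds=300.0,
                   sleep=time.sleep):
    """Run one question through the assistant/thread/run cycle; return the reply text."""
    # Create assistant. The model name is a placeholder (assumption).
    assistant = client.beta.assistants.create(model="not-used")
    # Create thread.
    thread = client.beta.threads.create()
    try:
        # Post question.
        client.beta.threads.messages.create(
            thread_id=thread.id, role="user", content=question
        )
        run = client.beta.threads.runs.create(
            thread_id=thread.id, assistant_id=assistant.id
        )

        # Poll for run completion, giving up after timeout_seconds of waiting.
        waited = 0.0
        while run.status in ("queued", "in_progress"):
            if waited >= timeout_seconds:
                raise TimeoutError(f"run still {run.status} after {timeout_seconds} s")
            sleep(poll_seconds)
            waited += poll_seconds
            run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

        if run.status != "completed":
            raise RuntimeError(f"run ended with status {run.status!r}")

        # Retrieve answer: the last assistant message in the thread.
        messages = client.beta.threads.messages.list(thread_id=thread.id, order="asc")
        replies = [m for m in messages.data if m.role == "assistant"]
        return replies[-1].content[0].text.value if replies else ""
    finally:
        # Clean up thread, even when the run failed or timed out.
        client.beta.threads.delete(thread.id)
```

With the `openai` package, the `client` passed in would be one whose base URL points at the published Data Agent endpoint and whose requests carry an Azure AD bearer token. Neither the endpoint URL nor the token handling is shown here.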

Dataset: Synthea Synthetic Healthcare

The Lakehouse contains ~150,000 records across 19 tables generated by Synthea, a widely used open-source synthetic patient generator. No real PHI is present.

Category	Tables	Example Columns
Patient	`patients`, `allergies`, `careplans`, `immunizations`	Id, Gender, Race, BirthDate, Healthcare_Expenses
Clinical	`conditions`, `medications`, `procedures`, `observations`	Description, Code, Base_Cost, TotalCost, Value, Units
Administrative	`encounters`, `organizations`, `providers`, `payers`	EncounterClass, Total_Claim_Cost, Revenue, Speciality

Project Structure

├── src/                          # Application source
│   ├── Home.py                   # Landing page
│   ├── pages/
│   │   └── 01-Healthcare_Agent.py  # Chat interface
│   ├── services/
│   │   ├── agent_provider.py     # Azure AI Foundry integration
│   │   ├── tool_provider.py      # Fabric & Genie tool init
│   │   └── genie_functions.py    # Databricks Genie integration
│   ├── config.json               # Agent namespace config
│   ├── requirements.txt          # Python deps
│   └── env.example               # Environment variable template
├── tests/
│   ├── stress_test_healthcare_agent.py   # 50+ queries, 8 categories
│   └── quick_test.py                     # Connectivity smoke test
├── docs/
│   └── STRESS_TEST_SUMMARY.md    # Test results & methodology
├── Dockerfile                    # Container build
├── deploy-azure.ps1              # One-command Azure deployment
└── README.md

Getting Started

Prerequisites

Requirement	Details
Python	3.11 or later
Azure CLI	Installed and authenticated (`az login`)
Microsoft Fabric	Workspace with a Data Agent configured (F64+ capacity)
Azure AD App Registration	With Fabric API permissions granted

1. Clone & install

git clone https://github.com/anuraagr/enterprise-data-agents-app-fabric-api.git
cd enterprise-data-agents-app-fabric-api

python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS / Linux

pip install -r src/requirements.txt

2. Configure environment

cp src/env.example src/.env

Open src/.env and fill in the required values:

Variable	Description
`FABRIC_WORKSPACE_ID`	GUID of your Fabric workspace
`FABRIC_ARTIFACT_ID`	GUID of your Data Agent artifact
`FABRIC_CLIENT_ID`	Azure AD App Registration client ID
`FABRIC_CLIENT_SECRET`	Azure AD App Registration client secret
`FABRIC_TENANT_ID`	Azure AD tenant ID

Where to find these: Open Microsoft Fabric → your workspace → Data Agent → Settings. The workspace ID and artifact ID are in the URL. The App Registration values come from Azure Portal → App registrations.
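
A quick way to catch a half-filled `.env` before starting the app is to check the five variables up front. The sketch below is illustrative only: `missing_fabric_settings` is a hypothetical helper, it assumes the values from `src/.env` have already been loaded into the process environment, and it treats all five variables as required, as the table suggests.

```python
import os

# Variable names taken from the table above.
REQUIRED_FABRIC_VARS = (
    "FABRIC_WORKSPACE_ID",
    "FABRIC_ARTIFACT_ID",
    "FABRIC_CLIENT_ID",
    "FABRIC_CLIENT_SECRET",
    "FABRIC_TENANT_ID",
)


def missing_fabric_settings(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [
        name for name in REQUIRED_FABRIC_VARS
        if not str(env.get(name, "")).strip()
    ]
```

Calling `missing_fabric_settings()` with no arguments inspects the current environment. An empty list means every variable has a non-blank value, although it does not prove the GUIDs or the secret are correct.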

3. Run locally

cd src
streamlit run Home.py

The app opens at http://localhost:8501. Navigate to the Healthcare Agent page and start asking questions.

4. Deploy to Azure Container Apps (optional)

.\deploy-azure.ps1

This creates a resource group, an Azure Container Registry (ACR), and a Container App with managed identity, all in one command.

Sample Queries

Once the agent is running, try these:

Question	What it does
"What tables are available?"	Schema discovery
"Show the top 10 conditions by patient count"	Condition prevalence analysis
"What is the average medication cost by drug?"	Pharmacy cost analytics
"How many encounters per encounter class?"	Utilization breakdown
"Which providers have the most patients?"	Provider workload distribution
"Show healthcare spending by gender and race"	Health equity / demographic analysis
"List patients with diabetes and their medications"	Comorbidity + treatment join

Testing

# Quick connectivity check
python tests/quick_test.py

# Run all 50+ stress-test queries (with 3-second delay between queries)
python tests/stress_test_healthcare_agent.py --all --delay 3

# Run a single category
python tests/stress_test_healthcare_agent.py --category patient_queries

See docs/STRESS_TEST_SUMMARY.md for detailed results.

Authentication

The app tries credentials in this order:

1. Client Secret: production deployments with an App Registration
2. Managed Identity: Azure Container Apps with system-assigned identity
3. Azure CLI: local development (`az login`)

For local development, having a valid az login session is the simplest path.
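
The ordering above can be expressed as a small decision function. This is a sketch under assumptions, not the app's implementation: `pick_auth_mode` is a hypothetical name, and using the `IDENTITY_ENDPOINT` environment variable to detect a managed identity is an assumption about the hosting environment.

```python
import os


def pick_auth_mode(env=None):
    """Return which credential type would be tried first for this environment."""
    env = os.environ if env is None else env

    # 1. Client secret: needs all three App Registration values.
    secret_vars = ("FABRIC_CLIENT_ID", "FABRIC_CLIENT_SECRET", "FABRIC_TENANT_ID")
    if all(env.get(name) for name in secret_vars):
        return "client_secret"

    # 2. Managed identity: assumed to be signalled by IDENTITY_ENDPOINT.
    if env.get("IDENTITY_ENDPOINT"):
        return "managed_identity"

    # 3. Azure CLI: fall back to the local `az login` session.
    return "azure_cli"
```

If the app uses the `azure-identity` package, a `ChainedTokenCredential` built from `ClientSecretCredential`, `ManagedIdentityCredential`, and `AzureCliCredential` in that order would give the same precedence without the explicit checks.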

Troubleshooting

Symptom	Fix
"Fabric capacity not active"	Resume the capacity in the Fabric Admin Portal → Capacity settings → Resume
Authentication errors	Run `az login`, verify `.env` values, check App Registration API permissions
Query timeouts	Complex joins can take 60–120 s. The agent retries automatically with exponential backoff.
Empty `FABRIC_WORKSPACE_ID`	Make sure `src/.env` exists and is populated — the app won't fall back to hardcoded IDs.
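
For readers unfamiliar with the retry behaviour mentioned for query timeouts, exponential backoff looks roughly like this. The helper name, attempt count, and delay values below are illustrative assumptions and not the settings the agent actually uses.

```python
import random
import time


def retry_with_backoff(call, max_attempts=4, base_delay=2.0, max_delay=60.0,
                       jitter=0.1, retry_on=(TimeoutError,), sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # 2 s, 4 s, 8 s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * jitter))
```

Wrapping a single query call, for example `retry_with_backoff(lambda: run_query(question))` with a hypothetical `run_query`, retries only on the exception types listed in `retry_on` and lets every other error propagate immediately.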

Key Components

File	Purpose
`src/Home.py`	Landing page with stats, schema overview, and navigation
`src/pages/01-Healthcare_Agent.py`	Chat UI — handles auth, API calls, retries, upload, export
`src/services/agent_provider.py`	Azure AI Foundry async agent lifecycle
`src/services/tool_provider.py`	Initialises Fabric + Genie toolset
`src/services/genie_functions.py`	Databricks Genie NL-to-SQL bridge
`deploy-azure.ps1`	Automated Azure Container Apps deployment
`Dockerfile`	Python 3.11-slim container with health check

Acknowledgements

Synthea — synthetichealth/synthea for realistic synthetic patient data
Microsoft Fabric — Data Agent API for natural-language SQL
Azure AI Foundry — Agent management SDK
Original repo — mcaps-microsoft/enterprise-data-agents

Built for Health and Life Sciences teams exploring AI-powered data access with Microsoft Fabric.
