DataGuard is a SQL Server data validation console for finding broken data before it becomes business truth. It combines a Streamlit operations UI, a Python service layer, T-SQL validation functions and procedures, structured validation logs, and run history tables so teams can define rules, execute checks, inspect failures, and trace validation activity from run to log.
- Japandi Dark Streamlit console with a restrained operational UI, shared design system, focused navigation, and keyboard-visible focus states.
- Six-page validation workflow covering Welcome, Health Dashboard, Run Validations, Results & History, Rule Manager, and Operational Logs.
- Three-step validation wizard for selecting table scope, choosing rule execution posture, and confirming a validation run.
- Dynamic SQL validation execution through SQL Server procedures and
dbo.vf_*scalar validation functions. - Structured validation logs stored in
validation_logswith severity, validation ID, correlation ID, status, payload JSON, exception details, and indexed filtering fields. - Live operational log console with severity filters, minimum severity mode, search, grouped display modes, event details, and copy/filter actions.
- Per-rule result tracking in
validation_rule_results, linked to validation run history for pass/fail rates and duration analysis. - Rule Manager for adding manual rules, accepting suggestions, and reviewing active rulesets against the connected database.
- SQL Server connection picker with ODBC driver discovery, Windows authentication, SQL authentication, and database selection.
- Separated project layers for pages, reusable components, services, SQL migrations/procedures/functions, utilities, tests, and local setup docs.
No formal releases have been published yet. Clone main for the latest code:
git clone https://github.com/LSZTM/B1_Training_Project.git
cd B1_Training_Projectfrom utils.db import get_available_drivers, test_connection
# Check which SQL Server ODBC drivers are installed.
drivers = get_available_drivers()
print(drivers)
# ['{ODBC Driver 17 for SQL Server}', '{ODBC Driver 18 for SQL Server}']
result = test_connection()
print(result)
# {'success': True, 'driver': '{ODBC Driver 17 for SQL Server}'}streamlit run main.pyDataGuard opens a SQL Server connection screen when no active database is selected. After connection, the app launches into the Welcome page and exposes the full validation console through the sidebar.
Validation failures are recorded in run history, rule result tables, error logs, and structured operational logs. A failed validation event includes fields such as severity, validation ID, correlation ID, rule code, validation status, duration, payload JSON, and stack trace when available:
severity: ERROR
event_type: validation.failed
validation_status: FAILED
message: Validation failed for Customers.Email using IsEmail.
correlation_id: 03A0C1A5-BDEF-4913-9F22-2AF91B68745A
Use the Operational Logs page to filter by severity, minimum severity, validation ID, rule ID, status, time range, and free-text search.
| Rule | Description | Example |
|---|---|---|
vf_NOT_NULL |
Column value must not be NULL | FirstName IS NOT NULL |
vf_trimmed |
Text must not contain leading or trailing whitespace | Name = LTRIM(RTRIM(Name)) |
vf_IsEmail |
Text must be a valid email shape | Email LIKE '%@%.%' |
vf_HasLength |
String length must stay within the configured limit | LEN(Code) <= 50 |
vf_age_range |
Numeric age must fall within the allowed range | Age BETWEEN 0 AND 150 |
vf_date_not_in_future |
Date value must not be later than the current date | BirthDate <= GETDATE() |
Rules are configured through
temp_validation_configand executed by SQL Server procedures that call matchingdbo.vf_*validation functions.
| Function | Returns | Description |
|---|---|---|
test_connection() |
dict |
Attempts a DB handshake and returns connection status details |
get_connection() |
pyodbc.Connection |
Returns an open SQL Server connection for the selected database |
get_available_drivers() |
list[str] |
Lists installed SQL Server-compatible ODBC drivers |
discover_server_connection() |
str |
Builds a server-level connection string for database discovery |
list_databases() |
list[str] |
Lists user databases available on the selected SQL Server instance |
switch_database() |
None |
Updates the active database connection for the Streamlit session |
| Function | Returns | Description |
|---|---|---|
get_available_drivers() |
list[str] |
Filters installed ODBC drivers to SQL Server-compatible options |
| Page | Route | Description |
|---|---|---|
| Welcome | pages/1_Welcome.py |
Product entry point, connection context, and next-step actions |
| Health Dashboard | pages/2_Dashboard.py |
Validation health, records scanned, failures, and quality charts |
| Run Validations | pages/3_Execute.py |
Three-step wizard for selecting scope, options, and execution |
| Results & History | pages/4_History.py |
Run log, error explorer, and quality trend review |
| Rule Manager | pages/5_Rules.py |
Add suggested/manual rules and inspect the active ruleset |
| Operational Logs | pages/6_Logs.py |
Live structured log stream with filters and event drill-down |
- Fork the repository on GitHub.
- Create a branch for your change:
git checkout -b feat/your-feature-name
- Make your changes, adding tests under
tests/where applicable. - Open a Pull Request against
mainwith a clear description of what changed and why.
Please keep SQL scalar functions prefixed dbo.vf_*, validation orchestration in services/validation_service.py, and structured log behavior in services/logs_service.py so UI, API, and SQL behavior remain aligned.
Active development. Core validation execution, rule management, structured logging, run history, per-rule result tracking, and the Streamlit console are functional. Current focus areas include hardening SQL migrations, expanding rule coverage, and improving local demo reliability.
This project is licensed under the MIT License. See LICENSE for details.
Open an issue on GitHub for bug reports or feature requests. For DB connectivity problems, confirm that ODBC Driver 17 or 18 for SQL Server is installed, SQL Server is running, and the selected database is reachable from your machine.