Home

observability-dashboard

The central nervous system of the Alex-CloudOps observability ecosystem: it aggregates telemetry, uptime, incident, and log intelligence data from four production-grade repositories into a single unified health dashboard with Power BI-ready exports.

What Is This?

Every serious CloudOps operation needs a single pane of glass — one place where the health of the entire infrastructure is visible, scored, and actionable.

observability-dashboard is that place.

It pulls live data from all four portfolio repositories, transforms and normalizes each data source, calculates component and ecosystem-wide health scores, and exports the results in three formats — including a structured Power BI dataset ready for executive-level visualization.

This is where the entire portfolio converges. This is the view a CloudOps engineer, SRE, or NOC manager actually needs.

Wiki Navigation

| Page | Description |
| --- | --- |
| Architecture | System design, data flow, and component breakdown |
| Setup & Configuration | Environment setup and configuration reference |
| Usage Guide | Running the dashboard and interpreting outputs |
| Runbook | NOC-style procedures for ecosystem health response |
| Troubleshooting | Common issues, error messages, and fixes |
| Roadmap | Planned features including full Power BI integration |

Quick Stats

| Property | Detail |
| --- | --- |
| Language | Python 3.x |
| Data Sources | 4 portfolio repositories |
| Health Levels | HEALTHY, DEGRADED, CRITICAL |
| Export Formats | JSON summary, CSV, Power BI dataset |
| Cloud Provider | AWS (Free Tier compatible) |
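
The three health levels lend themselves to an ordered enum, which makes a "worst component wins" roll-up a one-liner. The sketch below is illustrative only: `Health`, `ecosystem_health`, and the rule that any open critical incident forces the ecosystem to CRITICAL are assumptions, not the dashboard's documented scoring logic.

```python
from __future__ import annotations

from enum import IntEnum


class Health(IntEnum):
    """The three levels listed in Quick Stats, ordered by severity."""
    HEALTHY = 0
    DEGRADED = 1
    CRITICAL = 2


def ecosystem_health(components: dict[str, Health],
                     open_critical_incidents: int = 0) -> Health:
    """Hypothetical roll-up rule (an assumption, not the real scoring):
    any open critical incident forces CRITICAL, otherwise the worst
    component level is reported."""
    if open_critical_incidents > 0:
        return Health.CRITICAL
    return max(components.values(), default=Health.HEALTHY)


components = {
    "telemetry": Health.HEALTHY,
    "uptime": Health.DEGRADED,
    "log_intel": Health.HEALTHY,
}
print(ecosystem_health(components).name)                             # DEGRADED
print(ecosystem_health(components, open_critical_incidents=3).name)  # CRITICAL
```

Using `IntEnum` keeps the comparison explicit: `max` picks the most severe level without a hand-written ranking table.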

Ecosystem Data Sources

| Repository | Data Type | Metrics |
| --- | --- | --- |
| `cloud-telemetry-agent` | Infrastructure telemetry | CPU, memory, disk |
| `synthetic-uptime-monitor` | Endpoint uptime | Availability, response times |
| `incident-alert-pipeline` | Incident records | Severity counts, open/closed |
| `log-intelligence-engine` | Log intelligence | Error rates, health status |
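
Because each repository reports different metrics, a common record shape makes the later stages simpler. The following is a minimal sketch of that normalization idea; `MetricRecord`, `normalize`, and the raw key names (`cpu`, `memory`, `disk`) are hypothetical and not taken from the project's code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRecord:
    """One normalized data point (hypothetical shape)."""
    source: str   # repository that produced the value
    metric: str   # metric name, e.g. "cpu"
    value: float  # numeric reading
    unit: str     # display unit, e.g. "%" or "ms"


def normalize(source: str, raw: dict[str, object],
              units: dict[str, str]) -> list[MetricRecord]:
    """Flatten one source's raw payload into uniform records.
    Values are coerced to float; unknown metrics get an empty unit."""
    return [
        MetricRecord(source, name, float(value), units.get(name, ""))
        for name, value in sorted(raw.items())
    ]


records = normalize(
    "cloud-telemetry-agent",
    {"cpu": 8.5, "memory": "87.0", "disk": 17.9},  # mixed types on purpose
    {"cpu": "%", "memory": "%", "disk": "%"},
)
for record in records:
    print(f"{record.source}: {record.metric} = {record.value}{record.unit}")
```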

At a Glance

cloud-telemetry-agent    ──┐
synthetic-uptime-monitor ──┤
incident-alert-pipeline  ──┤──▶ aggregator ──▶ transformer ──▶ summary ──▶ exporter
log-intelligence-engine  ──┘                                                   │
                                                                                ▼
                                                               dashboard_summary.json
                                                               dashboard_export.csv
                                                               powerbi_dataset.json
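
The final exporter stage in the diagram above can be sketched with the standard library alone. The three file names come from the diagram; the function name `export_all`, the column names, and the `{"rows": [...]}` layout of the Power BI file are assumptions made for illustration.

```python
from __future__ import annotations

import csv
import json
import tempfile
from pathlib import Path


def export_all(summary: dict, rows: list[dict], out_dir: Path) -> list[Path]:
    """Write the three export files and return their paths.
    The Power BI layout used here (a flat list of rows) is an assumption."""
    out_dir.mkdir(parents=True, exist_ok=True)

    summary_path = out_dir / "dashboard_summary.json"
    summary_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")

    csv_path = out_dir / "dashboard_export.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["source", "metric", "value"])
        writer.writeheader()
        writer.writerows(rows)

    powerbi_path = out_dir / "powerbi_dataset.json"
    powerbi_path.write_text(json.dumps({"rows": rows}, indent=2), encoding="utf-8")

    return [summary_path, csv_path, powerbi_path]


with tempfile.TemporaryDirectory() as tmp:
    written = export_all(
        {"ecosystem_health": "CRITICAL"},
        [{"source": "cloud-telemetry-agent", "metric": "cpu", "value": 8.5}],
        Path(tmp),
    )
    for path in written:
        print(path.name)
```

Keeping the CSV and Power BI files fed from the same `rows` list means the two exports cannot drift apart.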

Sample Dashboard Output

============================================================
  🚨 ECOSYSTEM HEALTH: CRITICAL
============================================================

  📊 Component Health:
    Telemetry    : ✅ HEALTHY
    Uptime       : ⚠️ DEGRADED
    Log Intel    : ✅ HEALTHY

  📈 Key Metrics:
    CPU          : 8.5%
    Memory       : 87.0%
    Disk         : 17.9%
    Uptime       : 80.0%
    Avg Response : 353.9ms
    Log Errors   : 0.17%

  🚨 Incidents:
    Open         : 5
    Critical     : 3
    High         : 1
    Medium       : 1
    Total        : 5
============================================================
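
The Incidents block in the output above is a set of counts over incident records. As a rough sketch of how such counts could be derived and aligned, assuming each record is a dict with hypothetical `severity` and `status` keys (the real schema of `incident-alert-pipeline` is not shown on this page):

```python
from __future__ import annotations

from collections import Counter


def incident_summary(incidents: list[dict]) -> dict[str, int]:
    """Count incidents by status and severity (field names are assumed)."""
    severities = Counter(item["severity"] for item in incidents)
    return {
        "Open": sum(1 for item in incidents if item["status"] == "open"),
        "Critical": severities["CRITICAL"],
        "High": severities["HIGH"],
        "Medium": severities["MEDIUM"],
        "Total": len(incidents),
    }


def render(summary: dict[str, int]) -> list[str]:
    """Left-pad labels to 13 characters, matching the sample layout."""
    return [f"    {label:<13}: {count}" for label, count in summary.items()]


# Five open incidents, mirroring the counts in the sample output.
incidents = (
    [{"severity": "CRITICAL", "status": "open"}] * 3
    + [{"severity": "HIGH", "status": "open"}]
    + [{"severity": "MEDIUM", "status": "open"}]
)
for line in render(incident_summary(incidents)):
    print(line)
```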

Why a Unified Dashboard?

Individual monitoring tools tell you what's wrong with one thing. A unified dashboard tells you what's wrong with everything — and gives you the context to understand why.

- A CPU spike means more when you can see it alongside an open CRITICAL incident.
- A degraded uptime score means more when log error rates are climbing simultaneously.
- An executive doesn't need raw metrics; they need a health score and a trend.

observability-dashboard provides all three.

The crown jewel of the Alex-CloudOps observability portfolio — where four production-grade repositories converge into a single pane of glass.
