I specialize in building systems that don't just work: they scale, reason, and recover. My passion lies at the intersection of High-Performance APIs and Autonomous Observability. I thrive when designing complex backend architectures that transform raw telemetry into actionable intelligence. I like UI work as well, so you will see a very consistent UI theme across my projects :)
- Distributed Systems: Designing for resilience, consistency, and low latency.
- Observability Stack: Deep integration with the LGTM stack (Loki, Grafana, Tempo, Mimir).
- AI-Native Engineering: Building reasoning engines for automated Root Cause Analysis (RCA).
- Infrastructure as Code: Orchestrating scalable, cloud-native environments.
I am currently developing a suite of interconnected observability tools designed to eliminate the friction in modern SRE workflows.
BeObservant
The Control Plane. A unified platform for metrics, logs, traces, and alerts. It acts as the "Single Pane of Glass" for distributed systems, enforcing RBAC and multi-tenancy across the entire LGTM stack.
BeCertain
The Analyst. A Python-based reasoning engine that processes telemetry data to provide automated Root Cause Analysis (RCA), anomaly detection, and predictive forecasting.
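
To give a feel for the anomaly-detection side, here is a minimal sketch of a z-score check over a latency series. It is an illustration only, not BeCertain's actual code: the function name `find_anomalies`, the threshold, and the sample data are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(samples: list[float], threshold: float = 2.5) -> list[int]:
    """Return indices of samples whose z-score exceeds the threshold.

    Hypothetical helper for illustration; a real engine would likely use
    rolling windows or seasonal baselines instead of a global mean.
    """
    if len(samples) < 2:
        return []  # stdev needs at least two points
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation
    if sigma == 0:
        return []  # a flat series has no outliers
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]


# Made-up request latencies in milliseconds, with one obvious spike.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 100, 950, 101]
spikes = find_anomalies(latencies_ms)
print(spikes)
```

With small windows a single point can only reach a limited z-score (about 2.85 for 10 samples), which is why the sketch defaults to a threshold below the usual 3.0.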
BeNotified
The Messenger. An intelligent alerting and incident orchestration service. It manages the lifecycle of an alert, from the moment a threshold is crossed in Mimir to the final resolution note in Jira.
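
The alert lifecycle described above can be pictured as a small state machine. The sketch below is an assumption-heavy illustration, not BeNotified's real model: the states, the `Alert` class, and the allowed transitions are hypothetical, and no Mimir or Jira calls are made.

```python
from enum import Enum


class AlertState(Enum):
    FIRING = "firing"              # threshold crossed
    ACKNOWLEDGED = "acknowledged"  # someone picked it up
    RESOLVED = "resolved"          # resolution note written


# Hypothetical transition table: which states may follow which.
ALLOWED = {
    AlertState.FIRING: {AlertState.ACKNOWLEDGED, AlertState.RESOLVED},
    AlertState.ACKNOWLEDGED: {AlertState.RESOLVED},
    AlertState.RESOLVED: set(),
}


class Alert:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = AlertState.FIRING
        self.history = [AlertState.FIRING]

    def transition(self, new_state: AlertState) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


alert = Alert("HighRequestLatency")
alert.transition(AlertState.ACKNOWLEDGED)
alert.transition(AlertState.RESOLVED)
print([s.value for s in alert.history])
```

Keeping the transitions in one table makes it easy to reject nonsensical moves, such as reopening a resolved alert, in a single place.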
| Category | Technologies |
|---|---|
| Languages | Python (FastAPI/Flask), TypeScript/JS, C++ |
| Data & Storage | PostgreSQL, Redis, VictoriaMetrics, Mimir, Loki |
| Infrastructure | Docker, Kubernetes, OpenTelemetry (OTLP), Envoy, NGINX |
| Observability | Prometheus, Grafana, Tempo, Alertmanager |
| DevOps | CI/CD Pipelines, Vault, Keycloak (OIDC) |
- CodeMasterPro: A specialized developer tooling platform designed to streamline the local development environment and improve engineering velocity. I explored the ralph-wiggum principle a while ago; have a look at the code.
- Observability over Monitoring: Don't just watch the dashboard; understand it.
- Clean Architecture: Write code that your future self will thank you for. I know I am not perfect, but I want to be :) and I am continuously learning to write clean and maintainable code.
I'm always open to discussing distributed systems, backend performance, observability, or even AI stuff. Just don't talk to me about Linux: I already deal with it daily at work. I wouldn't say I am a Linux expert, but I definitely know a lot about it.
- LinkedIn: stefan-kumarasinghe
- Watch me build: YouTube Channel