
# feat: implement security monitoring and operational metrics dashboard #677

Open
saidai-bhuvanesh wants to merge 8 commits into adithyan-css:main from saidai-bhuvanesh:feature/security-monitoring-metrics


@saidai-bhuvanesh

Summary

This PR adds a security monitoring and operational metrics system to Brownie-Bliss. It provides centralized security event tracking, API performance monitoring, and operational health reporting, so that administrators can detect abuse, investigate incidents, and monitor system performance from a single dashboard.

Problem Statement

Prior to this implementation:

  • Failed authentication attempts were not tracked centrally.
  • Rate-limit violations had no visibility.
  • OTP abuse attempts could not be monitored.
  • API latency metrics were unavailable.
  • Operational failures lacked centralized telemetry.
  • Administrators had no monitoring dashboard for system health and security events.

Changes Implemented

Security Event Tracking

Introduced a dedicated SecurityEvent model to capture and retain critical security-related activities:

  • Failed admin logins
  • Rate-limit violations
  • OTP cooldown violations
  • OTP verification exhaustion
  • Suspicious order attempts
  • Database failures
  • System exceptions

Security events are retained for 90 days. A TTL index then removes them automatically, so no manual cleanup is needed.
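
A minimal sketch of how the 90-day retention could be expressed, assuming the models are backed by MongoDB (the PR mentions TTL indexes but does not name the database). The event type names, the `createdAt` field, and the `ttlIndexSpec` helper are illustrative and not taken from the PR.

```javascript
// Hypothetical event type names mirroring the list above.
const SECURITY_EVENT_TYPES = Object.freeze([
  "FAILED_ADMIN_LOGIN",
  "RATE_LIMIT_VIOLATION",
  "OTP_COOLDOWN_VIOLATION",
  "OTP_VERIFICATION_EXHAUSTED",
  "SUSPICIOUS_ORDER_ATTEMPT",
  "DATABASE_FAILURE",
  "SYSTEM_EXCEPTION",
]);

const SECONDS_PER_DAY = 24 * 60 * 60;

// Builds the (keys, options) pair for a TTL index on a date field.
// MongoDB deletes a document once the field's value is older than
// expireAfterSeconds.
function ttlIndexSpec(dateField, retentionDays) {
  return {
    keys: { [dateField]: 1 },
    options: { expireAfterSeconds: retentionDays * SECONDS_PER_DAY },
  };
}

// Assumed field name: createdAt.
const securityEventTtl = ttlIndexSpec("createdAt", 90);
```

With the MongoDB Node.js driver, the pair would be passed as `collection.createIndex(securityEventTtl.keys, securityEventTtl.options)`. TTL cleanup runs as a background task, so expired documents may remain for a short time before they are deleted.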

API Performance Monitoring

Introduced an ApiMetric model to collect:

  • API response times
  • Route execution statistics
  • Request frequency metrics
  • Performance aggregation data

Metrics expire automatically after 24 hours, which bounds the size of the metrics collection.
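
The aggregation logic is not shown in this description. As a rough illustration, per-route latency statistics could be derived from raw samples as below. The sample shape (`route`, `durationMs`), the function names, and the choice of a nearest-rank 95th percentile are assumptions.

```javascript
// Nearest-rank percentile over an ascending-sorted array of numbers.
function percentile(sorted, p) {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Groups raw samples ({ route, durationMs }) by route and summarizes each group.
function summarizeLatency(samples) {
  const byRoute = new Map();
  for (const { route, durationMs } of samples) {
    if (!byRoute.has(route)) byRoute.set(route, []);
    byRoute.get(route).push(durationMs);
  }
  return [...byRoute].map(([route, durations]) => {
    durations.sort((a, b) => a - b);
    const total = durations.reduce((sum, d) => sum + d, 0);
    return {
      route,
      count: durations.length,
      avgMs: total / durations.length,
      p95Ms: percentile(durations, 95),
      maxMs: durations[durations.length - 1],
    };
  });
}
```

Because samples expire after 24 hours, a summary like this covers at most the last day of traffic.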

Centralized Metrics Service

Created a reusable metricsService.js responsible for:

  • Non-blocking telemetry collection
  • Event aggregation
  • Performance metric storage
  • Dashboard data generation

The service operates asynchronously to avoid impacting user-facing requests.
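
One common way to keep telemetry off the request path is a fire-and-forget write whose failures are caught and logged rather than propagated. The sketch below illustrates that pattern only. `recordEventNonBlocking`, the `store` object, and its `save` method are hypothetical and do not describe the actual `metricsService.js` API.

```javascript
// Schedules a telemetry write without making the caller wait for it.
// `store` is any object with a save(event) method (hypothetical interface).
function recordEventNonBlocking(
  store,
  event,
  onError = (err) => console.error("telemetry write failed:", err.message)
) {
  // Promise.resolve().then(...) defers the write to a later microtask and
  // converts a synchronous throw inside save() into a rejection, so a
  // telemetry failure can never surface in the request handler.
  Promise.resolve()
    .then(() => store.save(event))
    .catch(onError);
  // Intentionally returns nothing: callers must not await telemetry.
}
```

A caller would invoke `recordEventNonBlocking(store, { type: "RATE_LIMIT_VIOLATION" })` and continue handling the request immediately.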

Global Monitoring Middleware

Implemented monitoringMiddleware.js which:

  • Tracks request execution times using high-resolution timers
  • Aggregates metrics by route pattern
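  • Prevents path parameter pollution by recording the route pattern rather than the concrete URL, so that each distinct ID does not create its own metric entry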
  • Records API performance statistics with minimal overhead
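
A minimal sketch of such a middleware, assuming an Express-style `(req, res, next)` signature. `createMonitoringMiddleware` and the `recordMetric` callback are hypothetical names, and the real `monitoringMiddleware.js` may differ.

```javascript
// recordMetric is a hypothetical callback, e.g. a function exported by a
// metrics service.
function createMonitoringMiddleware(recordMetric) {
  return function monitoringMiddleware(req, res, next) {
    const startedAt = process.hrtime.bigint(); // monotonic, in nanoseconds

    // "finish" fires once the response has been handed off to the OS.
    res.on("finish", () => {
      const durationMs = Number(process.hrtime.bigint() - startedAt) / 1e6;
      // req.route is only set when an Express route matched. Use its pattern
      // ("/:id") instead of the concrete URL ("/42") so that metrics group
      // by route.
      const route = req.route
        ? `${req.baseUrl || ""}${req.route.path}`
        : "unmatched";
      recordMetric({
        method: req.method,
        route,
        statusCode: res.statusCode,
        durationMs,
      });
    });

    next();
  };
}
```

In an Express app this would typically be registered before the routers, for example `app.use(createMonitoringMiddleware(recordMetric))`, so that every request is timed.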

Security Instrumentation

Added monitoring hooks across critical workflows:

Authentication

  • Failed admin login tracking
  • Authentication anomaly detection

OTP Security

  • Cooldown violations
  • Verification exhaustion
  • Abuse detection

Order Processing

  • Duplicate order attempts
  • Suspicious high-value orders
  • Checkout anomalies

Infrastructure

  • Database failures
  • Unhandled exceptions
  • Runtime system errors
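
As an illustration of what one of these hooks might look like, the sketch below reports a failed admin login from an Express-style request. `reportFailedAdminLogin`, the `recordEvent` callback, and the event fields are assumptions and are not taken from `adminController.js`.

```javascript
// Builds and reports a failed-login event from an Express-style request.
// recordEvent is a hypothetical callback into the metrics service.
function reportFailedAdminLogin(req, recordEvent) {
  const headers = req.headers || {};
  recordEvent({
    type: "FAILED_ADMIN_LOGIN",
    ip: req.ip,
    userAgent: headers["user-agent"] || "unknown",
    // Record which account was targeted, never the submitted password.
    username: req.body ? req.body.username : undefined,
    occurredAt: new Date(),
  });
}
```

A login handler would call this on the rejection branch, just before sending the error response.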

Admin Monitoring Dashboard

Added a protected endpoint:

GET /api/admin/monitoring/dashboard

Provides:

  • Security event summaries
  • API latency statistics
  • System uptime
  • Memory utilization
  • Database health
  • Recent security incidents
  • Operational metrics overview

Access is restricted through existing admin authentication middleware.
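
The system-level part of such a response (uptime, memory, database health) can be read from Node.js process APIs. The following is a sketch only: `buildSystemHealth`, the field names, and the `dbConnected` flag are assumptions, and the actual response shape of the endpoint is not documented here.

```javascript
// Collects process-level health data for a dashboard response.
// dbConnected is a boolean supplied by the caller (how it is determined
// depends on the database client in use).
function buildSystemHealth(dbConnected) {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  // Bytes to megabytes, rounded to one decimal place.
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    uptimeSeconds: Math.floor(process.uptime()),
    memory: {
      rssMb: toMb(rss),
      heapUsedMb: toMb(heapUsed),
      heapTotalMb: toMb(heapTotal),
    },
    database: dbConnected ? "connected" : "disconnected",
    generatedAt: new Date().toISOString(),
  };
}
```

A controller could merge this object with the security event summaries and latency statistics before sending the JSON response.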

Files Added

Models

  • SecurityEvent.js
  • ApiMetric.js

Services

  • metricsService.js

Middleware

  • monitoringMiddleware.js

Controllers

  • monitoringController.js

Files Updated

Authentication

  • adminController.js

Rate Limiting

  • rateLimiters.js

OTP Workflows

  • otpController.js

Order Processing

  • orderController.js

Email Reliability

  • mailer.js

Routing

  • adminRoutes.js

Application Bootstrap

  • index.js

Testing

Automated Coverage

Added integration tests covering:

  • Security event creation
  • Metrics collection
  • Failed login telemetry
  • Rate-limit monitoring
  • Dashboard endpoint access
  • Dashboard authorization protection

Results

  • 32/32 tests passing
  • No regressions detected
  • Backward compatibility maintained

Benefits

Security

  • Improved attack visibility
  • Better incident investigation
  • Abuse detection capabilities
  • Centralized security auditing

Reliability

  • Faster root-cause analysis
  • Improved debugging workflows
  • Operational health monitoring
  • Production observability

Performance

  • API latency visibility
  • Route-level performance metrics
  • Low-overhead telemetry collection

Impact

This implementation significantly improves operational awareness, security visibility, and production monitoring capabilities while maintaining low performance overhead and full backward compatibility.

@vercel

vercel Bot commented Jun 5, 2026

Someone is attempting to deploy a commit to the adithyansubramani1-1657's projects Team on Vercel.

A member of the Team first needs to authorize it.
