
Add Prometheus metrics endpoint #119

@StephanMeijer


Context

Find currently has basic health checks via Dockerflow middleware (__heartbeat__, __lbheartbeat__) and an internal self-test framework (core/selftests.py) that collects timing data for database, cache, and OpenSearch connectivity. However, there is no Prometheus-compatible metrics endpoint for scraping operational metrics.

For production observability, we need a /metrics endpoint that exposes standard application metrics in Prometheus format.

Requirements

  • Add prometheus-client library to dependencies in src/backend/pyproject.toml
  • Create a /metrics endpoint that exposes:
    • Request counts and latencies by endpoint (documents/index/, documents/search/, documents/delete/)
    • HTTP response status code distribution
    • OpenSearch query latencies
    • Document indexing throughput (count, bulk size)
    • Search query performance (query time, result counts)
  • Consider adding middleware or decorators to instrument existing views in core/views.py
  • Ensure the metrics endpoint is excluded from authentication requirements
  • Document the available metrics and scrape configuration
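As a starting point for discussion, the requirements above could be sketched roughly as follows. This is a minimal sketch and not a proposed final design: the metric names (find_http_requests_total and so on), the PrometheusMiddleware class, and the record_bulk_index and render_metrics helpers are all hypothetical, and the middleware only assumes a Django-style get_response callable and a request.resolver_match attribute. Only the prometheus_client calls themselves (Counter, Histogram, CollectorRegistry, generate_latest) are real library API.

```python
import time

from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# Dedicated registry so the sketch does not touch the global default one.
# All metric names below are hypothetical placeholders.
REGISTRY = CollectorRegistry()

REQUESTS = Counter(
    "find_http_requests_total",
    "HTTP requests by method, endpoint and status code.",
    ["method", "endpoint", "status"],
    registry=REGISTRY,
)
REQUEST_LATENCY = Histogram(
    "find_http_request_duration_seconds",
    "HTTP request latency in seconds by endpoint.",
    ["endpoint"],
    registry=REGISTRY,
)
OPENSEARCH_LATENCY = Histogram(
    "find_opensearch_query_duration_seconds",
    "Time spent waiting on OpenSearch calls, in seconds.",
    registry=REGISTRY,
)
INDEXED_DOCUMENTS = Counter(
    "find_documents_indexed_total",
    "Number of documents sent to the index.",
    registry=REGISTRY,
)


class PrometheusMiddleware:
    """Django-style middleware that records request count and latency."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        elapsed = time.perf_counter() - start
        # Use the matched URL pattern rather than the raw path so that
        # label cardinality stays bounded.
        match = getattr(request, "resolver_match", None)
        endpoint = getattr(match, "route", None) or "unmatched"
        REQUESTS.labels(request.method, endpoint, str(response.status_code)).inc()
        REQUEST_LATENCY.labels(endpoint).observe(elapsed)
        return response


def record_bulk_index(documents, bulk_call):
    """Time a bulk indexing call (hypothetical helper) and count its documents."""
    with OPENSEARCH_LATENCY.time():
        result = bulk_call(documents)
    INDEXED_DOCUMENTS.inc(len(documents))
    return result


def render_metrics() -> bytes:
    """Return the registry contents in the Prometheus text exposition format."""
    return generate_latest(REGISTRY)
```

A /metrics view would return the bytes from render_metrics() with prometheus_client.CONTENT_TYPE_LATEST as the content type. Labelling by URL pattern instead of raw path is a deliberate choice here, since raw paths containing IDs would create an unbounded number of time series.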

Technical notes

  • The existing self-test framework in core/selftests_builtin.py already collects duration_ms for each health check; these values could be exposed as gauges
  • Gunicorn configuration is in docker/files/usr/local/etc/gunicorn/find.py if process-level metrics are needed
  • Consider using prometheus_client.multiprocess mode for multi-worker Gunicorn deployments
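To make the notes above more concrete, here is a rough sketch of how multiprocess mode and the self-test gauges might fit together. It assumes that PROMETHEUS_MULTIPROC_DIR points to a writable directory before the workers start, and that self-test results can be reduced to a mapping of check name to duration_ms; that mapping, the metric name find_selftest_duration_seconds, and the helper function names are hypothetical. The child_exit function follows the signature of the Gunicorn server hook of the same name.

```python
from prometheus_client import CollectorRegistry, Gauge, generate_latest, multiprocess

# Hypothetical gauge for self-test timings. multiprocess_mode only matters
# when PROMETHEUS_MULTIPROC_DIR is set; "liveall" reports one series per
# live worker process.
SELFTEST_DURATION = Gauge(
    "find_selftest_duration_seconds",
    "Duration of the most recent self-test run, per check.",
    ["check"],
    multiprocess_mode="liveall",
)


def export_selftest_durations(durations_ms):
    """Copy {check_name: duration_ms} (assumed shape) into the gauge, in seconds."""
    for check, duration_ms in durations_ms.items():
        SELFTEST_DURATION.labels(check).set(duration_ms / 1000.0)


def build_multiprocess_registry(path=None):
    """Create a registry that aggregates the per-worker metric files.

    When path is None, MultiProcessCollector reads PROMETHEUS_MULTIPROC_DIR
    and raises ValueError if it is unset or not a directory.
    """
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry, path=path)
    return registry


def metrics_payload(path=None) -> bytes:
    """Render aggregated metrics from all workers for the /metrics response."""
    return generate_latest(build_multiprocess_registry(path))


def child_exit(server, worker):
    """Gunicorn server hook: clean up metric files of an exited worker."""
    multiprocess.mark_process_dead(worker.pid)
```

The child_exit hook would live in the Gunicorn configuration file, and the /metrics view would call metrics_payload() on each scrape so that values from every worker are merged.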
