
Cost optimization analysis for Logging #290

@asatt

Is your feature request related to a problem? Please describe

Our observability signals contain a lot of problematic and junk data.

  • Metrics: many unused metrics, unique values in labels, histograms with a huge number of buckets, and high-cardinality data in general
  • Logs: nobody reads at least 90% of the logs; some services are very "noisy" and generate a lot of logs; some log messages are huge, about 1-5 MB each
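
To make "high cardinality" concrete, here is a minimal sketch that counts the distinct label combinations (time series) per metric. The function name, the input shape, and the sample metric and label names are all hypothetical; a real analysis would read this from the metrics backend.

```python
from collections import defaultdict


def series_per_metric(samples):
    """Count distinct label sets per metric name.

    `samples` is an iterable of (metric_name, labels_dict) pairs.
    This input shape is an assumption made for illustration.
    """
    seen = defaultdict(set)
    for name, labels in samples:
        # A frozenset of label pairs identifies one time series.
        seen[name].add(frozenset(labels.items()))
    return {name: len(label_sets) for name, label_sets in seen.items()}


# Hypothetical samples: a unique value in a label creates a new series each time.
samples = [
    ("http_requests_total", {"path": "/a", "request_id": "1"}),
    ("http_requests_total", {"path": "/a", "request_id": "2"}),
    ("http_requests_total", {"path": "/a", "request_id": "2"}),
    ("up", {"job": "api"}),
]
counts = series_per_metric(samples)
print(counts)
```

Metrics whose series count grows with traffic rather than with the number of services are the likely cardinality offenders.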

Examples:

  • Metrics: about 12k metrics are generated, but only about 1k are actually used
  • Logs: one component generated about 700 GB of logs per day (1.2-1.5 TB per day generated in total)
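
As a rough sanity check, the figures above can be turned into shares. This sketch assumes 1 TB = 1000 GB and that the 700 GB component is included in the 1.2-1.5 TB total; the helper name is hypothetical.

```python
def share(part: float, total: float) -> float:
    """Return part/total as a percentage."""
    return 100.0 * part / total


# Metrics: about 12k generated, about 1k actually used.
unused_metrics_pct = share(12_000 - 1_000, 12_000)

# Logs: one component at about 700 GB/day out of 1.2-1.5 TB/day in total.
noisy_low_pct = share(700, 1_500)   # against the upper total
noisy_high_pct = share(700, 1_200)  # against the lower total

print(f"unused metrics: ~{unused_metrics_pct:.0f}%")
print(f"noisiest component: ~{noisy_low_pct:.0f}-{noisy_high_pct:.0f}% of daily log volume")
```

Under these assumptions, roughly nine in ten metrics are unused and a single component accounts for about half of the daily log volume.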

Describe the solution you'd like

Analyze the signals and find queries that detect the most costly services, i.e. those that generate high-cardinality data or noise, and build a service that runs this analysis.

It can be a separate service, part of another service, a playbook in a Jupyter Notebook, or something else. This is open for discussion.
Scenarios that we already know about:

Logs:

  • logs with debug log level
  • services that generate a lot of logs
  • services that generate very long logs
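
The three log scenarios above could be prototyped in a notebook along these lines. This is a minimal sketch, not a proposed design: the `LogRecord` shape, the `analyze` function, the size threshold, and the sample service names are all assumptions, and a real implementation would query the log backend instead of iterating over records in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LogRecord:
    # Hypothetical record shape; real field names depend on the log backend.
    service: str
    level: str
    message: str


def analyze(records, long_message_bytes=1_000_000):
    """Aggregate per-service log statistics, sorted by total bytes (largest first)."""
    stats = defaultdict(lambda: {"count": 0, "bytes": 0, "debug": 0, "long": 0})
    for record in records:
        size = len(record.message.encode("utf-8"))
        entry = stats[record.service]
        entry["count"] += 1                  # services that generate a lot of logs
        entry["bytes"] += size
        if record.level.lower() == "debug":  # logs with debug log level
            entry["debug"] += 1
        if size >= long_message_bytes:       # services that generate very long logs
            entry["long"] += 1
    return sorted(stats.items(), key=lambda item: item[1]["bytes"], reverse=True)


# Hypothetical sample data with a lowered threshold to keep the example small.
sample = [
    LogRecord("checkout", "DEBUG", "x" * 50),
    LogRecord("checkout", "INFO", "y" * 2_000),
    LogRecord("auth", "info", "ok"),
]
report = analyze(sample, long_message_bytes=1_000)
for service, entry in report:
    print(service, entry)
```

Sorting by bytes rather than by message count puts services with few but very large messages near the top, which matches the 1-5 MB per message problem described earlier.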

Describe alternatives you've considered

No response

Additional information

No response

Metadata

Assignees

Labels

enhancement: New feature or request

Projects

Status

In Analysis

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
