misc(events-processor): Add an events reprocess pipeline #712
Merged
vincent-pochet merged 1 commit into main, Mar 6, 2026
Pull request overview
Adds a “reprocess” mode to the events-processor pipeline so that specific replayed events can skip parts of the normal flow and only regenerate events_enriched_expanded outputs (intended to address backfills after recent enrichment bugs).
Changes:
- Introduce `source_metadata.reprocess` on the raw event payload and an `Event.IsReprocess()` helper.
- Update `EventProcessor.processEvent` to bypass `events_enriched` production (and related side effects) when `reprocess` is set, producing only `events_enriched_expanded`.
- Add a processor test covering the reprocess behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| events-processor/processors/events_processor/processor.go | Adds the reprocess fast-path that only produces enriched-expanded messages. |
| events-processor/models/event.go | Extends SourceMetadata with reprocess and adds IsReprocess(). |
| events-processor/processors/events_processor/processor_test.go | Adds coverage for the reprocess path. |
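The model change summarized above could look roughly like the following. This is only a sketch, not the code from the PR: the `reprocess` key under `source_metadata` and the `IsReprocess()` name come from the review summary, while the `transaction_id` field, the pointer-typed `SourceMetadata`, and the `ParseEvent` helper are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SourceMetadata mirrors the `source_metadata` object of the raw event payload.
// Only the `reprocess` key is taken from the PR; the rest of the real struct is omitted.
type SourceMetadata struct {
	Reprocess bool `json:"reprocess"`
}

// Event is a trimmed-down, hypothetical stand-in for the real event model.
type Event struct {
	TransactionID  string          `json:"transaction_id"`
	SourceMetadata *SourceMetadata `json:"source_metadata,omitempty"`
}

// IsReprocess reports whether the event carries the reprocess flag.
// A nil event or a missing source_metadata object counts as a normal event.
func (e *Event) IsReprocess() bool {
	return e != nil && e.SourceMetadata != nil && e.SourceMetadata.Reprocess
}

// ParseEvent (hypothetical helper) decodes a raw JSON payload into an Event.
func ParseEvent(payload []byte) (*Event, error) {
	var ev Event
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, err
	}
	return &ev, nil
}

func main() {
	replayed, _ := ParseEvent([]byte(`{"transaction_id":"tx_1","source_metadata":{"reprocess":true}}`))
	regular, _ := ParseEvent([]byte(`{"transaction_id":"tx_2"}`))
	fmt.Println(replayed.IsReprocess(), regular.IsReprocess())
}
```

Treating an absent `source_metadata` as "not a reprocess" keeps existing producers unaffected, since they never set the flag.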
63b569d to a81bf0c
a81bf0c to 707e425
rsempe approved these changes Mar 6, 2026
vincent-pochet added a commit to getlago/lago-api that referenced this pull request Mar 6, 2026
## Context

This PR is related to getlago/lago#712

Two issues were recently identified in the events-processor:
- Events with `timestamp` formatted as an ISO 8601 string were not processed correctly and were pushed to the dead letter queue (fixed with getlago/lago#709)
- Pricing group keys were not assigned correctly to the events when no filters were present on a given charge, leading to inconsistent data in the `events_enriched_expanded` Kafka topic and ClickHouse table (fixed with getlago/lago#710)

Because of these two issues, some events will need to be reprocessed, either completely (for the timestamp issue, as no `events_enriched` records were created) or partially (for the grouped_by issue, as we only need to re-create the `events_enriched_expanded` record).

## Description

This PR adds two rake tasks that allow full or partial (producing only `events_enriched_expanded` records) processing of `events_raw`.

- `events:reprocess` fetches ClickHouse `events_raw` records and pushes them to the Kafka topic for reprocessing. It takes multiple arguments:
  - `ORGANIZATION_ID`
  - An optional `SUBSCRIPTION_IDS` to filter on a set of subscriptions
  - An optional `BM_CODES` to filter on a set of billable metrics
  - `REPROCESS`, which defaults to `true`, to only refresh the `events_enriched_expanded`
- `events:deduplicate_enriched_expanded` removes duplicated events to ensure a coherent state of the `events_enriched_expanded` table, mitigating the eventual consistency of ClickHouse
## Context

Two issues were recently identified in the events-processor:
- Events with `timestamp` formatted as an ISO 8601 string were not processed correctly and were pushed to the dead letter queue (fixed with fix(events-processor): Handle RFC3339 timestamp #709)
- Pricing group keys were not assigned correctly to the events when no filters were present on a given charge, leading to inconsistent data in the `events_enriched_expanded` Kafka topic and ClickHouse table (fixed with fix(events-processor): Ensure grouped by are set when matching a single flat filter #710)

Because of these two issues, some events will need to be reprocessed, either completely (for the timestamp issue, as no `events_enriched` records were created) or partially (for the grouped_by issue, as we only need to re-create the `events_enriched_expanded` record).

## Description

This PR checks for the presence of a new `reprocess` flag on the `event_raw` Kafka payload. When present, this flag makes the processor bypass parts of the pipeline entirely: producing the `events_enriched` Kafka message, flagging the subscription for refresh, and expiring the cache. Only an `events_enriched_expanded` message is produced.
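The branching described here can be sketched as below. This is a minimal illustration under stated assumptions, not the actual `processEvent` implementation: the `Recorder` type stands in for the Kafka producers and side effects, the `Event` struct is reduced to the flag, and the order of the steps on the normal path is assumed.

```go
package main

import "fmt"

// Event is a hypothetical, trimmed-down event; only the reprocess flag matters here.
type Event struct {
	TransactionID string
	Reprocess     bool
}

// IsReprocess reports whether the event was replayed with the reprocess flag.
func (e Event) IsReprocess() bool { return e.Reprocess }

// Recorder (hypothetical) replaces real producers and side effects by logging each step.
type Recorder struct{ Steps []string }

func (r *Recorder) do(step string) { r.Steps = append(r.Steps, step) }

// ProcessEvent sketches the fast path: a reprocessed event only yields an
// events_enriched_expanded message and skips every other step.
func ProcessEvent(ev Event, r *Recorder) {
	if ev.IsReprocess() {
		r.do("produce events_enriched_expanded")
		return
	}
	// Normal path; the ordering of these steps is an assumption.
	r.do("produce events_enriched")
	r.do("produce events_enriched_expanded")
	r.do("flag subscription for refresh")
	r.do("expire cache")
}

func main() {
	replay, normal := &Recorder{}, &Recorder{}
	ProcessEvent(Event{TransactionID: "tx_1", Reprocess: true}, replay)
	ProcessEvent(Event{TransactionID: "tx_2"}, normal)
	fmt.Println(len(replay.Steps), len(normal.Steps))
}
```

An early return keeps the reprocess branch isolated, so later changes to the normal path cannot accidentally reintroduce side effects for replayed events.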