LOTC-1523: walk nested JSON pointers in shifter and freshness validator#197
Open
kevinborkman-hub wants to merge 2 commits into
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tetime tests

Two related gaps surfaced by trafficpeak/apicontext (datetime primary `startTime` sourced from single-segment `/start_time`, format `2006-01-02T15:04:05.999999Z`):

1. Pointer walking from the prior commit already covered the column-name vs raw-key mismatch (`startTime` ≠ `start_time`), but the format string silently failed to parse. The Go-layout translator didn't recognize fractional-second tokens (`.999999`, `.000000`, etc.), so chrono/strptime tried to match them as literal characters and the code `continue`d on parse failure. Added all six common variants (`.999999999`/`.000000000` down to `.999`/`.000`) to both the Rust and Python translators, mapped to chrono's `%.f` and Python's `.%f`.

2. Test coverage didn't include a datetime + single-segment + renamed-column shape end-to-end. Added `test_stale_renamed_datetime_warns` and `test_fresh_renamed_datetime_passes` (Rust) and `test_datetime_renamed_column_microseconds_shifted` (Python) to lock in the apicontext shape as a regression case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
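For reviewers who want to see the fractional-second mapping concretely, here is a minimal sketch of a Go-layout to `strptime` translator. It is not the code from this PR: the function name `go_layout_to_strptime`, the table `_GO_TO_STRPTIME`, and the partial token set are assumptions made for illustration. Only the six fractional variants and their `.%f` target come from the commit message.

```python
from datetime import datetime

# Hypothetical, partial token table. Longer tokens come first so that
# ".999999999" is tried before ".999999" and ".999".
_GO_TO_STRPTIME = [
    (".999999999", ".%f"), (".000000000", ".%f"),
    (".999999", ".%f"), (".000000", ".%f"),
    (".999", ".%f"), (".000", ".%f"),
    ("2006", "%Y"), ("01", "%m"), ("02", "%d"),
    ("15", "%H"), ("04", "%M"), ("05", "%S"),
]


def go_layout_to_strptime(layout: str) -> str:
    """Translate a Go reference-time layout into a strptime format string."""
    out = []
    i = 0
    while i < len(layout):
        for go_tok, py_tok in _GO_TO_STRPTIME:
            if layout.startswith(go_tok, i):
                out.append(py_tok)
                i += len(go_tok)
                break
        else:
            # No known token starts here: copy the character as a literal,
            # escaping "%" so strptime does not read it as a directive.
            ch = layout[i]
            out.append("%%" if ch == "%" else ch)
            i += 1
    return "".join(out)


fmt = go_layout_to_strptime("2006-01-02T15:04:05.999999Z")
parsed = datetime.strptime("2017-04-04T10:57:02.123456Z", fmt)
```

Without the fractional entries in the table, `.999999` would be copied through as literal characters and the `strptime` call would raise `ValueError`, which matches the silent-skip behavior described above. Note that Python's `%f` accepts at most six digits, so this sketch would not parse nine-digit nanosecond values.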
Summary
Fixes LOTC-1523. Both the configurator's stale-timestamp auto-shifter and the Rust freshness validator silently skipped transforms whose primary timestamp column is sourced from a multi-segment JSON pointer (e.g. `/httpMessage/start`). SIEM-shaped bundles passed validation with arbitrarily stale fixtures because both code paths looked up `sample_data` by post-transform output column name (a flat key) instead of walking the JSON pointer to the nested location. Uncovered by LOTC-691 (trafficpeak/siem), where the fixture's primary value sat at `1491303422` (2017-04-04) without anyone noticing.

Changes
- `scripts/configurator/transform_organizer.py`: replaced internal `_resolve_sample_key(col, sample) -> str | None` with `_resolve_sample_path(col, sample) -> tuple[str, ...] | None`. Added `_get_at_path` and `_set_at_path` helpers. Both `_shift_stale_timestamps` and `_shift_stale_datetime_primary` now read/write through path tuples (single- or multi-segment).
- `src/validate/sample_data_freshness.rs`: added `resolve_primary_value` helper that mirrors the Python algorithm (output_name → from_json_pointers via `serde_json::Value::pointer` → from_input_field). Primary-epoch lookup now also accepts `Value::String` containing a numeric epoch (the actual SIEM fixture stores `"1491303422"` as a JSON string).
- `tests/test_timestamp_freshness.py`: `TestResolveSampleKey` rewritten as `TestResolveSamplePath` for new tuple semantics; added 4 nested-pointer tests covering numeric/string/fresh/datetime cases. Existing single-segment tests preserved as regression coverage.
- `test_stale_nested_pointer_numeric_epoch_warns`, `test_stale_nested_pointer_string_epoch_warns`, `test_fresh_nested_pointer_epoch_passes`.

Resolver priority (output_name > from_json_pointers > from_input_field > null-fallback) preserved across the refactor. Backward compat for cdn-insights/bot-insights/akamai_ds2 (single-segment pointers) confirmed by existing tests still passing.

Test plan
- `python3 -m pytest tests/`: 135/135 pass
- `cargo test --bin bundle-validator`: 60/60 pass
- `cargo fmt` clean
- `cargo clippy --bin bundle-validator -- -D warnings` clean (4 pre-existing `--all-targets` clippy errors in `verify.rs` and `summary_table_references.rs` are unrelated)
- trafficpeak/siem fixture (current value `1491303422`) gets auto-refreshed when its `.originals/` re-runs the configurator pipeline

Out of scope
`.originals/` will auto-refresh it.

🤖 Generated with Claude Code
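As a reading aid for the resolver priority and path-tuple semantics described under Changes, here is a rough sketch of how such a lookup could work. It is not the code in `transform_organizer.py`: the names `pointer_to_path`, `get_at_path`, `set_at_path`, `resolve_path` and `as_epoch`, and the column dict keys `name`, `from_json_pointers` and `from_input_field`, are assumptions for illustration. Returning `None` stands in for the null-fallback, whose exact behavior the description does not spell out.

```python
from typing import Any, Optional, Tuple

Path = Tuple[str, ...]


def pointer_to_path(pointer: str) -> Path:
    """Split a JSON pointer such as "/httpMessage/start" into segments."""
    if not pointer.startswith("/"):
        return ()
    # RFC 6901 escapes: "~1" means "/", "~0" means "~".
    return tuple(
        seg.replace("~1", "/").replace("~0", "~")
        for seg in pointer[1:].split("/")
    )


def get_at_path(sample: Any, path: Path) -> Any:
    """Walk nested dicts; return None when any segment is missing."""
    node = sample
    for seg in path:
        if not isinstance(node, dict) or seg not in node:
            return None
        node = node[seg]
    return node


def set_at_path(sample: dict, path: Path, value: Any) -> None:
    """Overwrite the value at an existing path in place."""
    node = sample
    for seg in path[:-1]:
        node = node[seg]
    node[path[-1]] = value


def resolve_path(col: dict, sample: dict) -> Optional[Path]:
    """Try output name, then JSON pointers, then input field (assumed keys)."""
    candidates = [(col["name"],)]
    candidates += [pointer_to_path(p) for p in col.get("from_json_pointers", [])]
    if col.get("from_input_field"):
        candidates.append((col["from_input_field"],))
    for path in candidates:
        # An explicit null in the sample is treated like a missing key here.
        if path and get_at_path(sample, path) is not None:
            return path
    return None


def as_epoch(value: Any) -> Optional[int]:
    """Accept an integer epoch or a string holding one; floats are ignored."""
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str) and value.strip().isdecimal():
        return int(value.strip())
    return None


sample = {"httpMessage": {"start": "1491303422"}}
col = {"name": "startTime", "from_json_pointers": ["/httpMessage/start"]}
path = resolve_path(col, sample)
epoch = as_epoch(get_at_path(sample, path))
set_at_path(sample, path, "1700000000")
```

A flat lookup by `startTime` finds nothing in this sample, which is the shape that previously slipped through. Resolving to a tuple lets the same read and write helpers handle single-segment and multi-segment pointers.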