
Harden DSL execution reliability: lock down show path, unify abort handling, isolate thread-local state #3

Open
kmdvs wants to merge 47 commits into cleanup-history from harden-dsl-reliability
@kmdvs kmdvs commented Mar 16, 2026

Summary

This PR hardens DSL execution reliability and safety in three focused commits, prioritized for production stability in a multi-threaded Rails/Sidekiq environment.

Changes

0257f34 - Harden statement show path and extract webpage refresh orchestration

  • StatementsController#show no longer executes DSL during page load (removes unauthenticated request-time scraping/eval path).
  • Refresh orchestration extracted into a service used by both controller and job.
  • Background jobs no longer instantiate controllers.

c4e085d - Unify abort-update helper semantics and fix non-trace DSL result handling

  • Unified abort_update contract handling in helper paths.
  • Fixed non-trace bug where data, = run_dsl(...) truncated multi-value DSL outputs.
  • Added regression coverage for Wringer abort handling and refresh error propagation.
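
The non-trace bug fixed in c4e085d comes down to Ruby's multiple assignment. A minimal sketch, assuming a stand-in `run_dsl` that returns several values (the real method's signature is not shown in this PR):

```ruby
# Hypothetical stand-in for the real run_dsl; returns a multi-value DSL output.
def run_dsl(_algorithm)
  ["Title A", "Title B", "Title C"]
end

# Buggy form: the trailing comma destructures the array,
# so only the first element survives.
def truncated_result(algorithm)
  data, = run_dsl(algorithm)
  data
end

# Fixed form: plain assignment keeps every value.
def full_result(algorithm)
  data = run_dsl(algorithm)
  data
end
```

With this stand-in, `truncated_result` yields a single string while `full_result` yields the whole array, which is the difference the regression test needs to pin down.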

b5b7677 - Harden DSL runner control flow and isolate thread-local state

  • Restored deterministic short-circuit behavior for if_xpath / unless_xpath.
  • Added SyntaxError handling so malformed DSL returns abort_update instead of raising.
  • Added ensure-based thread-local restore/cleanup to prevent DSL state leakage across requests and Sidekiq jobs.
  • Aligned trace/non-trace runner/tracer contract behavior.
  • Thread-local usage is retained for compatibility with the current DSL model; cleanup prevents cross-execution leakage.
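
The ensure-based restore described for b5b7677 can be sketched as follows. The key name `DSL_STATE_KEY` and the helper `with_dsl_state` are assumptions for illustration; the PR does not show the real thread-local keys.

```ruby
# Hypothetical key name; the real thread-local keys are not shown in this PR.
DSL_STATE_KEY = :dsl_state

# Runs the block with its own DSL state and always restores the previous
# value afterwards, even when the block raises.
def with_dsl_state(state)
  previous = Thread.current[DSL_STATE_KEY]
  Thread.current[DSL_STATE_KEY] = state
  yield
ensure
  Thread.current[DSL_STATE_KEY] = previous
end
```

Because the restore lives in `ensure`, a Sidekiq worker thread that is reused for the next job sees whatever state was present before the failed run, not the state the failed run left behind.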

Why this split

Each commit is atomic and reviewable by concern:

  • Web/job boundary and execution surface hardening
  • Helper contract reliability
  • DSL runtime engine safety and concurrency behavior

Test Coverage

  • Added/updated controller tests for statement show trace behavior.
  • Added/updated job + service tests for refresh orchestration delegation.
  • Added helper tests for abort handling and non-trace data preservation.
  • Added DSL runner tests for syntax error aborts, short-circuit semantics, and thread-local restoration.
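
For the syntax-error abort case, a minimal sketch of the behavior under test might look like this. `eval_step` is a hypothetical helper; the payload shape follows the `["abort_update", { error_type:, error: }]` structure that appears in the review diffs of this PR. Note that `SyntaxError` descends from `ScriptError`, so a bare `rescue` (which only catches `StandardError`) would not catch it.

```ruby
# Hypothetical helper: evaluates one Ruby transformation step and converts
# a SyntaxError into an abort_update payload instead of raising.
def eval_step(code)
  eval(code)
rescue SyntaxError => e
  # SyntaxError is a ScriptError, so it must be rescued explicitly.
  ["abort_update", { error_type: "SyntaxError", error: e.message }]
end
```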

Risk / Compatibility Notes

  • StatementsController#show no longer executes runtime DSL tracing.
  • Refresh behavior is functionally preserved, but orchestration now runs through a service object.
  • DSL engine behavior is made more deterministic under failure and concurrent execution.

Follow-up (out of scope for this PR)

  • Add bounded DSL execution budgets (step count, runtime, output size).
  • Add queue backpressure controls for large webpage fanouts.
  • Move toward a dedicated DSL execution engine boundary.
  • Reduce reliance on Ruby eval in transformation steps.

Reviewer Checklist

  • Confirm no DSL execution from GET /statements/:id.
  • Confirm jobs no longer instantiate controllers.
  • Confirm abort_update handling is consistent across helper/runner paths.
  • Confirm thread-local DSL state is cleared/restored on all execution paths.
  • Confirm regression tests cover the reliability fixes.

@kmdvs kmdvs requested a review from saumier March 16, 2026 23:49
@kmdvs kmdvs added the enhancement New feature or request label Mar 16, 2026

kmdvs commented Mar 17, 2026

Follow-up commit pushed:

Fix DSL regression semantics and stabilize test reliability.

This addresses failing tests uncovered after the initial PR push.

kmdvs2 added 3 commits March 17, 2026 03:43
This extends the DSL hardening work with a full pipeline evaluation and diagnosis system.

Includes:

- PipelineEvaluator: deterministic evaluation from persisted statements (no DSL rerun)
- PipelineInterpreter: stable metrics (facts layer)
- PipelineDiagnosis: rule-based diagnosis with Wringer signal integration
- SemanticInterpreter: step-level semantic support

Trace improvements:
- Enhanced DslTraceCollector and TraceFormatter
- Step-level visibility (input/output/error)
- Improved debugging for DSL scripts

Wringer integration:
- WringerClient introduced
- Signals injected into diagnosis (no network coupling in diagnosis layer)

API:
- /events/:id/pipeline_health endpoint
- Returns status, category, message, suggested_action, metrics, details

UI:
- Updated statements view to support trace inspection

Tests:
- Full TDD coverage for interpreter, diagnosis, evaluator, controller
- Priority rules enforced (error > wringer > data_loss > navigation > extraction)

Notes:
- Steps remain internal (not exposed in API)
- Wringer is contextual (not part of pipeline execution)
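
The priority rule stated in the Tests section (error > wringer > data_loss > navigation > extraction) could be expressed roughly as below. The category symbols and the `primary_category` helper are illustrative assumptions, not the names used in PipelineDiagnosis.

```ruby
# Order taken from the commit message; the symbols themselves are illustrative.
PRIORITY = %i[error wringer data_loss navigation extraction].freeze

# Returns the highest-priority category among those detected, or nil if none.
def primary_category(detected)
  PRIORITY.find { |category| detected.include?(category) }
end
```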

kmdvs commented Mar 22, 2026

@codex make a deep review of the latest changes.

You are a senior Ruby on Rails engineer reviewing a DSL pipeline observability system.

Focus ONLY on correctness, architecture, and edge cases. Do NOT comment on style.

This PR introduces:

  • PipelineEvaluator (orchestration layer)
  • PipelineInterpreter (metrics / facts)
  • PipelineDiagnosis (rule-based diagnosis with Wringer integration)
  • DSL tracing improvements
  • WringerClient for fetch metadata

Review the diff with focus on:

  1. Wringer integration
  • Are Wringer signals correctly captured, normalized, and propagated?
  • Do WringerClient outputs match PipelineDiagnosis expectations?
  • Can wringer_failure? ever fail to trigger due to mismatched fields?
  2. Layering and architecture
  • Is PipelineEvaluator pure orchestration?
  • Does PipelineInterpreter depend only on steps?
  • Does PipelineDiagnosis depend only on metrics + wringer?
  • Any leakage or coupling between layers?
  3. Edge cases
  • Empty pipelines (no steps)
  • Extraction after navigation only
  • Empty extraction followed by error
  • Nil or malformed statement.cache
  • Multiple navigation steps
  • Recovery after failure
  4. Trace vs diagnosis consistency
  • Does trace accurately reflect what metrics detect?
  • Any mismatch between failure_step and actual root cause?
  • Is wringer consistent between trace and diagnosis?
  • Any unsafe or inconsistent trace output?
  5. Thread safety
  • Any leakage of thread-local state?
  • Are ensure blocks correctly restoring state?
  6. Identify concrete bugs or incorrect behavior
  • Not hypothetical — point to specific code paths and failure scenarios

Do NOT summarize the code.
List only real issues, risks, or inconsistencies.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e572aa6d44

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread app/helpers/cc_wringer_helper.rb Outdated
Comment on lines +124 to +125
else
{ body: resp.to_s, http_code: 200, final_url: nil }

P1 Badge Preserve boolean results in safe_wringer_call

The new safe_wringer_call stringifies any non-HTTP return value via resp.to_s and returns that string. wringer_received_404? passes a boolean-returning block through this helper, so the normal false path becomes "false", which is truthy and makes !!result return true. In RefreshWebpageJob#perform, that causes successful non-404 lookups to be treated as stored 404s and the webpage gets destroyed after refresh.
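
The failure mode is easy to reproduce in isolation. The sketch below reduces the quoted non-HTTP branch to a hypothetical `stringify_result` and shows one possible fix, `preserve_boolean`; both names are assumptions and neither is the actual helper code.

```ruby
# Hypothetical reduction of the non-HTTP branch quoted above.
def stringify_result(resp)
  { body: resp.to_s, http_code: 200, final_url: nil }
end

# One possible fix (an assumption): pass booleans through untouched so that
# callers such as wringer_received_404? still see a real true/false.
def preserve_boolean(resp)
  return resp if resp == true || resp == false

  stringify_result(resp)
end
```

In the stringifying form, `false` becomes the string `"false"`, which is truthy in Ruby, so any `!!result` check reports true.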


Comment thread app/services/dsl/wringer_client.rb Outdated
Comment on lines +14 to +15
body = @safe_wringer_call.call do
@agent.get_file(@use_wringer.call(url, render_js, scrape_options))

P1 Badge Fetch Wringer metadata from a response object, not get_file

WringerClient still calls @agent.get_file(...), which only yields the response body. In safe_wringer_call, those results go through the non-response branch and are normalized as http_code: 200, final_url: nil, so every rule in config/wringer.yml that depends on the real status or redirect target (redirect_to_listing, http_404, http_5xx) is unreachable for DSL fetches. A listing redirect or real 404 therefore looks like a normal HTML page and never propagates Wringer status.


Comment on lines 162 to +166
if data.is_a?(Array) && data.first == "abort_update"
info = data.second || {}
stat.errors.add(:base, "Scrape aborted (#{info[:error_type]}): #{info[:error]}")
return
abort_error_message = "Scrape aborted (#{info[:error_type]}): #{info[:error]}"
stat.errors.add(:base, abort_error_message)
error_messages << abort_error_message

P2 Badge Stop processing datatype formatting after abort_update

This branch no longer returns after an abort_update. For xsd:anyURI statements, the method now falls through into format_datatype, which iterates the abort payload array and tries to reconcile 'abort_update' (and even the error hash) as if they were scraped URLs. On a Wringer failure for ticket/location links, that produces bogus KG lookups and can overwrite the cache with corrupted URI data instead of preserving the fetch failure.
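
A guard of the following shape would keep the abort payload away from datatype formatting. This is a sketch under assumptions: `format_or_abort` stands in for the real method, and the `strip` call is a placeholder for `format_datatype`.

```ruby
# Matches the abort payload shape used in the diff above: ["abort_update", { ... }].
def abort_update?(data)
  data.is_a?(Array) && data.first == "abort_update"
end

# Hypothetical stand-in for the real method; formatting only runs on scraped data.
def format_or_abort(data, error_messages)
  if abort_update?(data)
    info = data[1] || {}
    error_messages << "Scrape aborted (#{info[:error_type]}): #{info[:error]}"
    return nil # stop here so the payload is never treated as scraped URLs
  end

  data.map(&:strip) # placeholder for format_datatype
end
```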


Comment thread app/services/dsl/pipeline_evaluator.rb Outdated
def evaluate
steps = pipeline_steps
metrics = Dsl::PipelineInterpreter.new(steps).metrics
diagnosis = Dsl::PipelineDiagnosis.new(metrics: metrics, wringer: DEFAULT_WRINGER).result

P1 Badge Feed actual Wringer signals into pipeline diagnosis

PipelineEvaluator always instantiates PipelineDiagnosis with DEFAULT_WRINGER, even though the runner and trace now capture Wringer status as { error_type, retry, cache }. That means /events/:id/pipeline_health can never emit :wringer_failure for Cloudflare/queue/404 cases; those signals are dropped before diagnosis and the endpoint falls back to a generic :error or even :healthy result based only on statement cache.


Comment thread app/services/dsl/pipeline_evaluator.rb Outdated
Comment on lines +35 to +36
type: infer_step_type(statement.source&.algorithm_value),
output: normalize_output(statement.cache),

P2 Badge Expand evaluator input to real DSL steps before interpreting

Each Statement is collapsed into one synthetic step whose type is just the first prefix of the entire DSL string. For any real algorithm like xpath=...;url=...;xpath=..., the interpreter only ever sees xpath, so has_navigation, suspicious_navigation, and failure_step cannot represent later navigation or extraction failures. The new pipeline health endpoint is therefore systematically misleading for multi-step statements, which are the common pipelines this feature is meant to explain.
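
To illustrate the expansion being asked for, here is a rough sketch that splits an algorithm string into typed steps. It assumes steps are separated by `;` and shaped as `type=argument`, as in the example above; the real DSL grammar may need a proper parser (for instance, if `;` can appear inside an XPath literal). `expand_steps` is a hypothetical name.

```ruby
# Assumption: steps are separated by ";" and each step is "type=argument".
def expand_steps(algorithm)
  algorithm.to_s.split(";").map(&:strip).reject(&:empty?).map do |raw|
    # Split on the first "=" only, so "=" inside the argument is preserved.
    type, argument = raw.split("=", 2)
    { type: type, argument: argument }
  end
end
```

With one step per DSL segment, the interpreter can see a `url` step that follows an `xpath` step instead of only the first prefix.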


kmdvs2 added 2 commits March 24, 2026 18:23
…gration

- unify DSL execution, tracing, and pipeline evaluation layers
- add pipeline diagnosis and interpreter improvements
- enhance wringer client with richer metadata (redirects, signals)
- stabilize DSL runner semantics and error handling
- improve trace formatter with severity-aware output
- align tests with DSL contract and reliability guarantees

Establishes a consistent execution model across DSL runner, trace collection,
pipeline evaluation, and wringer integration.

Foundation for improved debugging and observability.
…ive/trace)

- remove Fast mode (incomplete and confusing)
- rename Mode → View and add explicit Trace status
- unify semantic styling across all views (state-based CSS)
- make wringer metadata always visible (redirects, links, probe)
- align Debug with Interactive (expanded) while reducing noise
- remove raw trace from Interactive (separation of concerns)
- improve condition labeling ([If found])
- add color-coded semantic states (ok/warning/error)
- improve readability with consistent indentation and layout

Result:
- Debug = full pipeline overview (flat, readable)
- Interactive = step-by-step debugger
- Trace = technical execution log
- Clean = DSL only

Transforms trace viewer into a coherent debugging interface.

kmdvs commented Mar 24, 2026

@codex You are a very senior Ruby on Rails engineer reviewing this branch after a substantial stacked change set.

Please perform a deep code review of the current branch, with special attention to regressions, architectural coherence, and hidden edge cases.

Context:
This branch extends earlier DSL reliability work and now includes:

  • DSL runner hardening
  • pipeline evaluation / diagnosis / interpreter changes
  • Wringer metadata propagation (redirects, signals)
  • trace formatter updates
  • statement trace/debug UI overhaul
  • options UI/view mode cleanup
  • new and updated tests

Review goals:

  1. Correctness
  • Look for behavior regressions, nil risks, incorrect assumptions, broken conditionals, bad fallbacks, inconsistent data shapes, and view/helper mismatches.
  • Pay special attention to trace-enabled vs trace-disabled behavior.
  • Check whether Debug / Interactive / Trace / Clean modes are internally consistent.
  2. Architecture
  • Verify that responsibilities remain clean:
    • Wringer provides facts
    • Condenser interprets/diagnoses
    • UI presents meaning
  • Identify any leakage of interpretation into lower layers.
  • Call out duplication or logic that should be consolidated.
  3. UI / helper coherence
  • Review statements_helper + statements/show integration carefully.
  • Check whether view logic is too complex or brittle.
  • Verify semantic styling is consistent and view-independent.
  • Look for any remaining inconsistencies between Debug and Interactive.
  4. Tests
  • Identify missing tests for edge cases.
  • Call out places where current tests may give false confidence.
  • Suggest the highest-value additional test cases.
  5. Safety / reliability
  • Review redirect handling, wringer metadata usage, trace compaction/degradation, and DSL failure propagation.
  • Look for cases where misleading diagnostics could still appear.

Output format:

  • First: top 5 most important findings, ordered by severity.
  • Then: medium-priority findings.
  • Then: low-priority cleanup suggestions.
  • For each finding, include:
    • file/path
    • why it matters
    • suggested fix
  • Distinguish clearly between:
    • real bugs / regressions
    • design concerns
    • optional cleanup

Important:

  • Be critical and precise.
  • Do not praise unless it helps explain a tradeoff.
  • Focus on actionable findings, not generic style comments.

@kmdvs kmdvs temporarily deployed to footlight-condenser-test March 24, 2026 22:35 Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9920ddc445


new_url = Dsl::UrlResolver.extract(raw)
if new_url.blank?
Rails.logger.debug { "[DSL] skipped invalid URL from #{raw.inspect}" }
return skip

P1 Badge Fail fast when URL resolution returns nil

When Dsl::UrlResolver.extract returns nil, this path returns :skip, which makes handle_url_step silently keep prior state (arr, @url, and @page) instead of failing. In pipelines where url=$array.first becomes invalid after filtering, later extraction steps run against stale page context and can write incorrect data while appearing successful. Return an abort_update payload (as the previous implementation did for invalid URLs) so refresh surfaces a real error instead of silently degrading.
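
A fail-fast variant might look like the sketch below. `resolve_url_step` is hypothetical, `resolved` stands in for the result of `Dsl::UrlResolver.extract(raw)`, and the `invalid_url` error type is an illustrative value rather than one confirmed by the codebase.

```ruby
# Hypothetical reduction of the url step. `resolved` stands in for the
# result of Dsl::UrlResolver.extract(raw).
def resolve_url_step(raw, resolved)
  if resolved.nil? || resolved.strip.empty?
    # Surface a real failure instead of silently keeping the previous page.
    return ["abort_update", { error_type: "invalid_url", error: "no URL in #{raw.inspect}" }]
  end

  resolved
end
```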


flash[:notice] = "Statement was successfully refreshed."
end

redirect_to @statement

P2 Badge Restore JSON responses for statement refresh

The action now unconditionally redirects, so PATCH /statements/:id/refresh.json no longer returns structured JSON success/error payloads and instead emits an HTML redirect. Any API caller relying on the documented .json endpoint will lose machine-readable refresh results and error details. Reintroduce respond_to handling so HTML keeps redirect behavior while JSON responses are returned with appropriate status codes.


def abort_structure?(obj)
obj.is_a?(Array) &&
obj.length == 2 &&
obj.first == "abort_update" &&

P2 Badge Handle non-abort Wringer actions in client fetch

safe_wringer_call now supports policy actions beyond abort_update (for example, custom actions like skip), but WringerClient only treats ['abort_update', payload] as a control result. Any other action tuple is misclassified as a successful fetch body, so policy decisions are ignored and downstream parsing can run on non-HTML control payloads. Normalize and branch on the action token generically (or explicitly handle all supported actions) instead of hard-coding only abort_update.
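
One way to branch on the action token generically is sketched below. It assumes every control result is a two-element `[action_string, payload_hash]` tuple, which matches the `abort_update` shape quoted above but is otherwise an assumption; `control_result?` and `classify_fetch` are hypothetical names.

```ruby
# Treats any two-element [String, Hash] tuple as a control result,
# rather than hard-coding "abort_update".
def control_result?(obj)
  obj.is_a?(Array) && obj.length == 2 && obj[0].is_a?(String) && obj[1].is_a?(Hash)
end

# Separates policy decisions from fetched bodies so that control payloads
# are never parsed as HTML.
def classify_fetch(result)
  return { kind: :control, action: result[0], payload: result[1] } if control_result?(result)

  { kind: :body, body: result }
end
```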


Comment thread config/routes.rb
Comment on lines +36 to +38
get 'events/:id/pipeline_health',
to: "events#pipeline_health",
as: :event_pipeline_health

P2 Badge Accept slash-containing IDs for pipeline health route

This route captures the event identifier in a standard :id segment, which does not safely support identifiers containing / (common for URI-shaped RDF IDs). Requests for such events will fail routing before reaching EventsController#pipeline_health, making the endpoint unusable for a subset of real identifiers. Use a query parameter or a glob segment (e.g., *id) so full URI-like IDs are routable.


kmdvs2 added 2 commits March 25, 2026 15:53
- Extract inline CSS and JS into dedicated asset files
- Add robust JS initialization (DOMContentLoaded + turbo:load)
- Scope JS execution to statements/show page
- Simplify view by relying on TracePresenter (visibility, trace state)
- Consolidate top-level locals and improve readability

No functional changes; improves separation of concerns and maintainability

kmdvs commented Mar 25, 2026

@codex You are a very senior Ruby on Rails engineer performing a deep code review.

Context:
This repository implements a DSL-based data extraction pipeline with the following flow:

Wringer (fetch) → DSL runner → pipeline diagnosis → trace → presenter → view

Recent changes include:

  • DSL runner hardening and state handling fixes
  • Introduction of structured signals and hints in network layer
  • Propagation of redirect metadata (final_url, redirect_chain)
  • Introduction of TracePresenter for UI logic (mode, visibility, error detection)
  • Refactoring of controller/view responsibilities
  • Extraction of CSS/JS and view cleanup

Your task:
Perform a deep, critical review focused on correctness, reliability, and architectural coherence.


REVIEW AREAS

  1. Data Contract Consistency (HIGH PRIORITY)
  • Verify all pipeline stages return consistent shapes:
    • scrape / scrape_using_post
    • DSL execution outputs
    • trace steps
  • Check that keys are always present or consistently absent:
    • html
    • signals (Hash)
    • hints (Array)
    • final_url (String or nil)
    • redirect_chain (Array)
  • Identify inconsistent return types (nil vs [] vs {})
  2. Trace Integrity (HIGH PRIORITY)
  • Verify trace step structure is consistent across:
    • success paths
    • error paths
    • partial execution
  • Detect:
    • symbol vs string key inconsistencies
    • mutation of shared objects
    • missing fields in edge cases
  • Ensure presenter assumptions match actual trace shape
  3. Error Handling & Failure Semantics (HIGH PRIORITY)
  • Identify:
    • rescue StandardError blocks that swallow errors silently
    • inconsistent signaling of failure vs success
    • cases where execution continues after failure incorrectly
  • Ensure:
    • failures produce meaningful signals/hints
    • no misleading “success” states with empty or partial data
  4. Wringer Contract Propagation (HIGH PRIORITY)
  • Verify final_url and redirect_chain are:
    • consistently set or explicitly nil/empty
    • preserved through DSL → trace → presenter
  • Check for:
    • duplication or inconsistent representation of redirect_chain
    • loss of metadata in intermediate steps
  5. Architecture Coherence (MEDIUM PRIORITY)
  • Verify separation of concerns:
    • Wringer: fetch + raw metadata only
    • DSL runner: execution only
    • Diagnosis: interpretation only
    • Presenter: UI-facing logic only
    • View: rendering only
  • Identify:
    • logic leaking into helpers or views
    • presenter doing normalization instead of interpretation
    • controller doing interpretation
  6. Reliability & Edge Cases (MEDIUM PRIORITY)
  • Identify:
    • partial state updates
    • inconsistent fallbacks
    • unsafe assumptions (presence of keys, types)
  • Look for:
    • nil dereferencing risks
    • unsafe URI handling
    • redirect edge cases
  7. Performance (LOW PRIORITY)
  • Identify:
    • repeated parsing or transformations
    • unnecessary allocations (e.g. repeated .to_h, .uniq)
    • redundant operations in loops
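
As a concrete reading of the data-contract item above, a normalizer along these lines would guarantee one shape for every stage. The key names come from the list in review area 1; the defaults and the name `normalize_fetch_result` are one possible convention, not the project's confirmed contract.

```ruby
# Key names come from the review prompt; the defaults are one possible convention.
def normalize_fetch_result(raw)
  raw ||= {}
  {
    html: raw[:html],                           # String or nil
    signals: raw[:signals] || {},               # always a Hash
    hints: Array(raw[:hints]),                  # always an Array
    final_url: raw[:final_url],                 # String or nil
    redirect_chain: Array(raw[:redirect_chain]) # always an Array
  }
end
```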

OUTPUT FORMAT

Return findings grouped by severity:

HIGH:

  • file
  • issue
  • why it matters
  • suggested fix

MEDIUM:

  • same format

LOW:

  • same format

IMPORTANT RULES

  • Focus only on real correctness, reliability, or architectural issues
  • Do NOT report style-only issues
  • Avoid trivial suggestions (naming, formatting)
  • Prioritize issues that could cause:
    • incorrect data
    • inconsistent behavior
    • debugging difficulty
    • future regressions

GOAL

The goal is to ensure:

  • deterministic pipeline behavior
  • consistent data contracts
  • clean separation of concerns
  • reliable error handling

This is not a lint — this is a production-grade system review.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5300a07a6f


Comment on lines +236 to +247
step_payload[:w] = {
i: event[:wringer][:inherited],
et: event[:wringer][:error_type],
r: event[:wringer][:retry],
c: event[:wringer][:cache],
u: event[:wringer][:unreachable],
r404: event[:wringer][:received_404],
se: event[:wringer][:system_error],
pa: event[:wringer][:policy_action],
s: event[:wringer][:signals],
h: event[:wringer][:hints]
}.compact

P1 Badge Preserve redirect metadata in compact trace

The v2 session formatter drops final_url/redirect_chain when compacting wringer data, so redirect metadata emitted by the fetch layer is lost before it reaches StatementsHelper#interactive_redirect_info and the trace UI. In practice, a run can correctly detect/record a redirect in the runner, but after for_session_v2 + expand, the view cannot tell that redirection happened, which breaks the new redirect propagation contract and makes debugging navigation failures much harder.
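
A compaction that keeps the redirect fields could look like the sketch below. The short keys `fu` and `rc` are the ones named in the later fix commit on this PR; `compact_wringer` itself is a hypothetical, trimmed-down helper that omits most of the other keys.

```ruby
# Trimmed-down sketch of the compact wringer payload with redirect metadata kept.
def compact_wringer(wringer)
  {
    et: wringer[:error_type],
    pa: wringer[:policy_action],
    fu: wringer[:final_url],      # final URL after redirects
    rc: wringer[:redirect_chain]  # list of intermediate URLs
  }.compact # drops keys whose value is nil
end
```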


Comment thread app/presenters/trace_presenter.rb Outdated
@@ -0,0 +1,52 @@
class TracePresenter
DEFAULT_MODE = 3
VALID_MODES = [1, 2, 3, 4].freeze

P2 Badge Accept debug view mode cookie value 5

TracePresenter rejects mode 5, but the options flow explicitly sets trace_view_mode=5 for Debug and the statement view has a dedicated debug branch (is_debug). Because this whitelist excludes 5, selecting Debug always falls back to mode 3, so the debug rendering path is effectively unreachable for users.
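
The whitelist fix is small. The sketch below extends the constants quoted above and adds a hypothetical `resolve_mode` helper to show the fallback behavior; how the presenter actually reads the cookie is not shown in this PR.

```ruby
DEFAULT_MODE = 3
VALID_MODES = [1, 2, 3, 4, 5].freeze # 5 = Debug, as set by the options flow

# Hypothetical helper: coerces a cookie value and falls back to the default
# when it is missing, non-numeric, or outside the whitelist.
def resolve_mode(cookie_value)
  mode = Integer(cookie_value.to_s, exception: false)
  VALID_MODES.include?(mode) ? mode : DEFAULT_MODE
end
```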


Comment thread app/services/dsl/pipeline_evaluator.rb Outdated
end

def effective_wringer
DEFAULT_WRINGER.merge(@wringer.slice(:unreachable, :received_404, :system_error, :policy_action))

P2 Badge Forward full wringer context to diagnosis

PipelineEvaluator strips wringer input down to four flags before calling PipelineDiagnosis, dropping fields like error_type, signals, and hints that diagnosis logic relies on for fetch-specific classification/messages. When callers pass richer wringer status (for example network_status: "failed" or WringerFetchError), this truncation can misclassify failures as non-fetch issues or healthy states.
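
The difference between truncating and forwarding can be shown with plain hashes. The contents of `DEFAULT_WRINGER` below are assumed for illustration (the real constant is not shown in this PR), and `sliced_wringer` / `full_wringer` are hypothetical names.

```ruby
# Hypothetical defaults; the real DEFAULT_WRINGER is not shown in this PR.
DEFAULT_WRINGER = {
  unreachable: false, received_404: false, system_error: false, policy_action: nil
}.freeze

# Truncating form quoted above: only four flags survive.
def sliced_wringer(wringer)
  DEFAULT_WRINGER.merge(wringer.slice(:unreachable, :received_404, :system_error, :policy_action))
end

# Forwarding form: defaults fill the gaps and every supplied field is kept.
def full_wringer(wringer)
  DEFAULT_WRINGER.merge(wringer)
end
```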


kmdvs2 added 5 commits March 25, 2026 16:34
- Preserve redirect metadata in trace formatter (fu, rc)
- Enable debug mode (accept trace_view_mode=5)
- Forward full wringer context to pipeline diagnosis
…ition

- introduce TracePresenter as single interpretation layer
- extract TraceFormatter for normalization
- unify semantic(step) and step_output usage
- decompose trace UI into partials (header, diagnosis, step)
- remove legacy trace_demo view

No behavior changes; improves consistency, safety, and maintainability
- improve DSL runner reliability
- refine pipeline diagnosis logic
- align tests with updated behavior

No functional regressions
- add WringerClient abstraction
- introduce shared source identity logic
- add tests for wringer integration
@saumier saumier temporarily deployed to footlight-condenser-test April 17, 2026 20:08 Inactive
- replace legacy flat DSL implementation with modular architecture:
  - Dsl::Core (algorithm execution)
  - Dsl::Tracing (trace collection and formatting)
  - Dsl::Parsing (content parsing)
  - Dsl::Support (url resolution, wringer client)
- remove legacy DSL classes (DslAlgorithmRunner, DslRunner, etc.)
- preserve DSL execution semantics and abort_update contract
- update tests to target new module structure

All existing DSL tests pass with no behavior change.
@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 15, 2026 06:50 Inactive
@saumier saumier removed their assignment May 16, 2026

@saumier saumier left a comment


Hi Kim. I am not able to review this. 385 file changes! Much too big and I don't understand enough to add value here.

@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 20, 2026 16:48 Inactive
@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 20, 2026 17:11 Inactive
@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 20, 2026 20:12 Inactive
@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 21, 2026 17:33 Inactive

kmdvs commented May 21, 2026

@codex You are reviewing a Rails codebase on branch harden-dsl-reliability.

Context:
Distillator is close to being deployed as a second-production transition tool replacing legacy Wringer gradually. The current work focuses on transition IA/UX, promotion readiness, transition checks, cache diagnostics, and operator safety. The transition flow must help an operator understand whether a website can move from legacy Wringer production to shadow comparison to active Condenser production.

Recent changes already added:

  • Transition report detail page with operator-first structure:
    • Decision
    • Current checks
    • Primary blocker
    • Next action
    • Checked scope
    • collapsed Diagnostics
    • collapsed Audit
  • Causal transition check states:
    • passed
    • failed
    • missing
    • stale
    • not_evaluated
    • blocked_by_fetch
    • inconclusive
  • Real transition checks for representative webpages, statement refresh, and export comparison.
  • Actionable fetch blocker links:
    • Open failed cache result
    • Compare Condenser vs Wringer
    • Open active Wringer cache
    • Open Condenser cache
  • Flash messages now distinguish:
    • statements not evaluated
    • export blocked by fetch
    • inconclusive checks
  • Diagnostics should not contain transition-check buttons.
  • Operator context card should be suppressed where it adds empty/noisy sections.
  • Tests currently pass, including transition status/check/report/cache controller tests.

Important staging observations:

  1. Top nav currently starts:
    all websites | webpages | sources | statements | ...
    It should always show the current website name as a hyperlink immediately after all websites when website context is available:
    all websites | Centre de Création Diffusion de Gaspé | webpages | sources | statements | ...
    This matters because otherwise there is no obvious way back to the website page from transition/cache/detail pages.

  2. On Condenser cache detail, the Identity card currently shows:
    Normalized URL: https://...
    but it is plain text. It should be hyperlinked to the real source page with target="_blank" and rel="noopener" when it is an HTTP/HTTPS URL.

  3. A staging cache failure currently shows an empty_body-like diagnosis, but the trace shows the origin actually returned HTTP 200 HTML. The native fetch got HTML, then a Wringer/content rule matched generic_error_text, content_success became false, policy_action became abort_update, and the cache stored no HTML/body. This should not be explained as a simple empty body. The UI should distinguish:

    • actual origin empty body
    • origin returned HTML but content policy rejected it
    • local stored body empty because update was aborted
    • transport/network failure
  4. For the failing URL:
    https://centredecreationdiffusiondegaspe.com/programmation/koros-experiences-vr/
    the browser/source clearly contains valid WordPress HTML and JSON-LD. Therefore, if the cache detail says empty_body, review whether CacheHealthMaterializer, CacheHealth, or issue selection is incorrectly prioritizing empty_body over generic_error_text / content rejection.
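
The four cases in observation 3 could be told apart with a small classifier along these lines. This is only a sketch: the method name and every keyword argument (`http_code`, `fetched_bytes`, `stored_bytes`, `policy_action`, `transport_error`) are assumptions for illustration, not existing Distillator fields.

```ruby
# Hypothetical classifier: all names here are illustrative assumptions.
def classify_cache_failure(http_code:, fetched_bytes:, stored_bytes:, policy_action: nil, transport_error: nil)
  # No HTTP response at all: the request never completed.
  return :transport_failure if transport_error || http_code.nil?
  # The origin answered but sent zero bytes.
  return :origin_empty_body if fetched_bytes.to_i.zero?
  # The origin sent content, but the content policy refused to store it.
  return :content_policy_rejected if policy_action.to_s == "abort_update"
  # Content arrived and no policy objected, yet nothing was stored.
  return :stored_body_empty if stored_bytes.to_i.zero?

  :ok
end
```

The ordering is the point: the origin response is checked before the local storage decision, so a non-empty 200 response that was aborted by policy is never reported as a plain empty body.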

Files to inspect first:

  • app/views/layouts/_header.html.erb
  • app/views/layouts/application.html.erb
  • app/helpers/application_helper.rb
  • app/helpers/operator_context_helper.rb
  • app/helpers/websites_helper.rb
  • app/views/shared/_transition_context.html.erb
  • app/views/distillator/shadow_reports/show.html.erb
  • app/helpers/distillator/shadow_reports_helper.rb
  • app/controllers/distillator/cache_controller.rb
  • app/views/distillator/cache/show.html.erb
  • app/helpers/distillator/cache_helper.rb
  • app/services/distillator/cache_link_resolver.rb
  • app/services/distillator/fetch_cache_store.rb
  • app/services/distillator/native_fetch.rb
  • app/services/distillator/cache_health_materializer.rb
  • app/services/distillator/cache_health.rb
  • app/services/distillator/wringer_issue_set.rb
  • app/services/distillator/wringer_rules.rb
  • config/distillator/wringer_rules.yml or equivalent rules file
  • test/controllers/distillator/cache_controller_test.rb
  • test/controllers/distillator/shadow_reports_controller_test.rb
  • test/controllers/websites_controller_test.rb
  • test/integration/distillator_transition_copy_test.rb
  • test/integration/harmonized_ui_inventory_test.rb
  • test/services/distillator/cache_health_materializer_test.rb
  • test/services/distillator/fetch_cache_store_test.rb
  • test/services/distillator/native_fetch_test.rb
  • test/services/distillator/wringer_rules_test.rb

Review goals:

  1. Verify the transition flow is safe and understandable for a production operator.
  2. Find code smells, brittle state derivation, duplicated readiness logic, or misleading labels.
  3. Confirm the UI does not overstate activation safety.
  4. Confirm the UI gives the operator a direct diagnostic path when fetch/cache blocks activation.
  5. Confirm no read-only report/detail page accidentally fetches, refreshes, mutates rollout mode, or performs network work.
  6. Confirm causal states are consistent across flash, current checks, primary blocker, audit table, diagnostics, and promotion contract.
  7. Confirm cache failures distinguish origin response from local storage decision.
  8. Confirm the top navigation always exposes the current website page when context exists.
  9. Confirm normalized source URLs are clickable on cache detail.
  10. Confirm no obsolete “preview”, “Phase I”, or internal-only transition wording leaks into operator-facing views.

Specific review questions:

  • Is transition_context_website reliable enough for all relevant pages: transition report detail, cache detail, webpages, sources, statements, website pages?
  • Is there a risk of wrong website context from stale cookies[:seedurl] overriding a more specific object context?
  • Does top nav add the current website link in the right place without duplicating links or breaking pages with no website context?
  • Does suppressing operator context on transition detail accidentally hide useful status/actions on other pages?
  • Are direct cache links generated from the correct failing URL/cache row?
  • Does shadow_report_cache_diagnostic_links avoid rendering buttons/forms in Diagnostics?
  • Is Run transition check rendered only in the Next action section on transition detail?
  • Are causal labels consistent:
    • not evaluated
    • blocked by fetch
    • inconclusive
    • failed
    • stale
    • missing
  • Is TransitionCheckRunner::Result#flash_message mapping dependency reasons correctly?
  • Does the Audit table avoid calling dependency evidence “missing”?
  • Are TransitionEvidenceExplanation and TransitionStatus duplicating state logic in a way that can drift?
  • Is fetch failure causality correctly modeled:
    • fetch failed directly
    • statement check skipped because fetch failed
    • export check skipped because fetch failed
  • Is export comparison intentionally blocked by fetch? If yes, does the UI clearly say “not compared in this run” rather than implying export failed?
  • Does the cache detail explain whether HTML was actually fetched but rejected, versus never fetched or empty from origin?
  • Is empty_body incorrectly prioritized when a non-empty 200 HTML response matched generic_error_text?
  • Does the issue matching rule for generic_error_text seem too broad for normal WordPress pages?
  • Does issue metadata include enough matched-rule context for an operator/developer to debug false positives?
  • Are fetched-body bytes / stored-body bytes / policy action exposed without adding unnecessary DB columns?
  • Are all new links safe against non-HTTP URLs and open external pages with rel="noopener"?
  • Are all helper outputs escaped safely?

Expected improvements to recommend if missing:

  1. Top nav:
    Add current website link immediately after all websites when transition_context_website is present.

  2. Cache detail:
    Render normalized URL as a safe external link for HTTP/HTTPS URLs.

  3. Cache diagnosis:
    Add clear wording:
    Origin returned HTTP 200 HTML, but content policy rejected it. Stored cache body is empty because update was aborted.
    Do not label that as simple empty_body.

  4. Issue matching:
    If generic_error_text fires, expose rule id/pattern/snippet in capped metadata so false positives are diagnosable.

  5. Transition detail:
    Primary blocker should show:

    • URL
    • issue key
    • health summary
    • latest attempt
    • latest successful refresh
    • direct cache links
    • concise explanation of origin response vs stored cache body
  6. Tests:
    Add or verify tests for:

    • top nav website link on transition detail and cache detail
    • no website link when no context exists
    • normalized URL is clickable on cache detail
    • non-HTTP normalized URLs are not linked
    • 200 HTML rejected by generic_error_text does not display as simple empty body
    • actual zero-byte origin response still displays empty body
    • transition blocker mirrors improved cache issue label
    • no network/fetch happens on report/detail pages
    • no transition check button inside Diagnostics
    • causal labels remain consistent across flash, current checks, audit table

Output requested:

  • Give a prioritized review:
    • P0: must fix before staging/prod transition testing
    • P1: should fix before management/operator demo
    • P2: cleanup / hardening
  • For each finding, include:
    • affected files
    • why it matters operationally
    • concrete suggested change
    • suggested test coverage
  • Do not implement broad unrelated refactors.
  • Keep changes focused, TDD-first, and compatible with existing rollout semantics.
  • Preserve current passing tests.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 786b0e7bc8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +118 to +121
wringer: Distillator::FetchService.build_wringer_status(control, raw_response),
http_code: nil,
raw_body: nil
}

P1: Preserve raw HTML when Wringer aborts update

Keep the original response body in raw_body for abort-control responses. Right now control_result sets raw_body: nil, so downstream enrichment/store logic treats policy rejections as bodyless fetches, sets empty_body signals, and can misclassify a 200 HTML fetch rejected by content policy as an origin-empty failure. This directly affects cache diagnostics and the transition-blocking explanations that operators base decisions on. Populate raw_body from raw_response.body (when available) and add regression tests covering “HTTP 200 + non-empty HTML + abort_update” to ensure empty_body is not emitted.
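
A minimal sketch of the suggested change, under stated assumptions: `RawResponse` is a stand-in struct for whatever the fetch layer returns, and the `origin_empty` key is a hypothetical name for the signal that downstream code derives from the body.

```ruby
# Stand-in for the real fetch response object (assumption).
RawResponse = Struct.new(:code, :body)

# Build the result for an abort-control response while keeping the
# fetched body, so "origin sent nothing" stays distinguishable from
# "policy rejected what the origin sent".
def control_result(control, raw_response)
  body = raw_response&.body
  {
    control: control,
    raw_body: body,                  # previously hard-coded to nil
    origin_empty: body.to_s.empty?   # hypothetical derived signal
  }
end

result = control_result(:abort_update, RawResponse.new(200, "<html><body>Koros</body></html>"))
```

With the body preserved, `result[:origin_empty]` is false for a rejected 200 HTML response and true only when there was no body to begin with.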

Comment thread config/wringer.yml Outdated
Comment on lines +495 to +497
body_contains:
- "Error"
- "Erreur"

P1: Restrict generic_error_text match criteria

Tighten this rule: matching any page that contains "Error" or "Erreur" is overly broad and will flag many valid pages, which then triggers the abort_update/cache: false policy and blocks refreshes even though transport succeeded. In production this can create widespread false failures and misleading operator blockers. Add stricter predicates (e.g., scoped phrases/selectors and/or HTTP status constraints) plus tests showing that a normal WordPress HTML page with incidental “error” text does not trigger this rule.
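
One possible shape for a stricter predicate, shown as a sketch only. The choice of signals (a 5xx status, or the word appearing in the page title) is an assumption made for illustration; the real rule format in config/wringer.yml may call for something different.

```ruby
# Matches "error"/"erreur" only inside the <title> element (assumed heuristic).
ERROR_TITLE = %r{<title>[^<]*\b(?:error|erreur)\b[^<]*</title>}i

# Hypothetical replacement for a bare body_contains match.
def generic_error_page?(http_code:, body:)
  # A server-side failure status is treated as an error page outright.
  return true if http_code.to_i >= 500

  # Otherwise require the word in the title, not anywhere in the body.
  body.to_s.match?(ERROR_TITLE)
end

wordpress_page = "<html><head><title>Koros</title></head>" \
                 "<body><p>Form error messages appear here.</p></body></html>"
error_page = "<html><head><title>Erreur 404</title></head><body></body></html>"
```

Here `wordpress_page` contains the word "error" in its body but is not flagged, while `error_page` is.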

Comment thread app/views/layouts/_header.html.erb Outdated
Comment on lines +1 to +3
<%= link_to "all websites", websites_path %> |
<%= link_to "webpages", webpages_path %> |
<%= link_to "sources", sources_path %> |

P1: Restore current-website link in top navigation

Reintroduce the contextual website breadcrumb immediately after “all websites” when website context exists. This change removed the website link entirely, so operators on transition/cache/detail pages lose the direct route back to the active website and can more easily navigate in the wrong context during rollout decisions. Render a context-aware link (and omit it when no context is available), then add integration coverage for transition detail and cache detail navigation.
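
The ordering rule can be sketched in plain Ruby, independent of the ERB partial. The `Website` struct, the `top_nav_items` name, and the literal paths are all hypothetical; in the real header these would be `link_to` calls with route helpers.

```ruby
# Stand-in for the website record (assumption).
Website = Struct.new(:name, :path)

# Returns [label, path] pairs in display order. The website entry is
# inserted right after "all websites" and only when context exists.
def top_nav_items(context_website = nil)
  items = [["all websites", "/websites"]]
  items << [context_website.name, context_website.path] if context_website
  items + [["webpages", "/webpages"], ["sources", "/sources"], ["statements", "/statements"]]
end

site = Website.new("Centre de Création Diffusion de Gaspé", "/websites/1")
labels_with_context = top_nav_items(site).map(&:first)
labels_without_context = top_nav_items.map(&:first)
```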

Comment on lines +271 to +275
if row.website.distillator_mode == "shadow"
links << {
kind: :button,
label: "Run transition check",
url: distillator_transition_checks_path,

P2: Remove transition-check POST action from diagnostics links

Avoid emitting mutating buttons from this shared action-link helper, because it is used inside the Transition Report “Diagnostics” section. As written, shadow_report_actions injects a Run transition check button_to, which lets a diagnostic/read-only area perform writes and contradicts operator-safety expectations. Keep mutating actions only in the dedicated “Next action” area (or add a links-only mode for diagnostics), and add a view test asserting no form/button is rendered in Diagnostics.
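
A links-only mode could be as small as a filter on the `kind` key that the helper already sets. The sketch below assumes the action list is an array of hashes shaped like the one in the diff; `diagnostic_links` and the `:link` kind are hypothetical names.

```ruby
# Drop every mutating action so Diagnostics renders plain links only.
def diagnostic_links(actions)
  actions.reject { |action| action[:kind] == :button }
end

actions = [
  { kind: :link, label: "Open Condenser cache", url: "/distillator/cache/1" },
  { kind: :button, label: "Run transition check", url: "/distillator/transition_checks" }
]
safe_actions = diagnostic_links(actions)
```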

{
title: "Identity card",
rows: [
{ label: "Normalized URL", value: @cache_payload[:normalized_url], value_class: "cache-normalized-url" },

P2: Link normalized URL to source page with safe external attrs

Render the normalized URL as a clickable external link for HTTP/HTTPS values instead of plain text, so operators can immediately verify source content when diagnosing cache failures. The current text-only field slows triage and obscures origin-vs-cache comparisons. Add scheme validation and render with target="_blank" and rel="noopener", plus tests for HTTP/HTTPS link rendering and non-HTTP values remaining plain text.
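
The scheme check can be sketched with the standard library alone. `normalized_url_html` is a hypothetical name, and a Rails view would more likely build the tag with `link_to` than with string interpolation; the validation logic is the part being illustrated.

```ruby
require "uri"
require "erb"

# Returns an external link for HTTP/HTTPS URLs and escaped plain text
# for everything else (other schemes, malformed input, nil).
def normalized_url_html(value)
  text = ERB::Util.html_escape(value.to_s)
  uri = begin
    URI.parse(value.to_s)
  rescue URI::InvalidURIError
    nil
  end
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  return text unless uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?

  %(<a href="#{text}" target="_blank" rel="noopener">#{text}</a>)
end

linked = normalized_url_html("https://centredecreationdiffusiondegaspe.com/programmation/koros-experiences-vr/")
plain = normalized_url_html("javascript:alert(1)")
```

Escaping before interpolation keeps a value such as `"><script>` from breaking out of the attribute, whichever branch is taken.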

kmdvs commented May 21, 2026

@codex

@kmdvs kmdvs temporarily deployed to footlight-condenser-test May 21, 2026 19:36 Inactive

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc701d8180

Comment on lines +171 to +174
trace = session[:dsl_trace]
trace = trace.to_h if trace.respond_to?(:to_h)
trace = nil if trace == {}
@trace = safe_trace_copy(trace)

P1: Scope trace display to the current statement

The show action always reads session[:dsl_trace] and renders it without checking which statement produced it. Since refresh writes a single shared session trace, a user can refresh statement A and then open statement B and see A’s trace attached to B, which can lead to incorrect debugging and operator decisions. Persist the statement id with the trace and ignore/clear traces that do not match @statement.id.
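
A sketch of the suggested scoping, using a plain Hash as a stand-in for the Rails session. The `trace_for_statement` name and the payload shape are assumptions. Nested keys are strings here on the assumption that a JSON-serialized session would return them that way.

```ruby
# Return the stored trace only if it belongs to the given statement;
# otherwise clear it so a stale trace cannot reappear later.
def trace_for_statement(session, statement_id)
  entry = session[:dsl_trace]
  return nil unless entry.is_a?(Hash)

  if entry["statement_id"] != statement_id
    session.delete(:dsl_trace)
    return nil
  end

  entry["trace"]
end

session = { dsl_trace: { "statement_id" => 1, "trace" => ["step 1"] } }
own_trace = trace_for_statement(session, 1)
other_trace = trace_for_statement(session, 2)
```

The refresh action would need the matching change: write the statement id alongside the trace when storing it.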

message = "Webpage not found for URL: #{params[:url]}"

respond_to do |format|
format.html { redirect_to webpage_statements_path(url: params[:url]), alert: message }

P2: Avoid redirecting missing webpage refresh to a crashing route

When a refresh is requested for a URL that has no Webpage, this HTML branch redirects to webpage_statements_path with the same missing URL. That target action still dereferences webpage.statements without a nil guard, so the redirect turns the not-found case into a 500 instead of a graceful error response. Redirect to a safe page (or render a 404 page directly) for this branch.

Comment on lines +327 to +333
result = @dsl_binding.eval(sub(code, arr))

# sync back DSL state
@url = Thread.current[:dsl_url]
@json = Thread.current[:dsl_json]

result

P1: Preserve mutated $array when ruby step returns nil

The ruby step now returns only eval’s return value, but many in-place Ruby operations can return nil when no change occurs (for example uniq!). In that case the pipeline state becomes nil even though Thread.current[:dsl_array] still contains the intended array, and downstream steps run against the wrong state. Return the updated thread-local array as a fallback so mutating ruby steps preserve pipeline data.
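
The failure mode is easy to reproduce: `uniq!` returns nil when the array has no duplicates. The sketch below shows the fallback in isolation; `run_ruby_step` is a hypothetical simplification of the runner, and it reads the array from `Thread.current[:dsl_array]` directly rather than through `$array` substitution.

```ruby
# Evaluate a DSL ruby step and fall back to the thread-local array
# when the evaluated code returns nil (as bang methods often do).
def run_ruby_step(code, array)
  previous = Thread.current[:dsl_array]
  Thread.current[:dsl_array] = array
  result = binding.eval(code)
  result.nil? ? Thread.current[:dsl_array] : result
ensure
  # Restore the prior value so state does not leak between executions.
  Thread.current[:dsl_array] = previous
end

unchanged = run_ruby_step("Thread.current[:dsl_array].uniq!", [1, 2, 3])
deduped = run_ruby_step("Thread.current[:dsl_array].uniq!", [1, 1, 2])
```

One trade-off to keep in mind: with this fallback a step can no longer deliberately produce nil, so the runner would need another way to express that if the DSL relies on it.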

Comment on lines +70 to +77
build_session_v2(
normalized,
code_limit: 20,
error_code_limit: 80,
output_limit: 20,
initial_limit: 20,
include_duration: false
)

P2: Enforce session budget on final v2 trace fallback

for_session_v2 checks MAX_SESSION_BYTES for the first two compaction attempts but returns the third payload unconditionally. Large traces (especially with many steps and wringer metadata) can still exceed the configured session budget and then be written to session[:dsl_trace], risking cookie/session overflow failures during refresh. Re-check size after the final fallback and drop or aggressively trim if still too large.
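
The fix amounts to applying the same size check to every compaction attempt, including the last. A sketch under stated assumptions: the budget value, the limits, and the helper names below are illustrative and do not mirror the real `build_session_v2` signature.

```ruby
require "json"

MAX_SESSION_BYTES = 3_000 # illustrative budget, not the real constant's value

# Truncate each step's code and output to at most `limit` characters.
def compact_trace(steps, limit)
  steps.map { |s| { code: s[:code].to_s[0, limit], out: s[:out].to_s[0, limit] } }
end

# Try progressively tighter limits; every attempt is size-checked.
# Returns nil when even the tightest payload exceeds the budget.
def trace_for_session(steps)
  [200, 80, 20].each do |limit|
    payload = compact_trace(steps, limit)
    return payload if JSON.generate(payload).bytesize <= MAX_SESSION_BYTES
  end
  nil
end

small = trace_for_session([{ code: "xpath('//h1')", out: "Koros" }])
huge = trace_for_session(Array.new(1_000) { { code: "x" * 500, out: "y" * 500 } })
```

Returning nil lets the caller skip writing `session[:dsl_trace]` entirely rather than risk an oversized session.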

Comment on lines +275 to +276
s: event[:wringer][:signals],
h: event[:wringer][:hints],

P2: Truncate wringer payload fields before session compaction

The v2 session formatter copies event[:wringer][:signals] and event[:wringer][:hints] directly into each compacted step without any size limits. Those diagnostics can be large nested payloads, so traces with only a few steps can still blow up session size and trigger cookie overflow despite code/output truncation. Summarize or cap these fields before assigning them into step_payload[:w].
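
Capping could look like the sketch below, which measures each field by its JSON size and swaps oversized values for a short summary. `cap_field`, the 200-byte default, and the summary keys are assumptions chosen for illustration.

```ruby
require "json"

# Pass small values through untouched; replace large ones with a
# bounded summary that records the original size and a preview.
def cap_field(value, max_bytes: 200)
  json = JSON.generate(value)
  return value if json.bytesize <= max_bytes

  {
    truncated: true,
    bytes: json.bytesize,
    # byteslice may cut a multi-byte character; scrub removes the remnant.
    preview: json.byteslice(0, max_bytes).scrub("")
  }
end

small_signals = cap_field({ "empty_body" => false })
large_hints = cap_field({ "hints" => ["x" * 1_000] })
```

Applied to `event[:wringer][:signals]` and `event[:wringer][:hints]` before they are copied into the step payload, this bounds the per-step contribution regardless of how deeply nested the diagnostics are.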

Comment thread app/javascript/operator_context_card.js Outdated

card.dataset.operatorContextInitialized = "true";

const storedCardOpen = window.localStorage.getItem(CARD_KEY);

P3: Guard localStorage reads in operator context bootstrap

This direct localStorage read is outside any safety wrapper, so in environments where storage access is blocked (for example strict privacy settings or embedded contexts) it can raise SecurityError and abort the initializer. When that happens the operator context behavior breaks and can prevent later JS in the same load path from running. Use the same guarded pattern as readDomainState for all storage reads/writes.

Labels

enhancement New feature or request

3 participants