Skip to content

fix: March 2026 integration test fixes — 4 councils repaired + diagnostics#1899

Open
robbrad wants to merge 3 commits intomasterfrom
march-2026-release
Open

fix: March 2026 integration test fixes — 4 councils repaired + diagnostics#1899
robbrad wants to merge 3 commits intomasterfrom
march-2026-release

Conversation

@robbrad
Copy link
Copy Markdown
Owner

@robbrad robbrad commented Mar 28, 2026

Summary

Ran full integration test suite (334 councils) and fixed 4 broken scrapers. Also improved the test results map and diagnostics tooling.

Fixes (4 councils, all tested and passing)

  • BaberghDistrictCouncil — date parsing crashed on multi-date strings (e.g. Fri 27 Mar 2026, Mon 27 Apr 2026). Now splits on comma and parses each date.
  • MidSuffolkDistrictCouncil — same date parsing bug (shared pattern with Babergh).
  • RotherDistrictCouncil — crashed when API returned "No data found" instead of a date. Now skips entries with missing/invalid dates.
  • WirralCouncilinput.json used paon key but test harness expects house_number. Renamed the key.

Tooling Improvements

  • generate_map_test_results.py — added --detailed flag that outputs error categories and summaries per council.
  • map.html — now loads detailed test results JSON, shows module name, error category, and error summary in popups for failed councils.

Remaining 46 Failures (diagnosed, documented in integration_test_results.md)

  • 26 upstream issues (sites down, blocking requests, Selenium timeouts)
  • 7 stale test data (UPRNs returning no data)
  • 8 site redesigns needing scraper rewrites
  • 5 need deeper investigation

Test Results

  • Before: 284/334 passing (85%)
  • After: 288/334 passing (86.2%)

Summary by CodeRabbit

  • Documentation

    • Added integration test report documenting test run results, fixes applied, and categorized failure root causes.
  • Bug Fixes

    • Fixed date parsing robustness for several councils to handle multiple collection dates and missing data gracefully.
    • Corrected test input parameter naming.
  • New Features

    • Enhanced test results map with detailed failure categorization and error summaries.
    • Improved fallback loading for test result data.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 28, 2026

📝 Walkthrough

Walkthrough

This PR documents integration test results from 2026-03-28 and implements corresponding infrastructure changes. It updates test result visualization to display detailed error categorization, fixes test data for one council, and improves three council scrapers to handle multiple collection dates and missing data robustly.

Changes

Cohort / File(s) Summary
Test Results & Documentation
integration_test_results.md
New integration test report documenting 4 applied fixes, 46 remaining failures categorized by root cause (upstream issues, test data problems, scraper rewrites needed, code issues), and passing council counts.
Test Infrastructure
uk_bin_collection/tests/generate_map_test_results.py, uk_bin_collection/map.html
Added error categorization function and updated test result parsing to support detailed output format with error categories, summaries, and durations. Map visualization now fetches and displays detailed test results with enhanced popup information including error details and module names.
Test Data Correction
uk_bin_collection/tests/input.json
Updated WirralCouncil test input: renamed parameter from paon to house_number.
Council Parser Enhancements
uk_bin_collection/uk_bin_collection/councils/BaberghDistrictCouncil.py, MidSuffolkDistrictCouncil.py, RotherDistrictCouncil.py
Enhanced date parsing to handle comma-separated collection dates, emitting multiple bin entries per type. Added robustness for missing or placeholder date values with early-skip logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • dp247

Poem

🐰 Hops through test results with glee,
Parsing dates—now two or three!
Missing data? Skip with care,
Scraper fixes everywhere!
Maps now show what failed and why,

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main change: four councils repaired as part of the March 2026 integration test fixes, with improved diagnostics. It is concise, specific, and clearly summarizes the primary contribution.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch march-2026-release

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (ab36d75) to head (d904a91).
⚠️ Report is 2 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1899   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@uk_bin_collection/map.html`:
- Around line 105-128: The popup content currently injects unescaped fields
(e.g., tr.error_summary, tr.error_category, name, wiki.module, wiki.wiki_name)
into the HTML passed to layer.bindPopup, allowing arbitrary HTML injection; add
an HTML-escaping helper (e.g., htmlEscape) and apply it to all interpolated
user-sourced values used when building statusText/extra and the final popup
string (references: tr.error_summary, tr.error_category, name, wiki.module,
wiki.wiki_name, and the layer.bindPopup call) so that bindPopup receives only
escaped text; alternatively construct the popup via a DOM element and use
setContent with text nodes to avoid HTML interpretation.

In `@uk_bin_collection/uk_bin_collection/councils/RotherDistrictCouncil.py`:
- Around line 68-69: The current check in RotherDistrictCouncil (inside the
method that iterates collection rows) uses "if not date or 'no data' in
date.lower(): continue", which silently skips empty/missing dates and can mask
page-structure changes; change this so only the explicit upstream sentinel ("no
data" case) is ignored, while truly missing or empty dates cause an explicit
failure: keep the "'no data' in date.lower()" branch to continue, but replace
the "not date" branch with an explicit error/exception (or processLogger.error +
raise) so the scraper fails fast when date is missing; update the relevant
method in class RotherDistrictCouncil where date is parsed (the loop referencing
the date variable) to implement this behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0ebf340b-ef03-4d79-98a4-e3c28811cd3f

📥 Commits

Reviewing files that changed from the base of the PR and between 60bd3cc and d904a91.

📒 Files selected for processing (7)
  • integration_test_results.md
  • uk_bin_collection/map.html
  • uk_bin_collection/tests/generate_map_test_results.py
  • uk_bin_collection/tests/input.json
  • uk_bin_collection/uk_bin_collection/councils/BaberghDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/MidSuffolkDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/RotherDistrictCouncil.py

Comment on lines +105 to +128
let statusText, extra = '';
if (status === 'pass') {
statusText = '✅ Covered (Test Passed)';
} else if (status === 'fail') {
statusText = '🟠 Covered (Test Failed)';
if (tr?.error_category) {
const cat = tr.error_category.replace(/_/g, ' ');
extra = `<br>Error: <em>${cat}</em>`;
}
if (tr?.error_summary) {
const summary = tr.error_summary.length > 100
? tr.error_summary.substring(0, 100) + '…'
: tr.error_summary;
extra += `<br><small>${summary}</small>`;
}
} else {
statusText = '✅ Covered (No test result)';
}
layer.bindPopup(
`<strong>${name}</strong><br>` +
`Module: <code>${wiki.module || ''}</code><br>` +
`Status: ${statusText}${extra}<br>` +
`<a href="${wiki.url}" target="_blank">📘 ${wiki.wiki_name}</a>`
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Escape popup content before building HTML.

tr.error_summary is sourced from JUnit failure text. Injecting it directly into bindPopup() turns the popup into an HTML sink, so scraper error messages can render arbitrary markup in the map artifact.

🛡️ Proposed fix
+    const escapeHtml = str => String(str)
+      .replace(/&/g, '&amp;')
+      .replace(/</g, '&lt;')
+      .replace(/>/g, '&gt;')
+      .replace(/"/g, '&quot;')
+      .replace(/'/g, '&#39;');
+
...
-                    extra = `<br>Error: <em>${cat}</em>`;
+                    extra = `<br>Error: <em>${escapeHtml(cat)}</em>`;
...
-                    extra += `<br><small>${summary}</small>`;
+                    extra += `<br><small>${escapeHtml(summary)}</small>`;
...
-                  `<strong>${name}</strong><br>` +
-                  `Module: <code>${wiki.module || ''}</code><br>` +
+                  `<strong>${escapeHtml(name)}</strong><br>` +
+                  `Module: <code>${escapeHtml(wiki.module || '')}</code><br>` +
                   `Status: ${statusText}${extra}<br>` +
-                  `<a href="${wiki.url}" target="_blank">📘 ${wiki.wiki_name}</a>`
+                  `<a href="${wiki.url}" target="_blank" rel="noopener noreferrer">📘 ${escapeHtml(wiki.wiki_name)}</a>`
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let statusText, extra = '';
if (status === 'pass') {
statusText = '✅ Covered (Test Passed)';
} else if (status === 'fail') {
statusText = '🟠 Covered (Test Failed)';
if (tr?.error_category) {
const cat = tr.error_category.replace(/_/g, ' ');
extra = `<br>Error: <em>${cat}</em>`;
}
if (tr?.error_summary) {
const summary = tr.error_summary.length > 100
? tr.error_summary.substring(0, 100) + '…'
: tr.error_summary;
extra += `<br><small>${summary}</small>`;
}
} else {
statusText = '✅ Covered (No test result)';
}
layer.bindPopup(
`<strong>${name}</strong><br>` +
`Module: <code>${wiki.module || ''}</code><br>` +
`Status: ${statusText}${extra}<br>` +
`<a href="${wiki.url}" target="_blank">📘 ${wiki.wiki_name}</a>`
);
const escapeHtml = str => String(str)
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;');
let statusText, extra = '';
if (status === 'pass') {
statusText = '✅ Covered (Test Passed)';
} else if (status === 'fail') {
statusText = '🟠 Covered (Test Failed)';
if (tr?.error_category) {
const cat = tr.error_category.replace(/_/g, ' ');
extra = `<br>Error: <em>${escapeHtml(cat)}</em>`;
}
if (tr?.error_summary) {
const summary = tr.error_summary.length > 100
? tr.error_summary.substring(0, 100) + '…'
: tr.error_summary;
extra += `<br><small>${escapeHtml(summary)}</small>`;
}
} else {
statusText = '✅ Covered (No test result)';
}
layer.bindPopup(
`<strong>${escapeHtml(name)}</strong><br>` +
`Module: <code>${escapeHtml(wiki.module || '')}</code><br>` +
`Status: ${statusText}${extra}<br>` +
`<a href="${wiki.url}" target="_blank" rel="noopener noreferrer">📘 ${escapeHtml(wiki.wiki_name)}</a>`
);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/map.html` around lines 105 - 128, The popup content
currently injects unescaped fields (e.g., tr.error_summary, tr.error_category,
name, wiki.module, wiki.wiki_name) into the HTML passed to layer.bindPopup,
allowing arbitrary HTML injection; add an HTML-escaping helper (e.g.,
htmlEscape) and apply it to all interpolated user-sourced values used when
building statusText/extra and the final popup string (references:
tr.error_summary, tr.error_category, name, wiki.module, wiki.wiki_name, and the
layer.bindPopup call) so that bindPopup receives only escaped text;
alternatively construct the popup via a DOM element and use setContent with text
nodes to avoid HTML interpretation.

Comment on lines +68 to +69
if not date or "no data" in date.lower():
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't swallow missing dates here.

"No data found" is a valid upstream sentinel, but not date also covers missing spans / empty text when the page shape changes. Silently continue-ing there can return a partial schedule instead of failing the scraper.

🔧 Proposed fix
-                if not date or "no data" in date.lower():
-                    continue
+                if not date:
+                    raise ValueError(
+                        f"Missing collection date for {bin_type!r} at UPRN {user_uprn}"
+                    )
+                if "no data" in date.lower():
+                    continue

Based on learnings: when parsing council bin collection data, prefer explicit failures over silent defaults so format changes are detected early.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/RotherDistrictCouncil.py` around
lines 68 - 69, The current check in RotherDistrictCouncil (inside the method
that iterates collection rows) uses "if not date or 'no data' in date.lower():
continue", which silently skips empty/missing dates and can mask page-structure
changes; change this so only the explicit upstream sentinel ("no data" case) is
ignored, while truly missing or empty dates cause an explicit failure: keep the
"'no data' in date.lower()" branch to continue, but replace the "not date"
branch with an explicit error/exception (or processLogger.error + raise) so the
scraper fails fast when date is missing; update the relevant method in class
RotherDistrictCouncil where date is parsed (the loop referencing the date
variable) to implement this behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant