
fix: duplicate cover image on imported posts#13

Merged: nycterent merged 1 commit into main from fix/strip-cover-after-sanitize on Jun 19, 2026



@nycterent


QA of /archive/nieko-nedarau-o-turiu-ko-noriu/ found the cover image rendered twice (template cover + a copy at the top of the body).

Cause: imported posts are email-table HTML — the cover <img> is wrapped in <tr id="content-blocks">, so the leading-cover strip (which expected a leading <figure>/<img>) missed it; the sanitizer then unwrapped the <tr>, leaving a duplicate.

Fix: sanitize first, then strip the body's copy of the cover by matching the cover URL anywhere (not just position 0). render.article(body, cover) now takes the cover; post_page passes post['image']. Verified 0 duplicate covers across all 23 posts; regression test added. 20 tests green. Squash-merge.
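A minimal sketch of why the order matters, not the code in this PR. The cover URL, the sample markup, `toy_sanitize`, and both pipeline functions are hypothetical stand-ins; the real sanitizer and strip logic live in render.py. It matches the full cover URL, following the wording of the description.

```python
import re

COVER = "https://example.com/media/cover-photo.jpg"  # hypothetical cover URL

# Hypothetical imported body: the cover <img> sits inside email-table markup.
IMPORTED = (
    '<table><tr id="content-blocks"><td>'
    f'<img src="{COVER}" alt="">'
    "</td></tr></table><p>Body text.</p>"
)

# Old idea: only strip a cover that is the very first node of the body.
LEADING_COVER = re.compile(
    r"^\s*(?:<figure>\s*)?<img\b[^>]*>(?:\s*</figure>)?", re.IGNORECASE
)


def toy_sanitize(html):
    """Stand-in sanitizer (assumption): unwrap table tags, keep their contents."""
    return re.sub(r"</?(?:table|tbody|tr|td)\b[^>]*>", "", html, flags=re.IGNORECASE)


def strip_leading_then_sanitize(html):
    """Old order: the body starts with <table>, so the leading strip finds nothing."""
    return toy_sanitize(LEADING_COVER.sub("", html, count=1))


def sanitize_then_strip_by_url(html, cover):
    """New order: unwrap first, then drop the first <img> that mentions the cover."""
    clean = toy_sanitize(html)
    img = re.compile(r"<img\b[^>]*" + re.escape(cover) + r"[^>]*>", re.IGNORECASE)
    return img.sub("", clean, count=1)


old = strip_leading_then_sanitize(IMPORTED)
new = sanitize_then_strip_by_url(IMPORTED, COVER)
```

With this input, `old` still contains the cover image (it would render a second time under the template cover), while `new` is just the paragraph.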

Imported (email-HTML) posts wrap the cover in <tr> table markup, so the
leading-cover strip missed it and the cover showed twice (template +
body). Now sanitize first, then strip the body's copy of the cover by
matching the cover URL anywhere (not just the leading node). article()
takes the cover URL; post_page passes it. Verified 0 duplicate covers
across all 23 posts + regression test.
Copilot AI review requested due to automatic review settings June 19, 2026 21:52
@nycterent nycterent merged commit 9107f34 into main Jun 19, 2026
2 checks passed
@nycterent nycterent deleted the fix/strip-cover-after-sanitize branch June 19, 2026 21:54

Copilot AI left a comment


Pull request overview

Fixes duplicate cover images in archived posts (notably imported email-table HTML) by adjusting the render pipeline so the body copy of the cover is removed after sanitization, and by threading the cover URL into the article renderer.

Changes:

  • Update render.article() to accept a cover URL, sanitize first, then remove a duplicate cover image from the body.
  • Pass post["image"] into render.article() from the post page template.
  • Add a regression test and regenerate affected archive HTML pages to remove the duplicated cover image.
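The changes listed above could be exercised roughly as follows. This is a sketch under assumptions, not the repository's code: `_sanitize`, the page markup produced by `post_page`, and the test name are hypothetical, and only the `article(body, cover)` shape and the `post["image"]` key come from the PR text.

```python
import re


def _sanitize(html):
    """Hypothetical stand-in for the real sanitizer: unwrap table markup."""
    return re.sub(r"</?(?:table|tbody|tr|td)\b[^>]*>", "", html, flags=re.IGNORECASE)


def article(body, cover=None):
    """Sanitize first, then remove the body's copy of the cover (if any)."""
    html = _sanitize(body)
    if cover:
        name = re.escape(cover.rsplit("/", 1)[-1])
        if name:  # skip URLs that end in "/" and so have no file name
            html = re.sub(
                r"<img\b[^>]*" + name + r"[^>]*>", "", html, count=1, flags=re.IGNORECASE
            )
    return html


def post_page(post):
    """Render the template cover once and pass the same URL down to article()."""
    cover = post.get("image")
    head = f'<img class="cover" src="{cover}">' if cover else ""
    return head + "<article>" + article(post["body"], cover) + "</article>"


def test_imported_cover_rendered_once():
    post = {
        "image": "https://example.com/img/cover.jpg",
        "body": (
            '<table><tr id="content-blocks"><td>'
            '<img src="https://example.com/img/cover.jpg">'
            "</td></tr></table><p>Hi</p>"
        ),
    }
    page = post_page(post)
    assert page.count("cover.jpg") == 1
    assert "<p>Hi</p>" in page
```

Passing the cover explicitly keeps `article()` free of guesses about which image is the cover; a post without an `image` key goes through sanitization only.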

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_render.py | Adds regression test for duplicate cover removal in imported email-table HTML. |
| render.py | Adds _strip_cover() and updates article() pipeline to sanitize before stripping duplicate cover. |
| pages.py | Passes post cover URL into render.article() so duplicate stripping can be accurate. |
| archive/sauga/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/rand-paradoksas/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/nieko-nedarau-o-turiu-ko-noriu/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/knygskaitys/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/ivairios-suvestines/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/darbuotoju-atranka/index.html | Regenerated output: removes duplicated cover image from body. |
| archive/darbo-imitacijos-rinka/index.html | Regenerated output reflecting updated sanitizer/strip order (no duplicate cover). |
| archive/dalinuosi-dar-vieno-analitiko-mintimis/index.html | Regenerated output: removes duplicated cover image from body. |


Comment thread render.py
Comment on lines +94 to +96

```python
name = re.escape(cover.rsplit("/", 1)[-1])
img = re.compile(r"<img\b[^>]*" + name + r"[^>]*>", re.IGNORECASE)
new = img.sub("", html, count=1)
```
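For readers following along, the three quoted lines can be wrapped into a standalone function as sketched here. The function name and the two guards are additions for illustration (assumptions, not part of the diff); the regex itself is the one quoted above.

```python
import re


def strip_cover_by_filename(html, cover):
    """Remove the first <img> tag whose text contains the cover's file name."""
    if not cover:
        return html
    # Last path segment of the cover URL, escaped for use inside a regex.
    name = re.escape(cover.rsplit("/", 1)[-1])
    if not name:
        # A URL ending in "/" has an empty file name; without this guard the
        # pattern would match (and remove) the first <img> of any kind.
        return html
    img = re.compile(r"<img\b[^>]*" + name + r"[^>]*>", re.IGNORECASE)
    return img.sub("", html, count=1)
```

Matching on the last path segment rather than the whole URL means the body copy is found even when it is served from a different host or path. The trade-off is that an unrelated image sharing that file name would match too, and `count=1` limits the removal to the first hit.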
Comment thread render.py
Comment on lines +100 to +101

```python
new = re.sub(r"<p>\s*</p>", "", new, count=1)
new = re.sub(r"<figure>\s*(?:<figcaption\b[^>]*>.*?</figcaption>\s*)?</figure>", "", new, count=1, flags=re.S)
```
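The cleanup step in these two lines can be tried in isolation with a sketch like this. The function name is hypothetical; the two patterns are copied from the quoted lines.

```python
import re


def drop_empty_wrappers(html):
    """Remove one empty <p> and one empty or caption-only <figure>."""
    # An empty paragraph left behind once its <img> has been removed.
    new = re.sub(r"<p>\s*</p>", "", html, count=1)
    # A <figure> that now holds nothing, or only a <figcaption>.
    new = re.sub(
        r"<figure>\s*(?:<figcaption\b[^>]*>.*?</figcaption>\s*)?</figure>",
        "",
        new,
        count=1,
        flags=re.S,
    )
    return new
```

Two properties of these patterns may be worth keeping in mind. They match only bare `<p>` and `<figure>` tags, so a wrapper that still carries attributes after sanitization is left alone. And because `.*?` under `re.S` can run past a `</figcaption>`, a captioned figure that still has content, followed later by a caption-only figure, is matched as a single span and removed together with everything in between.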