Fix PDF relabel rewrite for nested fields #162
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b70d17278
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Pull request overview
Updates parse_form(..., rewrite=True) to rename PDF fields using an explicit old→new mapping and the shared rename_pdf_fields(...) helper, ensuring nested AcroForm field trees are updated correctly.
Changes:
- Replace index-based AcroForm field rewriting with an old-to-new rename mapping.
- Apply renames via rename_pdf_fields(...) (nested-field aware) during rewrite mode.
- Add a regression test asserting rewrite mode calls the rename helper with stripped (no *) target names.
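The change list above can be illustrated with a small, self-contained sketch. It is not the code from this PR: `build_rename_mapping` and `rename_nested` are hypothetical helpers, the dict-based field tree is a toy stand-in for a real AcroForm tree, and treating `*` as a leading marker on target names is an assumption.

```python
from typing import Dict, List


def build_rename_mapping(old_names: List[str], new_names: List[str]) -> Dict[str, str]:
    """Pair each existing name with its target, dropping a leading '*' marker (assumed)."""
    if len(old_names) != len(new_names):
        raise ValueError("old and new name lists must have the same length")
    mapping: Dict[str, str] = {}
    for old, new in zip(old_names, new_names):
        target = new.lstrip("*")
        if target != old:  # skip no-op renames
            mapping[old] = target
    return mapping


def rename_nested(fields: List[dict], mapping: Dict[str, str]) -> None:
    """Rename fields in place, descending into 'kids' so nested entries are reached."""
    for field in fields:
        name = field.get("name")
        if name in mapping:
            field["name"] = mapping[name]
        rename_nested(field.get("kids", []), mapping)


# Toy tree: one parent with two children, plus one top-level field.
tree = [
    {"name": "page1", "kids": [{"name": "Text1"}, {"name": "Text2"}]},
    {"name": "Sig"},
]
mapping = build_rename_mapping(
    ["Text1", "Text2", "Sig"],
    ["*users1_name", "users1_address", "Sig"],
)
rename_nested(tree, mapping)
```

Keying the renames on the old name, rather than on a position in a flat list, means a child field is still found even when the traversal order of the tree differs from the order of the parsed names.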
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| formfyxer/lit_explorer.py | Switch rewrite implementation to build a rename mapping and delegate updates to rename_pdf_fields(...). |
| formfyxer/tests/test_lit_explorer_pdf_labeling.py | Add regression test verifying rewrite mode uses rename_pdf_fields(...) with the expected mapping for nested fields. |
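A regression test of the kind described in the table might look roughly like the sketch below. It uses only `unittest.mock` from the standard library. `rewrite_fields` is a hypothetical stand-in for rewrite mode, and the helper's `(input_path, output_path, mapping)` argument order is an assumption made for illustration, not the confirmed signature of rename_pdf_fields(...).

```python
from unittest.mock import Mock


def rewrite_fields(pdf_path, old_names, new_names, rename_helper):
    """Toy stand-in for rewrite mode: build the mapping, then delegate to the helper."""
    mapping = {old: new.lstrip("*") for old, new in zip(old_names, new_names)}
    # Assumed call shape: rename in place by passing the same path twice.
    rename_helper(pdf_path, pdf_path, mapping)


def test_rewrite_delegates_with_stripped_names():
    helper = Mock()
    rewrite_fields("form.pdf", ["parent.child"], ["*users1_name"], helper)
    # The helper should be called exactly once, with the '*' removed from the target.
    helper.assert_called_once_with(
        "form.pdf", "form.pdf", {"parent.child": "users1_name"}
    )
    return helper


helper = test_rewrite_delegates_with_stripped_names()
```

Injecting the helper as a parameter keeps the sketch self-contained; in a real test module the same assertion would more likely be made by patching the helper where it is imported.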
5b70d17 to 680eb8b
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 680eb8b5a1
    if len(flattened_fields) != len(field_names):
        raise ValueError(
            "PDF field traversal count did not match parsed field-name count"
Avoid aborting rewrites when AcroForm contains filtered widgets
In parse_form(), field_names comes from get_existing_pdf_fields(), which skips template widgets and fields whose page cannot be resolved (formfyxer/pdf_wrangling.py:536-552). _rewrite_pdf_fields_in_place() rebuilds flattened_fields directly from _unnest_pdf_fields() without applying those filters, then raises on any count mismatch. On PDFs that contain one of those skipped widgets, rewrite=True now aborts the entire rename pass even though the visible fields were parsed successfully; this helper needs to use the same filtered field set instead of treating the extra entries as fatal.
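One way to read this suggestion is sketched below: apply the same filter to the traversed fields before pairing them with the parsed names, so skipped widgets no longer cause a count mismatch. This is only an illustration under stated assumptions. `is_parsed_field` and `pair_fields_with_names` are hypothetical names, the `is_template` and `page` keys are invented for the toy data, and the real filtering rules live in get_existing_pdf_fields().

```python
from typing import Dict, List


def is_parsed_field(field: dict) -> bool:
    """Hypothetical filter: skip template widgets and fields with no resolvable page."""
    return not field.get("is_template", False) and field.get("page") is not None


def pair_fields_with_names(
    flattened_fields: List[dict], field_names: List[str]
) -> Dict[str, str]:
    """Build an old-to-new mapping over the filtered field set only."""
    kept = [f for f in flattened_fields if is_parsed_field(f)]
    # A mismatch is only an error after both sides have been filtered the same way.
    if len(kept) != len(field_names):
        raise ValueError("filtered field count did not match parsed field-name count")
    return {f["name"]: new for f, new in zip(kept, field_names)}


flattened = [
    {"name": "Text1", "page": 0},
    {"name": "TemplateWidget", "page": 0, "is_template": True},  # skipped by the filter
    {"name": "Text2", "page": 1},
]
result = pair_fields_with_names(flattened, ["users1_name", "users1_address"])
```

With the unfiltered list, the three traversed entries would not match the two parsed names and the rename pass would abort; after filtering, the two visible fields pair up cleanly.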
Tested; confirmed this fixes the issue with labeling.
Summary
- Replace parse_form(..., rewrite=True)'s top-level field-index rewrite with an explicit old-to-new mapping
- Use rename_pdf_fields(...) so nested AcroForm trees are updated correctly

Testing
pytest -q /tmp/FormFyxer-pr/formfyxer/tests/test_lit_explorer_pdf_labeling.py