Fix PDF relabel rewrite for nested fields #162

Merged
nonprofittechy merged 1 commit into main from fix/pdf-relabel-nested-fields on Mar 19, 2026

@nonprofittechy
Member

Summary

  • replace parse_form(..., rewrite=True)'s top-level field-index rewrite with an explicit old-to-new mapping
  • apply renames through rename_pdf_fields(...) so nested AcroForm trees are updated correctly
  • add a regression test covering rewrite mode with LLM-generated names
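
A minimal sketch of why an explicit old-to-new mapping reaches nested fields where a top-level index rewrite does not. It assumes the field tree is modelled as plain dictionaries, with the partial name under "T" and children under "Kids" (the keys AcroForm field dictionaries use), and that a fully qualified name is the partial names joined by periods. `iter_terminal_fields` and `rename_fields_sketch` are hypothetical names for illustration, not FormFyxer's API; the real rename_pdf_fields(...) may work differently.

```python
from typing import Any, Dict, Iterator, List, Tuple

Field = Dict[str, Any]  # hypothetical stand-in for an AcroForm field dictionary


def iter_terminal_fields(
    fields: List[Field], prefix: str = ""
) -> Iterator[Tuple[str, Field]]:
    """Yield (fully_qualified_name, node) for every terminal field, depth-first."""
    for node in fields:
        name = f"{prefix}.{node['T']}" if prefix else node["T"]
        kids = node.get("Kids")
        if kids:
            # Non-terminal field: descend instead of treating it as a leaf.
            yield from iter_terminal_fields(kids, name)
        else:
            yield name, node


def rename_fields_sketch(fields: List[Field], mapping: Dict[str, str]) -> int:
    """Rename terminal fields whose full name is a key in `mapping`.

    Returns the number of fields renamed. Assumption: renaming means
    replacing the leaf's partial name ("T") with the mapped value.
    """
    renamed = 0
    for full_name, node in list(iter_terminal_fields(fields)):
        if full_name in mapping:
            node["T"] = mapping[full_name]
            renamed += 1
    return renamed


# A form with one nested group and one top-level field.
form = [
    {"T": "page1", "Kids": [{"T": "Text1"}, {"T": "Text2"}]},
    {"T": "Signature"},
]
mapping = {"page1.Text1": "users1_name_first", "Signature": "users1_signature"}
renamed = rename_fields_sketch(form, mapping)
```

Keying the mapping by full name means a field keeps its identity regardless of its position in the tree, which an index into the top-level field list cannot guarantee.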

Testing

  • pytest -q /tmp/FormFyxer-pr/formfyxer/tests/test_lit_explorer_pdf_labeling.py

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment
💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b70d17278

Contributor

Copilot AI left a comment

Pull request overview

Updates parse_form(..., rewrite=True) to rename PDF fields using an explicit old→new mapping and the shared rename_pdf_fields(...) helper, ensuring nested AcroForm field trees are updated correctly.

Changes:

  • Replace index-based AcroForm field rewriting with an old-to-new rename mapping.
  • Apply renames via rename_pdf_fields(...) (nested-field aware) during rewrite mode.
  • Add a regression test asserting rewrite mode calls the rename helper with stripped (no *) target names.
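
The regression test described in the last bullet could look roughly like the sketch below. It is not the test added in this PR: `build_rename_mapping` and `rewrite_sketch` are hypothetical helpers, the `(in_path, out_path, mapping)` call shape of the rename helper is an assumption, and the `*` marker is assumed to be a leading prefix on suggested names.

```python
from typing import Callable, Dict, List
from unittest.mock import Mock


def build_rename_mapping(
    old_names: List[str], suggested_names: List[str]
) -> Dict[str, str]:
    """Pair each existing field name with its suggested name, minus any leading '*'."""
    return {old: new.lstrip("*") for old, new in zip(old_names, suggested_names)}


def rewrite_sketch(
    pdf_path: str,
    old_names: List[str],
    suggested_names: List[str],
    rename_helper: Callable[[str, str, Dict[str, str]], None],
) -> Dict[str, str]:
    """Build the mapping and delegate the actual rename to `rename_helper`."""
    mapping = build_rename_mapping(old_names, suggested_names)
    # Assumed call shape: rename in place by passing the same path twice.
    rename_helper(pdf_path, pdf_path, mapping)
    return mapping


# The mock stands in for the real rename helper so no PDF is needed.
rename_helper = Mock()
mapping = rewrite_sketch(
    "form.pdf",
    ["page1.Text1", "Signature"],
    ["*users1_name_first", "users1_signature"],
    rename_helper,
)
rename_helper.assert_called_once_with(
    "form.pdf",
    "form.pdf",
    {"page1.Text1": "users1_name_first", "Signature": "users1_signature"},
)
```

Asserting on the arguments passed to the helper keeps the test independent of any particular PDF library while still pinning down the mapping that rewrite mode produces.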

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| formfyxer/lit_explorer.py | Switch rewrite implementation to build a rename mapping and delegate updates to rename_pdf_fields(...). |
| formfyxer/tests/test_lit_explorer_pdf_labeling.py | Add regression test verifying rewrite mode uses rename_pdf_fields(...) with the expected mapping for nested fields. |

@nonprofittechy force-pushed the fix/pdf-relabel-nested-fields branch from 5b70d17 to 680eb8b on March 19, 2026 19:02
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 680eb8b5a1


Comment on lines +147 to +149
    if len(flattened_fields) != len(field_names):
        raise ValueError(
            "PDF field traversal count did not match parsed field-name count"
P2: Avoid aborting rewrites when AcroForm contains filtered widgets

In parse_form(), field_names comes from get_existing_pdf_fields(), which skips template widgets and fields whose page cannot be resolved (formfyxer/pdf_wrangling.py:536-552). _rewrite_pdf_fields_in_place() rebuilds flattened_fields directly from _unnest_pdf_fields() without applying those filters, then raises on any count mismatch. On PDFs that contain one of those skipped widgets, rewrite=True now aborts the entire rename pass even though the visible fields were parsed successfully; this helper needs to use the same filtered field set instead of treating the extra entries as fatal.
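
One way to read the suggestion above, sketched under stated assumptions: apply a single shared predicate to the flattened fields before comparing counts, so skipped widgets are not counted as a mismatch. The dictionary keys `is_template` and `page`, and the helper names `is_kept` and `pair_with_parsed_names`, are hypothetical; the actual filter lives in get_existing_pdf_fields() and may check different things.

```python
from typing import Any, Dict, List, Tuple

Field = Dict[str, Any]  # hypothetical stand-in for a flattened PDF field


def is_kept(field: Field) -> bool:
    """Shared predicate: drop template widgets and fields with no resolvable page."""
    return not field.get("is_template", False) and field.get("page") is not None


def pair_with_parsed_names(
    flattened_fields: List[Field], field_names: List[str]
) -> List[Tuple[Field, str]]:
    """Pair each kept field with its parsed name, filtering before the count check."""
    kept = [f for f in flattened_fields if is_kept(f)]
    if len(kept) != len(field_names):
        # Only a mismatch that survives filtering is treated as fatal.
        raise ValueError(
            "PDF field traversal count did not match parsed field-name count"
        )
    return list(zip(kept, field_names))


flattened = [
    {"name": "page1.Text1", "page": 0},
    {"name": "template_widget", "page": 0, "is_template": True},
    {"name": "Signature", "page": 1},
    {"name": "orphan", "page": None},
]
pairs = pair_with_parsed_names(flattened, ["users1_name_first", "users1_signature"])
```

Without the filter, the four flattened entries would be compared against two parsed names and the rewrite would abort, which is the failure the review comment describes.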


@nonprofittechy
Member Author

Tested; confirmed this fixes the issue with labeling
arizona_dopa-with-fields.pdf

@nonprofittechy nonprofittechy merged commit a019f21 into main Mar 19, 2026
3 checks passed