
Enforce PDF/A-3 for ZUGFeRD / Factur-X hybrid PDFs #262

@dafrose

Problem Statement

Hybrid Factur-X / ZUGFeRD invoices (EN 16931, EXTENDED, and BASIC when CII is embedded in the PDF) must be delivered in a PDF/A-3 container (ISO 19005-3), not in a regular PDF 1.x file with the XML attached. This requirement comes from the Factur-X / ZUGFeRD specifications themselves, so passing our XML Schematron checks alone does not satisfy it.

eu_einvoice already tries to convert print PDFs with Ghostscript before embedding XML (_convert_pdf_to_pdfa in attach_xml_to_pdf). Today that step is best-effort:

  • If gs is missing or conversion fails, embedding continues on a non–PDF/A PDF and only an Error Log is written.
  • There is no site setting to require PDF/A-3 or to block/warn on submit or download.
  • Operators must install Ghostscript and ICC profiles separately; README documents this, but compliance is not guaranteed.

Recipients and validators can therefore get a PDF that passes CII validation but is not a conformant hybrid Factur-X/ZUGFeRD file.
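To make the gap concrete: a PDF/A file declares its level in the XMP metadata through the PDF/A identification properties `pdfaid:part` and `pdfaid:conformance`. The following is a minimal sketch, not code from eu_einvoice, that reads that declaration from raw bytes. The function name `claimed_pdfa_level` is hypothetical, and the check is only a heuristic: it assumes the XMP packet is stored uncompressed, and it reports what the file claims, not whether it actually conforms (that would need a validator such as veraPDF).

```python
import re
from typing import Optional, Tuple

# XMP may carry the values as elements (<pdfaid:part>3</pdfaid:part>)
# or as attributes (pdfaid:part="3"); both forms are matched here.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([ABUabu])")


def claimed_pdfa_level(pdf_bytes: bytes) -> Optional[Tuple[int, Optional[str]]]:
    """Return (part, conformance) as declared in the XMP metadata, or None.

    Hypothetical helper: a byte-level heuristic, not a PDF parser.
    """
    part = _PART.search(pdf_bytes)
    if part is None:
        return None  # no PDF/A claim found
    conf = _CONF.search(pdf_bytes)
    level = conf.group(1).decode().upper() if conf else None
    return int(part.group(1)), level
```

A plain print PDF with attached XML would return `None` here even though its CII payload validates, which is exactly the situation described above.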

Suggested Solution

  1. Treat PDF/A-3 as a first-class requirement for hybrid profiles (EN 16931, EXTENDED, BASIC when invoice XML is embedded in the PDF). XRECHNUNG unchanged (no invoice XML in PDF; standalone .xml path only).

  2. E Invoice Settings — new option(s), aligned with existing error_action_on_save / error_action_on_submit style, e.g.:

    • Require PDF/A-3 for hybrid PDF (Check), default on for new sites (or document migration default).
    • When enabled and conversion is unavailable or fails: Error Message or Warning Message on the relevant action (submit PDF / download_pdf / hook path), instead of silently continuing.
    • Optional: PDF/A-3 flavour (e.g. 3b) fixed or selectable if we expose it.
  3. Single conversion step on plain print bytes only (before the factur-x / drafthorse embed), consistent with the planned multi-annex pipeline in #261 (Feature: Multi-select annexes on Sales Invoice (PDF/A-3 embed + CII 916 XML embed)). Do not run PDF/A conversion on PDFs that already contain embedded files.

  4. Converter abstraction: refactor _convert_pdf_to_pdfa behind a small interface used by attach_xml_to_pdf and a future before_attach_pdf handler:

    • Default / fallback: keep Ghostscript where gs is available (current behaviour; fastest on the test sample at ~0.09 s).
    • Optional: pdftopdfa (available on GitHub and PyPI) as a pip dependency. It needs no system gs and is MPL-2.0 licensed; it is slower than Ghostscript but within reason (~0.50 s on the test sample). Setting: PDF/A converter = Ghostscript | pdftopdfa | Auto (try GS, else pdftopdfa if installed).
  5. Docs & ops: README/install notes — when enforcement is on, at least one converter must be available; document Python version constraint for pdftopdfa (3.12+). Optional dev/CI check with veraPDF (out of scope for v1 if too heavy).
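Points 2 and 4 could fit together roughly as sketched below. This is an illustration under assumptions, not the existing implementation: `pick_converter`, `convert_for_embedding`, and `PdfaConversionError` are hypothetical names, the converters are injected as plain callables (so no Ghostscript or pdftopdfa API is assumed), and the setting values simply mirror the ones proposed above.

```python
from typing import Callable, Dict, List

# A converter takes plain print PDF bytes and returns PDF/A-3 bytes.
Converter = Callable[[bytes], bytes]


class PdfaConversionError(Exception):
    """Raised when PDF/A-3 is required but cannot be produced."""


# Preference order for each value of the proposed "PDF/A converter" setting.
_ORDER: Dict[str, List[str]] = {
    "Ghostscript": ["Ghostscript"],
    "pdftopdfa": ["pdftopdfa"],
    "Auto": ["Ghostscript", "pdftopdfa"],
}


def pick_converter(setting: str, available: Dict[str, Converter]) -> Converter:
    """Return the first available converter for the given setting."""
    for name in _ORDER[setting]:
        if name in available:
            return available[name]
    raise PdfaConversionError(f"No PDF/A converter available for setting {setting!r}")


def convert_for_embedding(
    pdf_bytes: bytes,
    *,
    setting: str,
    available: Dict[str, Converter],
    require_pdfa: bool,
    error_action: str,
    warn: Callable[[str], None] = print,
) -> bytes:
    """Convert plain print bytes once, before any XML is embedded."""
    try:
        return pick_converter(setting, available)(pdf_bytes)
    except Exception as exc:
        if not require_pdfa:
            # Enforcement off: keep today's best-effort behaviour
            # (the real code would also write an Error Log here).
            return pdf_bytes
        if error_action == "Error Message":
            raise PdfaConversionError(f"PDF/A-3 conversion failed: {exc}") from exc
        # "Warning Message": surface the problem, then continue.
        warn(f"PDF/A-3 conversion failed, continuing with a regular PDF: {exc}")
        return pdf_bytes
```

Injecting the `available` mapping keeps the selection logic testable without either converter installed; the caller would populate it based on whether `gs` is on the PATH and whether pdftopdfa can be imported.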

Acceptance Criteria

  • Hybrid-profile PDF output is PDF/A-3 when enforcement is on; failure surfaces per settings (no silent non–PDF/A hybrid).
  • Setting documented on E Invoice Settings; respects existing error-action patterns where applicable.
  • GS and/or pdftopdfa selectable; Auto documented.
  • XRECHNUNG behaviour unchanged.
  • Tests: enforcement on + converter missing → error/warn; successful conversion path (mock or fixture).
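For the "converter missing" test case, one option is to patch `shutil.which` so the test does not depend on whether `gs` is installed on the CI host. The sketch below is self-contained and uses hypothetical stand-ins (`ghostscript_available`, `ensure_converter`) in place of the real eu_einvoice functions, whose final names and signatures are still open.

```python
import shutil
import unittest
from unittest import mock


def ghostscript_available() -> bool:
    """Hypothetical helper: True when a `gs` executable is on the PATH."""
    return shutil.which("gs") is not None


def ensure_converter(require_pdfa: bool) -> bool:
    """Hypothetical stand-in for the enforcement check before embedding."""
    if ghostscript_available():
        return True
    if require_pdfa:
        raise RuntimeError("PDF/A-3 is required but Ghostscript (gs) was not found")
    return False


class TestEnforcement(unittest.TestCase):
    def test_missing_gs_raises_when_required(self):
        # Simulate a host without Ghostscript.
        with mock.patch("shutil.which", return_value=None):
            with self.assertRaises(RuntimeError):
                ensure_converter(require_pdfa=True)

    def test_missing_gs_tolerated_when_not_required(self):
        with mock.patch("shutil.which", return_value=None):
            self.assertFalse(ensure_converter(require_pdfa=False))

    def test_present_gs_passes(self):
        # Simulate a host where `gs` resolves to some path.
        with mock.patch("shutil.which", return_value="/usr/bin/gs"):
            self.assertTrue(ensure_converter(require_pdfa=True))
```

Saved as a test module, this runs with `python -m unittest`. The successful-conversion path would still need a mocked converter or a small PDF fixture, as noted in the criteria.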

Alternatives

| Alternative | Notes |
|---|---|
| Keep best-effort GS only | Status quo; no compliance guarantee |
| Always require GS, never pip dep | Simplest ops story for GS-only shops; fails on hosts without gs |
| pdftopdfa only, drop Ghostscript | Simpler install (pip install pdftopdfa); new dependency, younger project, slower on test sample |
| Validate with veraPDF in production | Strongest guarantee; heavy (veraPDF CLI, CI cost); defer or make optional |
| Enforce only in tests/docs | No protection for real submit/download |
