feat: Implement robust report structure management class #401
Open
Calebnzm wants to merge 1 commit into fireform-core:main from
Summary
This PR introduces a Report Schema management system — a new abstraction layer that sits above individual form templates. It allows users to define a canonical schema that aggregates and unifies fields from multiple PDF templates into a single, structured report definition, enabling consistent data extraction and output generation across varied document formats.
Motivation
Previously, templates were managed in isolation. There was no way to:
This PR introduces all of that as a first-class, database-backed system.
Workflow
1. Create a Report Schema: A schema has a `name`, `description`, and `use_case`. It represents the logical structure of a report type (e.g. "Accident Report" or "End-of-Month Financial Summary").
2. Attach Templates: One or more PDF templates are linked via `add_template_to_schema`. This automatically creates a `SchemaField` stub for every PDF field in the template, pre-populated with the raw field name and source template reference.
3. Configure Field Metadata: Each `SchemaField` is configured via `update_schema_field`. Users set the `description` (to guide the LLM), `data_type`, `word_limit`, `required` flag, and `allowed_values`. Metadata is schema-scoped: the same PDF field in two different schemas can carry different constraints.
4. Canonization: Fields that represent the same logical concept across templates are assigned a shared `canonical_name`, forming the unified vocabulary the LLM extracts against.
5. Mapping Persistence: `update_template_mapping` builds a `field_mapping` JSON object (`canonical_name` → PDF field name(s)) and persists it on the junction record, ready for use at fill time.
6. Multi-Template Fill: The stored mappings route extracted values to the correct fields in each template variant independently, producing consistent output from a single extraction pass.
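The canonization, mapping, and fill steps above can be sketched in plain Python. This is a minimal illustration, not the PR's implementation: the `SchemaField` dataclass here is a simplified stand-in for the real table, and `build_field_mapping` and `route_values` are hypothetical helper names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchemaField:
    # Hypothetical, simplified stand-in for the SchemaField row.
    field_name: str                       # raw PDF field name
    template_id: int                      # source template
    canonical_name: Optional[str] = None  # set during canonization


def build_field_mapping(fields, template_id):
    """Group one template's fields as canonical_name -> [PDF field names]."""
    mapping = defaultdict(list)
    for f in fields:
        if f.template_id == template_id and f.canonical_name:
            mapping[f.canonical_name].append(f.field_name)
    return dict(mapping)


def route_values(extracted, field_mapping):
    """Fan canonical values out to the PDF fields of a single template."""
    return {
        pdf_field: extracted[canonical]
        for canonical, pdf_fields in field_mapping.items()
        if canonical in extracted
        for pdf_field in pdf_fields
    }


fields = [
    SchemaField("IncidentDate", 1, "incident_date"),
    SchemaField("date_of_event", 2, "incident_date"),
    SchemaField("ReporterName", 1, "reporter"),
    SchemaField("Notes", 2),  # not canonized yet, so it is left out of the mapping
]

# One extraction pass, keyed by canonical name.
extracted = {"incident_date": "2024-05-01", "reporter": "A. Smith"}

mapping_1 = build_field_mapping(fields, template_id=1)
mapping_2 = build_field_mapping(fields, template_id=2)
fill_1 = route_values(extracted, mapping_1)
fill_2 = route_values(extracted, mapping_2)
```

Both templates receive the same extracted `incident_date`, each under its own PDF field name, which is the behaviour the Multi-Template Fill step describes.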
Key Advantages
Changes
api/db/models.py
- `ReportSchema` (new table): `name` (unique), `description`, `use_case`, timestamp.
- `SchemaField` (new table): per-field metadata (`description`, `data_type`, `word_limit`, `required`, `allowed_values`, `canonical_name`) scoped to a schema and source template.
- `ReportSchemaTemplate` (new junction table): links schemas to templates with a `field_mapping` JSON column. A `UniqueConstraint` on `(template_id, report_schema_id)` prevents duplicate associations.
- `Datatype` enum: `string`, `int`, `date`, `enum`; used to type-annotate fields and drive validation.
- `Template.name`: added `unique=True` to prevent duplicate template registrations.
api/db/repositories.py
- `get_template`, `update_template`, `delete_template` (with cascade to junction records).
- `get_form`, `update_form`, `delete_form`.
- create / get / list / update / delete. Delete cascades through `SchemaField`s and junctions.
- `add_template_to_schema`: registers the association and auto-creates `SchemaField` stubs for all PDF fields in the template.
- `remove_template_from_schema`: removes the junction and all `SchemaField`s originating from that template.
- `get_schema_fields` / `get_schema_field` / `update_schema_field`: schema-scoped field metadata management; `update_schema_field` validates field ownership before applying changes.
- `update_template_mapping`: groups fields by `canonical_name` and persists the resulting mapping on the junction row.
- `get_field_mapping`: returns the stored canonical → PDF field mapping for a schema–template pair.
api/schemas/report_class.py (new)
Pydantic request/response models for `ReportSchema`, `SchemaField`, and `ReportSchemaTemplate`, ready for API route handlers.
api/db/database.py & api/db/init_db.py
Minor updates to register the new models with SQLModel metadata so tables are created on startup.
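The duplicate-association guard on the junction table can be illustrated without SQLModel. The sketch below uses the standard-library `sqlite3` module; the table name, column layout, and the `link_template_to_schema` helper are assumptions made for the example and are not taken from the PR's code. It also leaves out the `SchemaField` stub creation that the real `add_template_to_schema` performs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed layout: one row per (template, schema) association.
conn.execute(
    """
    CREATE TABLE report_schema_template (
        id INTEGER PRIMARY KEY,
        template_id INTEGER NOT NULL,
        report_schema_id INTEGER NOT NULL,
        field_mapping TEXT NOT NULL DEFAULT '{}',
        UNIQUE (template_id, report_schema_id)
    )
    """
)


def link_template_to_schema(conn, template_id, schema_id):
    """Insert the association; return False if the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO report_schema_template (template_id, report_schema_id) "
                "VALUES (?, ?)",
                (template_id, schema_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate (template_id, report_schema_id).
        return False


first = link_template_to_schema(conn, 1, 10)
duplicate = link_template_to_schema(conn, 1, 10)
other_schema = link_template_to_schema(conn, 1, 11)
row_count = conn.execute(
    "SELECT COUNT(*) FROM report_schema_template"
).fetchone()[0]
```

Letting the database enforce uniqueness, rather than checking in application code first, means the guard holds even if two requests try to attach the same template at once.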
tests/unit/test_repositories.py
Full unit test coverage for all new repository functions: creation, retrieval, update, deletion, cascade behaviour, schema-scoped field ownership enforcement, and duplicate junction prevention.
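As an illustration of what schema-scoped field ownership enforcement means, here is a minimal in-memory sketch. The dictionary store, the trimmed `SchemaField` dataclass, and the choice to return `None` on an ownership mismatch are all assumptions for the example; the PR's `update_schema_field` may signal the failure differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchemaField:
    # Simplified stand-in; the real model carries more metadata columns.
    id: int
    report_schema_id: int
    field_name: str
    description: str = ""
    canonical_name: Optional[str] = None


# In-memory stand-in for the database table: the same PDF field name
# appears in two schemas and is configured independently in each.
FIELDS = {
    1: SchemaField(id=1, report_schema_id=10, field_name="IncidentDate"),
    2: SchemaField(id=2, report_schema_id=11, field_name="IncidentDate"),
}


def update_schema_field(schema_id, field_id, **changes):
    """Apply changes only if the field belongs to the given schema."""
    field = FIELDS.get(field_id)
    if field is None or field.report_schema_id != schema_id:
        return None  # unknown field, or owned by a different schema
    for key in changes:
        if not hasattr(field, key):
            raise AttributeError(f"unknown field attribute: {key}")
    for key, value in changes.items():
        setattr(field, key, value)
    return field


updated = update_schema_field(10, 1, description="Date the incident occurred")
rejected = update_schema_field(10, 2, description="should not be applied")
```

The second call targets a field owned by schema 11 while naming schema 10, so it is refused and the field's metadata stays unchanged.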
Testing
Related Issues
Closes / related to: #102, #111, #152, #196, #206, #255 , .