Implement LiteraryFormPack: mapping, LCSH suffix extraction, conflict resolution #4
Open
modi02 wants to merge 5 commits into
…solution

Replaces the PrefixRule-only skeleton with a fully working implementation:

- MappingRule against literary_form.json (49 mappings covering Fiction and Nonfiction variants including general fiction, short story, fictitious works, biographical, autobiographical)
- LCSHSuffixRule: extracts genre signals from LCSH '--' patterns, e.g. 'Pirates--Fiction' correctly resolves to Fiction
- DroppableRule: silently drops noise strings (import artifacts, access markers)
- Conflict resolution: Fiction wins by default unless strong unambiguous Nonfiction markers present (biography, memoir, autobiography etc.). Prevents topic subdivisions like 'history' from overriding fiction classification on historical fiction works
- Adds resources/mappings/literary_form.json and updated droppable.json

All 11 manual tests pass.
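To make the conflict-resolution rule above concrete, here is a minimal sketch of how "Fiction wins unless a strong Nonfiction marker is present" could be expressed. It is not the code from this PR: the function name `resolve_literary_form`, the constant `STRONG_NONFICTION_MARKERS`, and the choice to match markers against whole `--` subdivisions are all assumptions made for illustration.

```python
from typing import Iterable, Optional

# Hypothetical marker set. The commit message names biography, memoir and
# autobiography and ends with "etc.", so the real set may be larger.
STRONG_NONFICTION_MARKERS = {"biography", "memoir", "autobiography"}


def resolve_literary_form(
    candidates: Iterable[str], subjects: Iterable[str]
) -> Optional[str]:
    """Pick one form from the candidate forms; Fiction wins by default."""
    found = set(candidates)
    if not found:
        return None
    if len(found) == 1:
        # No conflict: return the only candidate.
        return next(iter(found))
    # Conflict: Nonfiction wins only when a strong marker appears as a
    # whole subdivision of some subject string (assumption of this sketch).
    for subject in subjects:
        parts = [part.strip().lower() for part in subject.split("--")]
        if any(part in STRONG_NONFICTION_MARKERS for part in parts):
            return "Nonfiction"
    # Weak signals such as a "History" subdivision do not override Fiction.
    return "Fiction"
```

Under these assumptions, a record with subjects `"Pirates--Fiction"` and `"France--History"` stays Fiction, because "history" is not in the strong-marker set, while a record that also carries a `"--Biography"` subdivision resolves to Nonfiction.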
Great Work!
- Extract LCSHSuffixRule into rules/lcsh_suffix_rule.py
- Export from rules/__init__.py alongside other rules
- Import from rules in literary_form.py instead of defining inline
- All 11 tests still pass
Author
Done, moved to rules/lcsh_suffix_rule.py and exported from rules/__init__.py. All tests still pass.
This builds on top of the architecture in #3.
The LiteraryFormPack in #3 only had PrefixRule("form") so actual subject strings like "fiction" or "Pirates--Fiction" weren't matching anything. This PR fills that in.
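As a rough illustration of the gap described here, the sketch below shows one way a genre signal could be pulled from the last `--` subdivision of an LCSH-style string. It is an assumption-laden example, not the `LCSHSuffixRule` from this PR: the function `extract_lcsh_suffix_form` and the lookup `SUFFIX_TO_FORM` are hypothetical names, and the real rule may recognise more suffixes.

```python
from typing import Optional

# Hypothetical suffix lookup; only "fiction" is confirmed by the PR text.
SUFFIX_TO_FORM = {"fiction": "Fiction"}


def extract_lcsh_suffix_form(subject: str) -> Optional[str]:
    """Return a literary form if the last '--' subdivision names one."""
    if "--" not in subject:
        # Plain strings like "fiction" are left to a mapping rule instead.
        return None
    # Take the text after the last "--", then normalise case, whitespace
    # and a trailing period.
    suffix = subject.rsplit("--", 1)[1].strip().rstrip(".").lower()
    return SUFFIX_TO_FORM.get(suffix)
```

With this lookup, `"Pirates--Fiction"` yields `"Fiction"`, while `"France--History"` and a bare `"fiction"` both return `None`; the bare string would be handled by the mapping rule rather than the suffix rule.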
Changes:
Tested against 11 cases, all pass.
Built to be compatible with #3; happy to rebase once that merges.