Building a Character Recognition Pipeline with a frontend interface #33

@fyang3

The character class in the gender_analysis toolkit automatically generates a character list from an input document, giving each character’s name, nicknames, and pronouns, and takes in user feedback to produce a manually disambiguated list. The pipeline follows a human-AI collaboration approach that combines NLTK’s Named Entity Recognition (NER), Neuralcoref’s coreference resolution model, and a manual disambiguation interface. For the gender analysis web interface, we’d like to build a frontend that delivers the core functionality of the pipeline:

MVP:

  • A user selects a document through our document model
  • The backend pipeline automatically outputs a list of character names with their associated nicknames and pronoun probabilities, based on THIS_NOTEBOOK
  • A frontend disambiguation interface lets the user validate and correct the pipeline outputs through a dropdown list design (or similar)
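
To make the MVP data flow concrete, here is a minimal sketch of what the backend could hand to the disambiguation interface and how a user correction might be applied. Everything in it is hypothetical: `CharacterCandidate`, `merge_candidates`, and `dropdown_options` are illustrative names, not part of the gender_analysis toolkit, and averaging the pronoun probabilities is only a placeholder assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CharacterCandidate:
    """Hypothetical record for one character proposed by the pipeline."""
    name: str
    nicknames: List[str] = field(default_factory=list)
    # Maps a pronoun label (e.g. "she") to its estimated probability.
    pronoun_probs: Dict[str, float] = field(default_factory=dict)


def merge_candidates(keep: CharacterCandidate, absorb: CharacterCandidate) -> CharacterCandidate:
    """Fold `absorb` into `keep` when the user marks both as the same character."""
    nicknames = list(keep.nicknames)
    for alias in [absorb.name, *absorb.nicknames]:
        if alias != keep.name and alias not in nicknames:
            nicknames.append(alias)
    # Assumption: a plain average of the two distributions. A real pipeline
    # would more likely weight each side by its number of mentions.
    labels = sorted(set(keep.pronoun_probs) | set(absorb.pronoun_probs))
    probs = {
        label: (keep.pronoun_probs.get(label, 0.0) + absorb.pronoun_probs.get(label, 0.0)) / 2
        for label in labels
    }
    return CharacterCandidate(keep.name, nicknames, probs)


def dropdown_options(candidate: CharacterCandidate) -> List[str]:
    """Pronoun labels ordered from most to least probable, for a dropdown list."""
    return sorted(candidate.pronoun_probs, key=candidate.pronoun_probs.get, reverse=True)


elizabeth = CharacterCandidate("Elizabeth Bennet", ["Lizzy"], {"she": 0.9, "he": 0.1})
eliza = CharacterCandidate("Eliza", [], {"she": 0.7, "they": 0.3})
merged = merge_candidates(elizabeth, eliza)
options = dropdown_options(merged)
```

In this sketch the frontend would preselect the first entry of `options` and let the user override it, which keeps the pipeline's guess visible while leaving the final decision to the user.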

Nice-to-have:

  • Output a resolved text built from the results of the character identification-disambiguation pipeline
  • Feed the resolved text into further analyses, such as proximity analysis and frequency analysis
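
As a rough illustration of the nice-to-have items, the sketch below rewrites name aliases to a canonical character name and then counts mentions per character. It is an assumption about how the resolved text could be produced, not the toolkit's implementation: `resolve_text`, `character_frequencies`, and the `alias_map` shape are hypothetical, and pronoun mentions are left untouched because resolving them would be the job of the coreference model.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def resolve_text(text: str, alias_map: Dict[str, str]) -> str:
    """Replace each alias in `text` with its canonical character name."""
    # Map canonical names to themselves so a full name already present in the
    # text is matched as a whole and never partially rewritten.
    full_map = dict(alias_map)
    for canonical in set(alias_map.values()):
        full_map.setdefault(canonical, canonical)
    if not full_map:
        return text
    # Longest alternatives first, so "Mr. Darcy" wins over "Darcy".
    ordered = sorted(full_map, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(a) for a in ordered) + r")\b")
    return pattern.sub(lambda match: full_map[match.group(0)], text)


def character_frequencies(resolved: str, names: Iterable[str]) -> Counter:
    """Count whole-name mentions of each canonical name in the resolved text."""
    return Counter({
        name: len(re.findall(r"\b" + re.escape(name) + r"\b", resolved))
        for name in names
    })


alias_map = {
    "Lizzy": "Elizabeth Bennet",
    "Darcy": "Fitzwilliam Darcy",
    "Mr. Darcy": "Fitzwilliam Darcy",
}
text = "Lizzy smiled. Mr. Darcy bowed, and Elizabeth Bennet laughed at Darcy."
resolved = resolve_text(text, alias_map)
counts = character_frequencies(resolved, ["Elizabeth Bennet", "Fitzwilliam Darcy"])
```

Because every mention now uses one name per character, a downstream frequency or proximity analysis can treat the resolved text like any other document.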

Labels

enhancement: New feature or request
