Rerunning Frog on already frogged FoLiA #70

@kosloot

Frog now assigns provenance data to FoLiA, which among other things allows us to detect a rerun of (parts of) Frog on a FoLiA document. BUT:
Handling such a rerun is quite dangerous and needs a lot of thought. Open questions include:

  • Assigning useful IDs to the provenance information of each of the tools.
  • Does a rerun of one or more parts (like MBLEM or NER) mean an extra sub-processor under the old Frog processor, OR do we add a new Frog processor?
  • Tokenization is a special case; see "Rerunning ucto on already tokenized FoLiA" (ucto#68).
  • etc.

As I don't want to postpone the FoLiA 2.0 release, I suggest that for the time being we simply FORBID running Frog again on FoLiA that already carries Frog provenance data. That will not break existing cases, and it will certainly NOT introduce artifacts that would bother us in the future.
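
To make the proposed check concrete, here is a rough Python sketch of the idea (not Frog's actual C++ code). It assumes that provenance is stored as `processor` elements in the FoLiA namespace `http://ilk.uvt.nl/folia` and that a Frog run can be recognised by a processor whose `name` attribute is `frog`; the function names `has_frog_provenance` and `ensure_not_frogged` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the FoLiA XML namespace.
FOLIA_NS = "http://ilk.uvt.nl/folia"


def has_frog_provenance(folia_xml: str) -> bool:
    """Return True if any processor in the document is named 'frog'.

    Assumption: a Frog run is recorded as a <processor name="frog"> element,
    possibly with nested sub-processors (mblem, ner, ...).
    """
    root = ET.fromstring(folia_xml)
    # iter() walks the whole tree, so nested sub-processors are visited too.
    for processor in root.iter(f"{{{FOLIA_NS}}}processor"):
        if processor.get("name", "").lower() == "frog":
            return True
    return False


def ensure_not_frogged(folia_xml: str) -> None:
    """Refuse to continue when the document already carries Frog provenance."""
    if has_frog_provenance(folia_xml):
        raise RuntimeError(
            "document already contains Frog provenance; rerunning is forbidden"
        )
```

Matching on the processor name alone is the bluntest possible rule, which fits the "forbid for now" proposal: it does not try to decide which sub-processors could safely be rerun.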

@proycon Any comments?
