Frog now assigns provenance data to FoLiA, which among other things allows us to detect a rerun of (parts of) Frog on a FoLiA document. BUT:
Handling such reruns is quite dangerous and needs a lot of thought:
- Assigning useful IDs to all provenance information of the several tools.
- Does a rerun of one or more parts (like MBLEM or NER) mean an extra sub-processor under the old Frog processor, OR do we add a new Frog processor?
- Tokenization is a special case; see "Rerunning ucto on already tokenized FoLiA" (ucto#68).
- etc.
As I don't want to postpone the FoLiA 2.0 release, I suggest for the time being to just FORBID running Frog again on FoLiA that already contains Frog provenance data. That will not break existing cases, and will certainly NOT introduce artifacts that would bother us in the future.
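To make the proposed check concrete, here is a rough sketch in Python of what "forbid a rerun" could look like. It is not how Frog itself does or would do it (Frog is C++ and goes through libfolia). It assumes that provenance is stored as `processor` elements in the FoLiA namespace and that Frog registers itself under the name `frog`; the function names `has_frog_provenance` and `check_rerun_allowed` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# FoLiA XML namespace
FOLIA_NS = "http://ilk.uvt.nl/folia"


def has_frog_provenance(folia_xml: str, tool_name: str = "frog") -> bool:
    """Return True if any <processor> in the document is named `tool_name`.

    Assumption: Frog records itself with name="frog". Nested
    sub-processors are covered too, because iter() walks the whole tree.
    """
    root = ET.fromstring(folia_xml)
    for processor in root.iter(f"{{{FOLIA_NS}}}processor"):
        if processor.get("name", "").lower() == tool_name:
            return True
    return False


def check_rerun_allowed(folia_xml: str) -> None:
    """Hypothetical guard: refuse input that already carries Frog provenance."""
    if has_frog_provenance(folia_xml):
        raise ValueError(
            "input already contains Frog provenance data; rerunning is not supported"
        )
```

A document that only carries provenance from another tool, for example ucto, would still pass this guard, so existing tokenized-FoLiA input keeps working.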
@proycon Any comments?