This repository was archived by the owner on Jul 5, 2018. It is now read-only.

Sentence segmentation with apostrophes #1

@fbaumgardt

Apostrophes (ʼ) are not parsed correctly. They sometimes appear in pairs to mark quotations, and the second apostrophe of a pair usually gets assigned to the following sentence. If there is no following sentence (i.e. at the end of a chapter), it is assigned its own sentence with length=1. To find those locations, search for this regular expression: "1".*\n\s{3}</

I am not familiar with the sentence ID schema here. How can we fix a bug that affects sentence segmentation?
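One possible direction is a post-processing pass over the segmenter output that moves a stray closing apostrophe back to the sentence it closes. The sketch below is only an illustration under stated assumptions: it treats each sentence as a plain list of token strings, the function name `reattach_closing_apostrophes` is hypothetical, and it does not touch the repository's actual file format or renumber sentence IDs, which would still need to be handled separately.

```python
# Apostrophe character copied from the issue text above.
APOSTROPHE = "ʼ"


def reattach_closing_apostrophes(sentences):
    """Move a leading closing apostrophe to the end of the previous sentence.

    `sentences` is assumed to be a list of token lists (a hypothetical
    in-memory representation, not the repository's real schema).
    """
    fixed = []
    for tokens in sentences:
        tokens = list(tokens)  # copy so the input is not mutated
        # A leading apostrophe is treated as a closing mark only when the
        # previous sentence contains an odd number of apostrophes, i.e. it
        # opened a quotation that was never closed.
        while (
            tokens
            and tokens[0] == APOSTROPHE
            and fixed
            and fixed[-1].count(APOSTROPHE) % 2 == 1
        ):
            fixed[-1].append(tokens.pop(0))
        # Drop sentences that became empty (the length=1 case from the issue).
        if tokens:
            fixed.append(tokens)
    return fixed


example = [
    [APOSTROPHE, "a", "b", "."],
    [APOSTROPHE, "c", "."],   # leading apostrophe closes the first quotation
    [APOSTROPHE, "d", "."],
    [APOSTROPHE],             # end-of-chapter sentence with length=1
]
result = reattach_closing_apostrophes(example)
```

With this input, the second sentence loses its leading apostrophe to the first, and the final one-token sentence is merged into the third. The parity check is a heuristic: it would misfire wherever the same character is also used inside a word, so real data would need to be checked before relying on it.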
