This repository was archived by the owner on Nov 18, 2021. It is now read-only.
49 changes: 49 additions & 0 deletions cases/corpus-17th-century-newspapers.md
# Corpus of 17th-century transcribed newspaper articles

## Metadata

* **Status:** In Progress
* **Type:** Specific
* **Work Package**: WP6
* **Research Coordinators:** Nicoline van der Sijs (INT)
* **Coordinators for CLARIAH:** Joris van Zundert (Huygens ING)
* **Participating Institutes:** INT, KB, HuC (Huygens ING, HuC-DI)
* **End-users**: scholars and broader public interested in historical newspapers, CLARIAH Plus communication team
* **Developers**: developers at INT and HuC-DI
* **Interest Groups**: Cur, UI(, TP)
* **Task IDs**:

## Description

This use case delivers a web application/site in which a user can search the data and metadata of the 17th-century newspaper corpus in Delpher. For the showcase, a map component is added to the search results. The map marks the geographic locations from which the retrieved reports originate. (Note: these are not necessarily the places that the reporting is about; usually they are the places where the correspondent was staying.)
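The step from search results to map markers can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's actual design: the `SearchHit` shape, the `originPlace` field, the gazetteer lookup table, and the function name `markersForHits` are all hypothetical, and the example hits are invented.

```typescript
// Hypothetical shape of one search hit; the field names are assumptions.
interface SearchHit {
  id: string;
  originPlace: string; // place the report was sent from, not the place it is about
}

interface Coordinates {
  lat: number;
  lon: number;
}

interface MapMarker extends Coordinates {
  place: string;
  count: number; // number of hits that originate from this place
}

// Group hits by place of origin and attach coordinates from a lookup table.
// Places without known coordinates are skipped rather than guessed.
function markersForHits(
  hits: SearchHit[],
  gazetteer: Record<string, Coordinates>
): MapMarker[] {
  const counts: Record<string, number> = Object.create(null);
  for (const hit of hits) {
    counts[hit.originPlace] = (counts[hit.originPlace] ?? 0) + 1;
  }

  const markers: MapMarker[] = [];
  for (const place of Object.keys(counts)) {
    const coords = Object.prototype.hasOwnProperty.call(gazetteer, place)
      ? gazetteer[place]
      : undefined;
    if (coords === undefined) continue;
    markers.push({ place, lat: coords.lat, lon: coords.lon, count: counts[place] ?? 0 });
  }
  return markers;
}

// Invented sample data, for illustration only (approximate coordinates).
const exampleGazetteer: Record<string, Coordinates> = {
  Amsterdam: { lat: 52.37, lon: 4.9 },
  Venice: { lat: 45.44, lon: 12.32 },
};

const exampleHits: SearchHit[] = [
  { id: "a1", originPlace: "Venice" },
  { id: "a2", originPlace: "Amsterdam" },
  { id: "a3", originPlace: "Venice" },
  { id: "a4", originPlace: "Unknown" }, // no coordinates, so no marker
];

const exampleMarkers = markersForHits(exampleHits, exampleGazetteer);
```

Counting hits per place keeps the map readable when many reports share one origin; a map library can then size or label each marker by `count`.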

### What is the research about?

The use case makes this historically significant corpus available to a wide audience and for a wide range of research questions.

### What problem is hindering the research?

The OCR quality of the 17th-century newspapers as currently available in Delpher is low. For part of the collection, a crowdsourcing project by Nicoline van der Sijs has already improved the quality considerably.

### What is needed to do the research?

1. Continue improving the quality of the collection data, among other things by applying Transkribus and by further (meta)data curation.
2. Build a state-of-the-art corpus exploitation web application and web services.

#### Data

A subset of Delpher, partly transcribed manually in a previous crowdsourcing project.

### What software and services are involved?

Among others:

- Docere: a generic visualisation platform for text editions, developed by HuC-DI. It is configurable, because its generic GUI components (built with React) can be associated with XML elements in any XML format, and extensible, because custom components can be added.
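The idea of associating components with XML elements can be illustrated with a small sketch. This is not Docere's actual API: the types, the `render` function, and the component maps are hypothetical, plain functions that return HTML strings stand in for React components, and the `p` and `placeName` element names and the sample document are assumptions made for the example.

```typescript
// Minimal in-memory XML tree; a real application would get this from an XML parser.
type XmlNode = string | XmlElement;

interface XmlElement {
  name: string;
  attributes: Record<string, string>;
  children: XmlNode[];
}

// Stand-in for a GUI component: receives the element and its rendered children.
type Component = (element: XmlElement, renderedChildren: string) => string;

// Configuration: which component renders which XML element name.
type ComponentMap = Record<string, Component>;

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Walk the tree; elements without a registered component pass their content through.
function render(node: XmlNode, components: ComponentMap): string {
  if (typeof node === "string") return escapeHtml(node);
  const children = node.children.map((child) => render(child, components)).join("");
  const component = Object.prototype.hasOwnProperty.call(components, node.name)
    ? components[node.name]
    : undefined;
  return component ? component(node, children) : children;
}

// Generic components that work for any XML format using these element names.
const genericComponents: ComponentMap = {
  p: (_element, children) => `<p>${children}</p>`,
};

// Extension: a project adds its own component on top of the generic ones.
const newspaperComponents: ComponentMap = {
  ...genericComponents,
  placeName: (element, children) =>
    `<span class="place" data-ref="${escapeHtml(element.attributes["ref"] ?? "")}">${children}</span>`,
};

// Invented sample document, for illustration only.
const sampleDocument: XmlElement = {
  name: "p",
  attributes: {},
  children: [
    "Report from ",
    { name: "placeName", attributes: { ref: "venice" }, children: ["Venice"] },
  ],
};

const genericHtml = render(sampleDocument, genericComponents);
const customHtml = render(sampleDocument, newspaperComponents);
```

Keeping the mapping in a plain lookup table is what makes such a design configurable: supporting another XML format means supplying a different map, and extending it means adding entries, without touching the tree walker.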

## References

References to related resources and publications and especially links to related use-cases:

* [CLARIAH](https://clariah.nl)