diff --git a/cases/corpus-17th-century-newspapers.md b/cases/corpus-17th-century-newspapers.md
new file mode 100644
index 0000000..14dbcef
--- /dev/null
+++ b/cases/corpus-17th-century-newspapers.md
@@ -0,0 +1,49 @@
+# Corpus 17th century transcribed newspaper articles
+
+## Metadata
+
+* **Status:** In Progress
+* **Type:** Specific
+* **Work Package**: WP6
+* **Research Coordinators:** Nicoline van der Sijs (INT)
+* **Coordinators for CLARIAH:** Joris van Zundert (Huygens ING)
+* **Participating Institutes:** INT, KB, HuC (Huygens ING, HuC-DI)
+* **End-users**: scholars and broader public interested in historical newspapers, CLARIAH Plus communication team
+* **Developers**: developers at INT and HuC-DI
+* **Interest Groups**: Cur, UI(, TP)
+* **Task IDs**:
+
+## Description
+
+This use case delivers a web application/site on which a user can search the data and metadata of the 17th-century newspaper corpus in Delpher. For the showcase, a map component is added to the search results. The map marks the geographic locations from which the retrieved reports originate.
+(Note: these are not necessarily the places that the reporting is about; usually they are the places where the correspondent was staying.)
+
+### What is the research about?
+
+This historically valuable corpus will be made available to a wide audience, for a wide range of research questions.
+
+### What problem is hindering the research?
+
+The OCR quality of the 17th-century newspapers currently available in Delpher is low. For part of the collection, this quality has already been improved considerably by a crowdsourcing project by Nicoline van der Sijs.
+
+### What is needed to do the research?
+
+1. Continue improving the quality of the collection data, among other things by applying Transkribus and further (meta)data curation.
+2. Build a state-of-the-art web application and web services for corpus exploitation.
+
+#### Data
+
+A subset of Delpher, partly transcribed manually in a previous crowdsourcing project.
+
+### What software and services are involved?
+
+Among others:
+
+- Docere: a generic visualisation platform for text editions, developed by HuC-DI. It is configurable because its generic GUI components (built with React) can be associated with XML elements in any XML format, and extensible because custom components can be added.
+
+## References
+
+References to related resources and publications and especially links to related use-cases:
+
+* [CLARIAH](https://clariah.nl)
+