4 changes: 2 additions & 2 deletions README.md
@@ -1,14 +1,14 @@
# WikiLingua: A Multilingual Abstractive Summarization Dataset #

**UPDATE:\
- We have created new Train/Test splits for all 17 languages that can be downloaded [here](https://drive.google.com/file/d/1sTCB5NDPq6vUOlxR29DbvSssErvXLD1d/view?usp=sharing). These splits were created to ensure that there is no (document, summary) pair overlap across any of the 18 languages so that they can be safely used for multilingual evaluations.**
+ We have created new Train/Test splits for all 17 languages that can be downloaded [here](https://huggingface.co/datasets/esdurmus/wiki_lingua). These splits were created to ensure that there is no (document, summary) pair overlap across any of the 18 languages so that they can be safely used for multilingual evaluations.**

This repo contains dataset introduced in the following paper:

[WikiLingua: A New Benchmark Dataset for Multilingual Abstractive
Summarization](https://arxiv.org/abs/2010.03093)

- Download the dataset using [this link](https://drive.google.com/file/d/1sTCB5NDPq6vUOlxR29DbvSssErvXLD1d/view?usp=sharing).
+ Download the dataset using [this link](https://huggingface.co/datasets/esdurmus/wiki_lingua).
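
Since the new link points at the Hugging Face Hub rather than a Google Drive file, the dataset could in principle be fetched programmatically. The following is a minimal sketch, not something stated in this change: it assumes the Hugging Face `datasets` library is installed, and the helper names `repo_id_from_url` and `load_wikilingua` as well as the `"english"` config name are hypothetical choices for illustration. Depending on the library version, extra arguments may be required.

```python
from urllib.parse import urlparse

# URL introduced by this change.
HUB_URL = "https://huggingface.co/datasets/esdurmus/wiki_lingua"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' dataset id from a Hub dataset URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hub dataset URL: {url}")
    return "/".join(parts[1:3])


def load_wikilingua(config: str = "english"):
    """Load one language configuration (the config name is an assumption)."""
    # Imported lazily so repo_id_from_url works without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id_from_url(HUB_URL), config)
```

Calling `load_wikilingua()` needs network access; `repo_id_from_url(HUB_URL)` alone only parses the link and can be used offline.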

## Reference ##
Please cite the following paper: