Releases: Koziev/Translations
Literary Prose Parallel Corpus
Release v1.1: Literary Prose Translation Pairs
This release contains 42,621 parallel text pairs for literary prose, featuring:
- Language Coverage:
  - Source texts in 11 languages (primarily English/German)
  - Russian translations for all pairs
- Content Characteristics:
  - Paragraph-level aligned literary passages
  - Average source text length: 408 characters
  - Domain-tagged as "prose" for filtering
Example pair structure:
{
  "left_text": "...original passage...",
  "left_language": "en",
  "right_text": "...Russian translation...",
  "domain": "prose"
}

Literary Prose Parallel Corpus
Release v1.0: Literary Prose Translation Pairs
This initial release contains 42,621 parallel text pairs for literary prose, featuring:
- Language Coverage:
  - Source texts in 11 languages (primarily English/German)
  - Russian translations for all pairs
- Content Characteristics:
  - Paragraph-level aligned literary passages
  - Average source text length: 408 characters
Example pair structure:
{
  "left_text": "...original passage...",
  "left_language": "English",
  "right_text": "...Russian translation...",
  "domain": "prose"
}
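
Records with the structure shown above can be filtered by source language and domain. The following is a minimal sketch, not the project's own tooling: the function name `filter_pairs` is hypothetical, the sample records are made up for illustration, and the one-JSON-object-per-line layout is an assumption, since these notes do not state how the release file is stored. The `left_language` value has to match the convention of the release in use ("en" in the v1.1 example, "English" in the v1.0 example).

```python
import json
from typing import Dict, Iterable, Iterator, Optional

Pair = Dict[str, str]


def filter_pairs(
    records: Iterable[Pair],
    left_language: Optional[str] = None,
    domain: Optional[str] = None,
) -> Iterator[Pair]:
    """Yield records that match the requested source language and/or domain.

    A filter left as None is not applied.
    """
    for record in records:
        if left_language is not None and record.get("left_language") != left_language:
            continue
        if domain is not None and record.get("domain") != domain:
            continue
        yield record


# Made-up records for illustration only; they are not taken from the corpus.
# Assumption: one JSON object per line (the actual file layout is not stated).
SAMPLE_JSONL = """
{"left_text": "A placeholder passage.", "left_language": "en", "right_text": "Текст-заполнитель.", "domain": "prose"}
{"left_text": "Ein Platzhaltertext.", "left_language": "de", "right_text": "Текст-заполнитель.", "domain": "prose"}
"""

records = [json.loads(line) for line in SAMPLE_JSONL.splitlines() if line.strip()]
english_prose = list(filter_pairs(records, left_language="en", domain="prose"))
print(len(english_prose))
```

Using `record.get(...)` rather than direct indexing keeps the filter from failing on records that lack a field; such records simply do not match a filter on that field.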