-
Notifications
You must be signed in to change notification settings - Fork 110
Open
Description
url="https://kaikki.org/dictionary/raw-wiktextract-data.jsonl.gz"
output_file="20260201_raw-wiktextract-data.jsonl"
curl -fL "$url" \
| gzip -dc \
> "$output_file"I currently download the compressed dump like that.
Currently it is based on the 20260201 dump of wiktionary but there is no way to know what you get based on the filename.
I also didn't find any embedded data that tells me what version of wiktionary the dump is based on.
Maybe add a first line of json with metadata or simply offer a index with maybe the last 3 extracts (compressed would be enough) with proper filenames.
Currently I verify the sha256 sum to check that I got the file that I want.
Related to: #1264
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels