wer_calc

Python script that calculates the word error rate (WER) for pairs of reference and generated TXT, VTT, or SRT files and writes the results to a CSV file.

Usage

python wer_calc.py [path/to/reference-directory] [path/to/generated-directory] [path/to/output.csv]

Requirements

Create a CSV file (for example, "output.csv") to hold the WER results, with the following headers: "Reference", "Generated", "WER". List the name of each reference file in the "Reference" column and the name of its corresponding generated file in the "Generated" column, leaving the "WER" column empty. For example:

Reference	Generated	WER
reference1.srt	generated1.srt
reference2.txt	generated2.txt

The script matches reference files with generated files by looking up the pairs of filenames in each row. The corresponding WER will be written to the "WER" column in the same row.
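The row-by-row lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in wer_calc.py: the function name fill_wer_column and the compute_wer callback are hypothetical, and the sketch assumes a comma-separated file with exactly the three headers listed above.

```python
import csv
import io


def fill_wer_column(csv_text, compute_wer):
    """Return a copy of csv_text with the "WER" column filled in.

    compute_wer is a hypothetical callback that takes the reference and
    generated filenames from one row and returns that pair's WER.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        # Each row names one reference/generated pair; the result goes
        # back into the same row.
        row["WER"] = compute_wer(row["Reference"], row["Generated"])

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["Reference", "Generated", "WER"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In a real run, compute_wer would read both files from the reference and generated directories given on the command line and compare their text.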

Files can be in TXT (".txt"), SRT (".srt"), or VTT (".vtt") format.
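SRT and VTT files carry cue numbers and timestamps that have to be stripped before the text can be compared. The sketch below shows one simple way to do that with the standard library. It is an assumption about the general approach, not the conversion used by this script (which is adapted from srt2text), and the function name subtitle_to_text is hypothetical. It does not handle VTT NOTE blocks, named cue identifiers, or inline tags, and it would also drop a caption line that consists only of digits.

```python
def subtitle_to_text(content):
    """Reduce SRT or VTT content to its caption text on a single line."""
    kept = []
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line between cues
        if stripped.isdigit():
            continue  # SRT cue index such as "1"
        if "-->" in stripped:
            continue  # timing line such as "00:00:01,000 --> 00:00:02,000"
        if stripped.startswith("WEBVTT"):
            continue  # VTT file header
        kept.append(stripped)
    return " ".join(kept)
```

TXT files need no such step and can be read as they are.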

Before running this script, install werpy with pip install werpy (or pip3 install werpy).

This script relies on werpy to do the following:

preprocess/normalize input text by removing punctuation, duplicated spaces, and leading/trailing blanks, and by converting all words to lowercase
calculate word error rate (WER) for each of the reference and hypothesis texts
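To make the two steps above concrete, here is a standard-library sketch of the same idea. It is a stand-in for illustration only, not werpy's implementation or API: the functions normalize and word_error_rate are written for this example, and the punctuation rule (drop every character that is not a word character or whitespace) is an assumption that may differ from werpy's rules. WER is computed as the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # assumed punctuation rule
    return " ".join(text.split())  # also trims leading/trailing blanks


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref = normalize(reference).split()
    hyp = normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1] / len(ref)
```

For example, comparing "The quick brown fox." with "the quick  brown dog" gives one substitution over four reference words, so the WER is 0.25.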

Licensing

This script is released under the MIT License.

werpy is released under the terms of the BSD 3-Clause License. Please refer to its LICENSE file for full details.

werpy also includes third-party packages distributed under the BSD 3-Clause License (NumPy, Pandas) and the Apache License 2.0 (Cython). The full NumPy, Pandas, and Cython licenses can be found in the werpy LICENSES directory. They can also be found directly in the following source repositories:

NumPy - https://github.com/numpy/numpy/blob/main/LICENSE.txt
Pandas - https://github.com/pandas-dev/pandas/blob/main/LICENSE
Cython - https://github.com/cython/cython/blob/master/LICENSE.txt

Conversion from SRT is adapted from srt2text. Please refer to its LICENSE for full details.

Misc

Feedback, comments, suggestions, etc. are welcome!
