API Documentation

Contains all arguments used in the program. You can set them here or provide them as command line arguments.
- bleu_smoothing: Smoothing method to be used for BLEU calculation.
- t: t value for confidence interval calculation.
- train_source: Path to the train source file, where each line corresponds to one train input.
- test_source: Path to the test source file, where each line corresponds to one test input.
- test_target: Path to the test target file, where each line corresponds to one test target.
- text_vocab: A file where each line is a word in the vocab.
- vector_vocab: A file where each line is a word in the vocab followed by a vector.
- test_responses: Path to the test model responses file, or to a directory containing different test response files.
- metrics: A dict whose keys are the 17 metrics and whose values are either 0 or 1, depending on whether the given metric should be computed.
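As an illustration of how the arguments above could be set either in code or on the command line, here is a minimal sketch. The `Config` class body, the default values, the metric names, and the `apply_cli_overrides` helper are all hypothetical and are not taken from the program; only the attribute names `bleu_smoothing`, `t`, `test_responses`, and `metrics` come from the list above.

```python
import argparse


class Config:
    """Hypothetical stand-in holding a few of the arguments listed above."""

    def __init__(self):
        self.bleu_smoothing = 4                      # assumed default
        self.t = 1.97                                # assumed default
        self.test_responses = "responses.txt"        # placeholder path
        self.metrics = {"bleu": 1, "distinct": 0}    # hypothetical metric names


def apply_cli_overrides(config, argv):
    """Override scalar Config attributes with values given on the command line."""
    parser = argparse.ArgumentParser()
    for name, value in vars(config).items():
        if isinstance(value, dict):
            continue  # the metrics dict is left to be edited in code
        parser.add_argument("--" + name, type=type(value), default=value)
    args = parser.parse_args(argv)
    for name, value in vars(args).items():
        setattr(config, name, value)
    return config
```

Calling `apply_cli_overrides(Config(), ["--t", "2.5"])` in this sketch changes only `t` and leaves every other attribute at its value from the class.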
Main class which computes all the metrics and saves them to the output file.
- project_path: Path to the current file.
- test_responses: Path to test responses file or directory.
- config: A Config instance.
- distro: The train data distribution of words / bigrams.
- vocab: Vocab containing word vectors.
- input_dir: The directory containing responses file.
- output_path: Path to the output file.
- which_metrics: A dict in which each metric has a value of either 0 or 1, depending on whether it will be calculated.
- metrics: A dict containing filenames and for each filename a dict containing the list of metrics (for each example).
- train_source: Path to the train source file, where each line corresponds to one train input.
- test_source: Path to the test source file, where each line corresponds to one test input.
- test_target: Path to the test target file, where each line corresponds to one test target.
- text_vocab: A file where each line is a word in the vocab.
- vector_vocab: A file where each line is a word in the vocab followed by a vector.
- objects: A dict containing the instances of the other Metrics classes, keyed by their name.
Initialize the Metrics objects and other variables based on the Config instance.
Check whether at least one metric from the family of metrics given as the parameter is enabled. returns: A bool, True if at least one is enabled.
Downloads fastText word embeddings.
Builds a vocab based on the training data file.
Generate the fastText embeddings for the vocab. Also generate the vocab if necessary.
Set to 0 a list of metrics, so they won't be computed.
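The two operations on the 0/1 metrics dict described above (checking whether any metric of a family is enabled, and setting a list of metrics to 0) could look roughly like the sketch below. The function names `has_family` and `disable_metrics`, and the metric names in the usage note, are hypothetical and not the program's own.

```python
def has_family(which_metrics, family):
    """Return True if at least one metric name in `family` has the value 1."""
    return any(which_metrics.get(name, 0) == 1 for name in family)


def disable_metrics(which_metrics, names):
    """Set the listed metrics to 0 in place so they are not computed."""
    for name in names:
        if name in which_metrics:
            which_metrics[name] = 0
```

For example, with `which_metrics = {"bleu-1": 1, "bleu-2": 0}`, `has_family(which_metrics, ["bleu-1", "bleu-2"])` is true until `disable_metrics(which_metrics, ["bleu-1"])` is called.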
Load the vocab from file.
Main loop to compute metrics for all files.
Compute the mean, standard deviation, and confidence interval of the metrics and save them to the output file.
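A minimal sketch of these summary statistics for one metric follows. It assumes the sample standard deviation (dividing by n - 1) and a confidence half-width of t * std / sqrt(n), where t is the `t` argument from the configuration; the program may use a different convention, and the function name `summarize` is hypothetical.

```python
import math


def summarize(values, t):
    """Return (mean, std, half_width) for a list of per-example metric values."""
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        # A single value has no spread to estimate.
        return mean, 0.0, 0.0
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # assumed: sample variance
    std = math.sqrt(variance)
    half_width = t * std / math.sqrt(n)  # assumed: t-based confidence half-width
    return mean, std, half_width
```

Under these assumptions the confidence interval for the metric is `mean ± half_width`.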
A class to handle the embedding-average, embedding-extrema, and embedding-greedy metrics.
- vocab: Dict containing the vocab and word vectors.
- emb_dim: Embedding dimension of word vectors.
- distro: Training data distribution of words.
- average: Whether to compute embedding-average.
- metrics: Dict containing the metric lists.
Initialize the provided parameters.
Compute the metrics for the provided example sentence (as word list).
Compute the average word embedding. returns: An np.array representation.
Compute the extrema embedding. returns: An np.array representation.
Compute the greedy score from one side. returns: A float, the greedy score.
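The three embedding computations above can be sketched as follows, using the common definitions of these metrics: a per-dimension mean, a per-dimension value of largest magnitude, and an averaged best cosine match. This is an assumption about the method, not the class's actual code. The sketch uses plain Python lists instead of np.array to stay dependency-free, ignores any weighting by `distro`, and all function names are hypothetical.

```python
import math


def _vectors(words, vocab):
    """Look up the vectors of the words that appear in the vocab."""
    return [vocab[w] for w in words if w in vocab]


def average_embedding(words, vocab, emb_dim):
    """Element-wise mean of the word vectors."""
    vecs = _vectors(words, vocab)
    if not vecs:
        return [0.0] * emb_dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(emb_dim)]


def extrema_embedding(words, vocab, emb_dim):
    """For each dimension, keep the value with the largest absolute value."""
    vecs = _vectors(words, vocab)
    if not vecs:
        return [0.0] * emb_dim
    return [max((v[i] for v in vecs), key=abs) for i in range(emb_dim)]


def cosine(a, b):
    """Cosine similarity of two vectors, 0.0 if either has zero norm."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def greedy_one_side(words_a, words_b, vocab):
    """Average, over words in A, of the best cosine match among words in B."""
    vecs_a, vecs_b = _vectors(words_a, vocab), _vectors(words_b, vocab)
    if not vecs_a or not vecs_b:
        return 0.0
    return sum(max(cosine(a, b) for b in vecs_b) for a in vecs_a) / len(vecs_a)
```

Because the greedy score is computed from one side only, a symmetric score would typically average `greedy_one_side(a, b, vocab)` and `greedy_one_side(b, a, vocab)`.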
Handles the computation of coherence.
Handles the computation of the KL divergence.
- vocab: A dict containing word vectors of the vocab.
- gt_path: Path to the ground truth file.
- metrics: A dict containing the metric lists.
Initialize the given parameters.
Compute the metrics for the provided example sentence (as word list).
Setup the ground truth and test distributions from the given filename.
Only keep the intersection of the two dictionaries. returns: The filtered dictionaries.
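The intersection step and the divergence itself might look like the sketch below. It assumes both distributions start as word-count dicts, that they are renormalized after filtering, and that the divergence is taken from the ground-truth distribution to the test distribution with the natural logarithm. None of these choices is confirmed by the text above, and the function names are hypothetical.

```python
import math
from collections import Counter


def filter_to_intersection(gt_counts, test_counts):
    """Keep only the words that occur in both count dicts."""
    shared = gt_counts.keys() & test_counts.keys()
    return ({w: gt_counts[w] for w in shared},
            {w: test_counts[w] for w in shared})


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def kl_divergence(p, q):
    """KL(p || q) over a shared support, in nats (assumed direction and base)."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


# Usage: words outside the intersection ("c" and "d") are dropped first.
gt, test = filter_to_intersection(Counter("a a b c".split()),
                                  Counter("a b b d".split()))
score = kl_divergence(normalize(gt), normalize(test))
```

Filtering to the intersection keeps every ratio `p[w] / q[w]` finite, since no word in the sum can have zero probability under either distribution.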
Handles the computation of BLEU score.
- metrics: Dict containing the metric lists.
- smoothing: The smoothing method from nltk.
Initialize the smoothing function.
Compute the metrics for the provided example sentence (as word list).
Handles the computation of entropy metrics.
- vocab: Dict containing the word vectors of the vocab.
- distro: Dict containing the training data word distribution.
- metrics: Dict containing the metric lists.
Initialize the given parameters.
Compute the metrics for the provided example sentence (as word list).
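One way entropy metrics can be derived from the training word distribution is sketched below. It assumes `distro` maps words to raw training counts, that each word's entropy is its negative natural-log probability, and that words missing from `distro` are skipped. These are assumptions rather than the class's documented behavior, and the function names are hypothetical.

```python
import math


def word_entropies(words, distro):
    """Negative log-probability of each known word under the training counts."""
    total = sum(distro.values())
    return [-math.log(distro[w] / total) for w in words if w in distro]


def utterance_entropy(words, distro):
    """Sum of the word entropies of a sentence."""
    return sum(word_entropies(words, distro))


def per_word_entropy(words, distro):
    """Mean word entropy of a sentence, 0.0 if no word is known."""
    values = word_entropies(words, distro)
    return sum(values) / len(values) if values else 0.0
```

Under these assumptions, rarer training words contribute more to both scores than frequent ones.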
Handles the computation of distinct-1 and distinct-2.
- vocab: Dict containing word vectors of the vocab.
- metrics: Dict containing the metric lists.
Initialize the given parameters.
Calculate the distinct value for a distribution. returns: A float.
Calculate distinct metrics for a given file.
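The distinct value is commonly defined as the number of unique n-grams divided by the total number of n-grams. Assuming that definition, and a response file already split into word lists, a sketch could be written as follows. The function names `ngram_counts` and `distinct` are hypothetical.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count all n-grams (as tuples) over a list of tokenized sentences."""
    counts = Counter()
    for words in sentences:
        # zip over n shifted copies of the sentence yields its n-grams.
        counts.update(zip(*(words[i:] for i in range(n))))
    return counts


def distinct(counts):
    """Unique n-grams divided by total n-grams, 0.0 for an empty distribution."""
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


# Usage: distinct-1 and distinct-2 over two short responses.
responses = [["a", "b", "a"], ["a", "b"]]
distinct_1 = distinct(ngram_counts(responses, 1))
distinct_2 = distinct(ngram_counts(responses, 2))
```

Counting over the whole file rather than per sentence means repeated responses lower the score, which is the usual motivation for this metric.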
Ghost function.