This repository was archived by the owner on Feb 1, 2025. It is now read-only.

AdLinke, paperwork #70

@Adlinke

Hi,

I was trying to reproduce the results by running your code, but I couldn't get exactly the same precision on SQuAD.
Here is what I got for the bert_large model on SQuAD:
all_samples: 303
list_of_results: 303
global MRR: 0.3018861233236291
global Precision at 10: 0.5676567656765676
global Precision at 1: 0.16831683168316833

However, the table in the paper shows 305 samples and a precision of 17.4%.

At first, I guessed that 2 samples were excluded because their object labels are outside the common vocabulary. But even after testing without the common vocabulary, I got global Precision at 1: 0.1704918, which still differs from the result in the paper.
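
As a quick sanity check on these numbers, here is a minimal sketch that converts each reported precision back into a count of correct samples. It assumes that global Precision at 1 is simply the number of correct top-1 predictions divided by the number of samples (an assumption on my side, not something confirmed from the code). The helper names `implied_correct` and `counts_matching_percent` are made up for this illustration.

```python
def implied_correct(precision, n_samples):
    """Number of correct samples implied by a precision value,
    assuming precision = correct / n_samples (plain unweighted mean)."""
    return round(precision * n_samples)


def counts_matching_percent(percent, n_samples, digits=1):
    """All correct-counts k whose percentage k / n_samples, rounded to
    `digits` decimals, equals the percentage printed in a table."""
    return [
        k for k in range(n_samples + 1)
        if round(100 * k / n_samples, digits) == percent
    ]


# Values copied from the runs above.
with_common_vocab = implied_correct(0.16831683168316833, 303)
without_common_vocab = implied_correct(0.1704918, 305)

# Value reported in the paper: 17.4% over 305 samples.
paper_candidates = counts_matching_percent(17.4, 305)

print("with common vocab:   ", with_common_vocab, "/ 303")
print("without common vocab:", without_common_vocab, "/ 305")
print("paper (17.4% of 305):", paper_candidates)
```

Under that assumption, my runs correspond to 51 of 303 and 52 of 305 correct, while 17.4% of 305 is only consistent with 53 correct, so the remaining gap would be a single sample.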

Is there a way to reproduce the exact results reported in the paper?
Please correct me if I made any mistakes! Thanks!
