AU test returns significant p-value when log-likelihood difference is zero or near-zero (rare bug) #138

@bqminh

In some rare cases, when two trees have identical or very similar log-likelihoods, the AU test p-value is unexpectedly significant (< 0.05), even though the trees should not be considered significantly different. The AU test is numerically unstable for near-zero log-likelihood differences, where the p-values are not identifiable. The workaround is to check whether the log-likelihood difference is smaller than an epsilon (say 0.001 by default) and, if so, set the p-value to 1 instead of computing it as in the original AU test paper.
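
A minimal sketch of the epsilon check, written in Python for illustration only. It assumes the difference is measured against the best-scoring tree in the set, which the description above does not state explicitly. The function name `guard_au_pvalues` and its parameters are hypothetical and are not part of IQ-TREE's code or command-line interface.

```python
from typing import List, Sequence


def guard_au_pvalues(
    log_likelihoods: Sequence[float],
    au_pvalues: Sequence[float],
    epsilon: float = 1e-3,
) -> List[float]:
    """Return AU p-values with the near-zero-difference workaround applied.

    Assumption: each tree is compared with the highest log-likelihood in
    the set. Any tree whose difference from that maximum is below
    `epsilon` gets a p-value of 1.0; all other p-values pass through
    unchanged.
    """
    if len(log_likelihoods) != len(au_pvalues):
        raise ValueError("log_likelihoods and au_pvalues must have equal length")
    if not log_likelihoods:
        return []

    best = max(log_likelihoods)
    adjusted = []
    for lnl, p_value in zip(log_likelihoods, au_pvalues):
        delta = best - lnl  # non-negative by construction
        # The best tree itself has delta == 0, so it is also set to 1.0.
        adjusted.append(1.0 if delta < epsilon else p_value)
    return adjusted
```

For example, with hypothetical inputs `guard_au_pvalues([-1000.0, -1000.0004, -1012.5], [0.62, 0.01, 0.003])`, the second tree lies within 0.001 of the best tree, so its p-value of 0.01 is replaced by 1, while the third tree keeps its computed p-value.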

This solution is mentioned here: https://doi.org/10.1093/bioinformatics/btag044

"To prevent erroneous rejection of the correct tree in sets with highly similar likelihoods (Shimodaira 2002), we consider trees with a log-likelihood difference of less than 10^-3 to not be significantly different regardless of the P-value computed by the AU-test (Shimodaira’s implementation of the AU-test generated the following warnings: “small variance”, “theory does not fit well”, and “regression degenerated”). This is in contrast to Shen et al. (2020a), who do not apply this (arbitrary) threshold. Without this filtering step, 38% (1186) of our diverging datasets yield significantly different trees (AU-test, P < .05)."

Labels: bug
