Skip to content

Fix --scfl gap detection for models with rate heterogeneity#142

Merged
bqminh merged 1 commit intomasterfrom
roblanf
Mar 31, 2026
Merged

Fix --scfl gap detection for models with rate heterogeneity#142
bqminh merged 1 commit intomasterfrom
roblanf

Conversation

@roblanf
Copy link
Copy Markdown
Member

@roblanf roblanf commented Mar 18, 2026

Summary

  • Fixes --scfl giving incorrect results with any model using multiple rate categories (+G, +R, +I+G, +I+R, mixture models)
  • The gap detection filter in computeSubtreeAncestralState() used a fixed threshold (sum > 1.0) that broke for ncat_mix > 1 and for deep subtrees with scaled partial likelihoods
  • The fix uses scale_num to distinguish real data from gaps at internal nodes, handling both normal and SAFE_NUMERIC modes

Test plan

  • Empirical validation on reporter's dataset (10 models × 4 methods)
  • Simulation study (8-taxon tree, 1,100 runs across 4 models, 5 methods, 11 concordance levels, 5 gap levels)
  • sCFL = sCFL --safe (max |diff| = 0.01)
  • Single-category models unchanged (old == new)
  • Multi-category models fixed (sN improved from ~1 to ~590 informative sites)

Validation scripts and full results: https://github.com/roblanf/iqtree3/tree/issue-137/tests/scfl_simulation

Full report of empirical and simulation test results: #137 (comment)

Fixes #137

🤖 Generated with Claude Code

The gap detection filter in computeSubtreeAncestralState() used a
fixed threshold (sum > 1.0) that broke for models with multiple rate
categories (ncat_mix > 1) and for deep subtrees with scaled partial
likelihoods. This caused --scfl to incorrectly filter nearly all
sites as gaps, producing wrong sCF values for any model using +G, +R,
+I+G, +I+R, or mixture models.

The fix uses scale_num to distinguish real data from gaps at internal
nodes: scaled sites (scale_num > 0) are always real data, while
all-gap subtrees are detected by sum > ncat_mix with no scaling.
Leaf nodes retain the original threshold. Both normal and
SAFE_NUMERIC scale_num layouts are handled.

Fixes #137

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roblanf
Copy link
Copy Markdown
Member Author

roblanf commented Mar 19, 2026

@bqminh can you take a look at this - would be great to get it into the latest binaries ASAP.

Copy link
Copy Markdown
Member

@bqminh bqminh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks a lot for finding this out!

@bqminh bqminh merged commit 499eb99 into master Mar 31, 2026
14 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

iqtree3 gives weird (incorrect?) sCF values compared to iqtree2

2 participants