Skip to content

Assertion error in split_disjoint_nodes #497

@hyanwong

Description

@hyanwong

There's something odd going on with the logic in split_disjoint_nodes. In the following example I get an assertion error:

in split_disjoint_nodes(ts, record_provenance)
    583 tables.build_index()
    584 tables.compute_mutation_parents()
--> 586 assert np.array_equal(
    587     tables.nodes.time[tables.mutations.node], ts.nodes_time[ts.mutations_node]
    588 )

Here's the code to reproduce

import numpy as np
import stdpopsim
import tsinfer
import tsdate

species = stdpopsim.get_species('HomSap')
model = species.get_demographic_model('PapuansOutOfAfrica_10J19')
contig = species.get_contig('1', right=1000000, mutation_rate=model.mutation_rate)

engine = stdpopsim.get_engine('msprime')
samples = {
    "YRI": 20,
    "DenA": 2,
    "NeaA": 2
}

sim_ts = engine.simulate(model, contig, samples=samples, seed=123)
sd = tsinfer.SampleData.from_tree_sequence(ts)
sample_times = ts.nodes_time[ts.samples()]

ancestors = tsinfer.generate_ancestors(sd, progress_monitor=True)
a_ts = tsinfer.match_ancestors(sd, ancestors, progress_monitor=True)
inferred_ts = tsinfer.match_samples(
    sd,
    a_ts,
    indexes = np.where(sample_times == 0)[0],
    progress_monitor=True,
)
tsdate.preprocess_ts(inferred_ts)  # Assertion error

Have you any idea what's going on here @nspope ? I dug a bit but got confused by the flow of logic. It appears as if there are swaps in the node assignments, e.g.

Node 29 has time 0.0, but mutation 509 is assigned to node 885 with time 0.08333333333333333
Node 885 has time 0.08333333333333333, but mutation 510 is assigned to node 29 with time 0.0

It can be avoided by fully simplifying first:

tsdate.preprocess_ts(inferred_ts.simplify())

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions