Skip to content

fix: ONT amplicon pipeline uses dedicated ont_amplicon_params config (#79)#94

Merged
berntpopp merged 4 commits into
mainfrom
fix/79-ont-model-config
Apr 7, 2026
Merged

fix: ONT amplicon pipeline uses dedicated ont_amplicon_params config (#79)#94
berntpopp merged 4 commits into
mainfrom
fix/79-ont-model-config

Conversation

@berntpopp
Copy link
Copy Markdown
Owner

Summary

The ONT amplicon pipeline was reading model_file from pacbio_params, which defaults to ERRHMM-SEQUEL.model (a PacBio Sequel error model). ONT reads were being simulated with PacBio error profiles. The v0.43.5 "fix" only added a runtime warning but still used the wrong model.

Root cause

No ont_amplicon_params config section existed. The ONT pipeline shared pacbio_params with PacBio, and the CLI wrote --model-file overrides to pacbio_params for both platforms.

Fix

  • Add ont_amplicon_params config section with ERRHMM-ONT.model as default
  • Add OntAmpliconConfig TypedDict in type_defs.py
  • ONT pipeline reads from ont_amplicon_params instead of pacbio_params
  • CLI routes --model-file to ont_amplicon_params for --platform ont, pacbio_params for PacBio
  • Config schema validation and path resolution for ont_amplicon_params

Config change

"ont_amplicon_params": {
    "model_type": "errhmm",
    "model_file": "reference/pbsim3/ERRHMM-ONT.model",
    "threads": 8,
    "accuracy_mean": 0.95
}

Test plan

  • 1,431 tests pass (89% coverage)
  • ruff clean, mypy clean
  • New test: ONT platform populates ont_amplicon_params (not pacbio_params)
  • Existing ONT pipeline tests updated to use ont_amplicon_params
  • Documentation updated (amplicon simulation guide, changelog)
  • CI passes

Closes #79

🤖 Generated with Claude Code

…79)

The ONT amplicon pipeline was reading model_file from pacbio_params,
which defaults to ERRHMM-SEQUEL.model (a PacBio Sequel model). The
v0.43.5 "fix" only added a runtime warning but still used the wrong
model.

Changes:
- Add ont_amplicon_params config section with ERRHMM-ONT.model default
- Add OntAmpliconConfig TypedDict for type-safe parameter access
- ONT pipeline reads from ont_amplicon_params (not pacbio_params)
- CLI routes --model-file to ont_amplicon_params for --platform ont
- Add _apply_ont_amplicon_params() function in CLI
- Config schema validation and path resolution for ont_amplicon_params
- Update docs with ONT configuration section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 7, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 69.69697% with 10 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
muc_one_up/cli/commands/reads.py 47.36% 4 Missing and 6 partials ⚠️

📢 Thoughts on this report? Let us know!

- Log when default ONT model is applied (visibility for missing config)
- Consistent threads default: 8 in code, config, and docs (was 4 in code/docs)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes ONT amplicon simulation incorrectly using PacBio error-model configuration by introducing a dedicated ont_amplicon_params config section and routing ONT-specific CLI/pipeline behavior through it.

Changes:

  • Added ont_amplicon_params to the config schema/default config, plus a new OntAmpliconConfig TypedDict.
  • Updated the ONT amplicon pipeline to read model settings from ont_amplicon_params (with ONT defaults) instead of pacbio_params.
  • Updated CLI and tests/docs so --platform ont populates/uses ont_amplicon_params.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/read_simulator/test_ont_amplicon_pipeline.py Updates fixtures/tests to use ont_amplicon_params instead of pacbio_params.
tests/cli/test_amplicon_platform.py Adds CLI test asserting ONT platform populates ont_amplicon_params.
muc_one_up/type_defs.py Adds OntAmpliconConfig TypedDict for ONT amplicon params.
muc_one_up/read_simulator/ont_amplicon_pipeline.py Switches ONT pipeline parameter source to ont_amplicon_params and adjusts coverage/model handling.
muc_one_up/config.py Adds JSON schema + path resolution support for ont_amplicon_params.model_file.
muc_one_up/cli/commands/reads.py Routes --model-file/--model-type/--seed overrides to ont_amplicon_params when --platform ont.
docs/guides/amplicon-simulation.md Documents the new ONT config section and defaults.
docs/about/changelog.md Adds changelog entries for the fix/new config section.
config.json Adds default ont_amplicon_params section with ONT model defaults.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +85 to +89
model_type = ont_params.get("model_type", "errhmm")
model_file = ont_params.get("model_file", "reference/pbsim3/ERRHMM-ONT.model")
threads = ont_params.get("threads", 8)
seed = ont_params.get("seed")
accuracy_mean = ont_params.get("accuracy_mean", 0.95)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ont_params.get(...) will return None if the key exists but is explicitly set to null (possible when using load_config_raw, which skips schema validation). That would later break Path(model_file) and also pass None into run_pbsim3_template_simulation where accuracy_mean is formatted as a float. Consider normalizing here (e.g., treat None as missing and fall back to defaults) or raise a clear error if model_type/model_file/accuracy_mean are present but null/invalid.

Copilot uses AI. Check for mistakes.
Comment thread muc_one_up/cli/commands/reads.py Outdated
Comment on lines +165 to +169
# Default to ERRHMM-ONT.model if no model configured
if "model_file" not in params:
params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model"
logging.info("Using default ONT model: %s", params["model_file"])
if "model_type" not in params:
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The defaults here only apply when the key is absent, not when it is present-but-null (e.g., {"ont_amplicon_params": {"model_file": null}} from a partial config loaded via load_config_raw). In that case the pipeline will later crash when it tries to use None as a path/string. Consider treating None the same as missing (or validating and raising a click.ClickException), similar to how _apply_pacbio_params validates required params.

Suggested change
# Default to ERRHMM-ONT.model if no model configured
if "model_file" not in params:
params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model"
logging.info("Using default ONT model: %s", params["model_file"])
if "model_type" not in params:
# Default to ERRHMM-ONT.model if no model configured. Treat explicit
# nulls from partially loaded config the same as missing values.
if params.get("model_file") is None:
params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model"
logging.info("Using default ONT model: %s", params["model_file"])
if params.get("model_type") is None:

Copilot uses AI. Check for mistakes.
Comment thread muc_one_up/cli/commands/reads.py Outdated
Comment on lines +149 to +151

Raises:
click.ClickException: If required params are missing after merge.
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring says _apply_ont_amplicon_params raises click.ClickException when required params are missing, but the function currently never raises (it silently fills defaults). Either remove/update the "Raises" section, or add explicit validation for invalid/None values so callers get a consistent, actionable CLI error (matching _apply_pacbio_params behavior).

Suggested change
Raises:
click.ClickException: If required params are missing after merge.
Missing model settings are populated with defaults if not provided.

Copilot uses AI. Check for mistakes.
Comment thread muc_one_up/config.py Outdated
"threads": {"type": "number", "minimum": 1},
"seed": {"type": ["number", "null"]},
"accuracy_mean": {
"type": ["number", "null"],
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accuracy_mean is allowed to be null by the schema, but the ONT amplicon pipeline passes this value directly into run_pbsim3_template_simulation, which expects a float and formats it with %.2f. Either disallow null here (use just "type": "number") or ensure the pipeline coerces null to the default before invoking pbsim3.

Suggested change
"type": ["number", "null"],
"type": "number",

Copilot uses AI. Check for mistakes.
berntpopp and others added 2 commits April 7, 2026 14:36
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Treat None as missing in ONT pipeline (use `or` instead of default arg)
- Fix docstring: remove incorrect Raises section, document fallback behavior
- CLI: use `params.get()` truthy check to handle explicit null values
- Schema: disallow null for accuracy_mean (must be a number)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@berntpopp berntpopp merged commit cab39ad into main Apr 7, 2026
6 checks passed
@berntpopp berntpopp deleted the fix/79-ont-model-config branch April 7, 2026 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ONT amplicon pipeline uses PacBio model file instead of ONT model

3 participants