fix: ONT amplicon pipeline uses dedicated ont_amplicon_params config (#79)#94
Conversation
…79) The ONT amplicon pipeline was reading model_file from pacbio_params, which defaults to ERRHMM-SEQUEL.model (a PacBio Sequel model). The v0.43.5 "fix" only added a runtime warning but still used the wrong model. Changes: - Add ont_amplicon_params config section with ERRHMM-ONT.model default - Add OntAmpliconConfig TypedDict for type-safe parameter access - ONT pipeline reads from ont_amplicon_params (not pacbio_params) - CLI routes --model-file to ont_amplicon_params for --platform ont - Add _apply_ont_amplicon_params() function in CLI - Config schema validation and path resolution for ont_amplicon_params - Update docs with ONT configuration section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
- Log when default ONT model is applied (visibility for missing config) - Consistent threads default: 8 in code, config, and docs (was 4 in code/docs) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes ONT amplicon simulation incorrectly using PacBio error-model configuration by introducing a dedicated ont_amplicon_params config section and routing ONT-specific CLI/pipeline behavior through it.
Changes:
- Added
ont_amplicon_paramsto the config schema/default config, plus a newOntAmpliconConfigTypedDict. - Updated the ONT amplicon pipeline to read model settings from
ont_amplicon_params(with ONT defaults) instead ofpacbio_params. - Updated CLI and tests/docs so
--platform ontpopulates/usesont_amplicon_params.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/read_simulator/test_ont_amplicon_pipeline.py | Updates fixtures/tests to use ont_amplicon_params instead of pacbio_params. |
| tests/cli/test_amplicon_platform.py | Adds CLI test asserting ONT platform populates ont_amplicon_params. |
| muc_one_up/type_defs.py | Adds OntAmpliconConfig TypedDict for ONT amplicon params. |
| muc_one_up/read_simulator/ont_amplicon_pipeline.py | Switches ONT pipeline parameter source to ont_amplicon_params and adjusts coverage/model handling. |
| muc_one_up/config.py | Adds JSON schema + path resolution support for ont_amplicon_params.model_file. |
| muc_one_up/cli/commands/reads.py | Routes --model-file/--model-type/--seed overrides to ont_amplicon_params when --platform ont. |
| docs/guides/amplicon-simulation.md | Documents the new ONT config section and defaults. |
| docs/about/changelog.md | Adds changelog entries for the fix/new config section. |
| config.json | Adds default ont_amplicon_params section with ONT model defaults. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| model_type = ont_params.get("model_type", "errhmm") | ||
| model_file = ont_params.get("model_file", "reference/pbsim3/ERRHMM-ONT.model") | ||
| threads = ont_params.get("threads", 8) | ||
| seed = ont_params.get("seed") | ||
| accuracy_mean = ont_params.get("accuracy_mean", 0.95) |
There was a problem hiding this comment.
ont_params.get(...) will return None if the key exists but is explicitly set to null (possible when using load_config_raw, which skips schema validation). That would later break Path(model_file) and also pass None into run_pbsim3_template_simulation where accuracy_mean is formatted as a float. Consider normalizing here (e.g., treat None as missing and fall back to defaults) or raise a clear error if model_type/model_file/accuracy_mean are present but null/invalid.
| # Default to ERRHMM-ONT.model if no model configured | ||
| if "model_file" not in params: | ||
| params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model" | ||
| logging.info("Using default ONT model: %s", params["model_file"]) | ||
| if "model_type" not in params: |
There was a problem hiding this comment.
The defaults here only apply when the key is absent, not when it is present-but-null (e.g., {"ont_amplicon_params": {"model_file": null}} from a partial config loaded via load_config_raw). In that case the pipeline will later crash when it tries to use None as a path/string. Consider treating None the same as missing (or validating and raising a click.ClickException), similar to how _apply_pacbio_params validates required params.
| # Default to ERRHMM-ONT.model if no model configured | |
| if "model_file" not in params: | |
| params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model" | |
| logging.info("Using default ONT model: %s", params["model_file"]) | |
| if "model_type" not in params: | |
| # Default to ERRHMM-ONT.model if no model configured. Treat explicit | |
| # nulls from partially loaded config the same as missing values. | |
| if params.get("model_file") is None: | |
| params["model_file"] = "reference/pbsim3/ERRHMM-ONT.model" | |
| logging.info("Using default ONT model: %s", params["model_file"]) | |
| if params.get("model_type") is None: |
|
|
||
| Raises: | ||
| click.ClickException: If required params are missing after merge. |
There was a problem hiding this comment.
This docstring says _apply_ont_amplicon_params raises click.ClickException when required params are missing, but the function currently never raises (it silently fills defaults). Either remove/update the "Raises" section, or add explicit validation for invalid/None values so callers get a consistent, actionable CLI error (matching _apply_pacbio_params behavior).
| Raises: | |
| click.ClickException: If required params are missing after merge. | |
| Missing model settings are populated with defaults if not provided. |
| "threads": {"type": "number", "minimum": 1}, | ||
| "seed": {"type": ["number", "null"]}, | ||
| "accuracy_mean": { | ||
| "type": ["number", "null"], |
There was a problem hiding this comment.
accuracy_mean is allowed to be null by the schema, but the ONT amplicon pipeline passes this value directly into run_pbsim3_template_simulation, which expects a float and formats it with %.2f. Either disallow null here (use just "type": "number") or ensure the pipeline coerces null to the default before invoking pbsim3.
| "type": ["number", "null"], | |
| "type": "number", |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Treat None as missing in ONT pipeline (use `or` instead of default arg) - Fix docstring: remove incorrect Raises section, document fallback behavior - CLI: use `params.get()` truthy check to handle explicit null values - Schema: disallow null for accuracy_mean (must be a number) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
The ONT amplicon pipeline was reading
model_filefrompacbio_params, which defaults toERRHMM-SEQUEL.model(a PacBio Sequel error model). ONT reads were being simulated with PacBio error profiles. The v0.43.5 "fix" only added a runtime warning but still used the wrong model.Root cause
No
ont_amplicon_paramsconfig section existed. The ONT pipeline sharedpacbio_paramswith PacBio, and the CLI wrote--model-fileoverrides topacbio_paramsfor both platforms.Fix
ont_amplicon_paramsconfig section withERRHMM-ONT.modelas defaultOntAmpliconConfigTypedDict intype_defs.pyont_amplicon_paramsinstead ofpacbio_params--model-filetoont_amplicon_paramsfor--platform ont,pacbio_paramsfor PacBioont_amplicon_paramsConfig change
Test plan
ont_amplicon_params(notpacbio_params)ont_amplicon_paramsCloses #79
🤖 Generated with Claude Code