Conversation
- Implement full step-by-step validation for A3 JSON format in a new diagnostic module. - Update error types to include serialization errors. - Modify main CLI to support a new `--diagnose` option for detailed validation output. - Improve metadata display logic to only show non-empty fields. - Refactor schema and version constants to be publicly accessible.
There was a problem hiding this comment.
Code Review
This pull request updates the rtemis.a3 package to version 0.5.2, introducing a centralized verbosity-based messaging system in R and a new diagnostic validation mode in Rust. Key improvements include enhanced CLI output with color support and updated documentation. Feedback highlights several critical bugs in the R implementation, such as syntax errors from missing argument values and typos in parameter names. In the Rust code, the use of the unstable let_chains feature needs to be addressed for stable compatibility, and a potential panic in the diagnostic logic should be replaced with proper error handling.
| if let Some(len) = seq_len | ||
| && pos > len | ||
| { | ||
| errors.push(format!( | ||
| "{field}.position: {pos} is out of bounds \ | ||
| for sequence of length {len} (must be 1–{len})" | ||
| )); | ||
| } |
There was a problem hiding this comment.
The use of let_chains (combining if let with other boolean conditions using &&) is currently an unstable feature in Rust. Unless you are using a nightly compiler and have enabled the feature gate, this will not compile. It can be rewritten using nested if statements for compatibility with stable Rust.
| if let Some(len) = seq_len | |
| && pos > len | |
| { | |
| errors.push(format!( | |
| "{field}.position: {pos} is out of bounds \ | |
| for sequence of length {len} (must be 1–{len})" | |
| )); | |
| } | |
| if let Some(len) = seq_len { | |
| if pos > len { | |
| errors.push(format!( | |
| "{field}.position: {pos} is out of bounds \ | |
| for sequence of length {len} (must be 1–{len})" | |
| )); | |
| } | |
| } |
| if let Some(v) = meta.get(key) | ||
| && !v.is_string() | ||
| { |
| if let Some(v) = entry.get("type") | ||
| && !v.is_string() | ||
| { |
| // All diagnostic checks passed — use the standard fast path to build | ||
| // a validated A3. This should never fail: if it does, the diagnostic | ||
| // checks have a gap that needs fixing. | ||
| Ok(a3_from_json(text).expect("diagnostic passed but standard parse failed")) |
There was a problem hiding this comment.
Using expect() can cause a panic if a3_from_json fails. While the diagnostic checks are intended to be exhaustive, it is safer to return the error as a result in a library context.
| Ok(a3_from_json(text).expect("diagnostic passed but standard parse failed")) | |
| a3_from_json(text).map_err(|e| vec![e.to_string()]) |
There was a problem hiding this comment.
Pull request overview
Adds a Rust “diagnostic mode” validation path and overhauls CLI output formatting/styling, while also updating the R package for a CRAN-oriented release (docs/metadata/logging).
Changes:
- Introduce step-by-step diagnostic validation spec + Rust implementation and a
-D/--diagnoseCLI flag. - Revise Rust CLI human output layout (tree connectors, colors, wrapping) and expose schema/version constants for reuse.
- Update R package metadata/docs for CRAN and refactor message verbosity handling across several utilities.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| specs/diagnostic.md | New spec describing diagnostic validation steps. |
| specs/cli.md | Updates CLI output examples, defaults, and styling notes. |
| rust/src/types.rs | Exposes A3_SCHEMA_URI / A3_VERSION as public constants. |
| rust/src/main.rs | Adds --diagnose, colorized/tree output, wrapping helpers, and improved parse error reporting. |
| rust/src/lib.rs | Re-exports schema/version constants; comment/doc tweaks. |
| rust/src/error.rs | Updates top-level error docs to include serialization. |
| rust/src/diagnostic.rs | New diagnostic validator that accumulates errors by step. |
| rust/README.md | Updates Rust crate/CLI README examples and option defaults. |
| rust/Cargo.toml | Adds colored dependency for styled CLI output. |
| r/README.md | Refreshes installation/read-write JSON guidance. |
| r/R/utils_uniprot.R | Changes verbosity/message behavior in UniProt helpers. |
| r/R/utils_pdb.R | Changes verbosity/message behavior in PDB helpers. |
| r/R/utils_clinvar.R | Refactors messaging to use verbosity-aware msg() calls. |
| r/R/utils_a3.R | Refactors amino-acid substitution logging to use msg(). |
| r/R/rtemis_color_system.R | Removes large roxygen block; keeps color list definition. |
| r/R/gene2sequence.R | Adjusts example name and messaging around transcript retrieval. |
| r/R/0_init.R | Extends msg() / msg0() with verbosity parameter and adjusts examples/docs. |
| r/NEWS.md | Adds NEWS file for CRAN-style changelog. |
| r/man/rtemis.a3-package.Rd | Updates package URL/author annotations. |
| r/man/gene2sequence.Rd | Updates example variable name. |
| r/DESCRIPTION | Bumps version/date and expands description/URL; adjusts suggests. |
| r/cran-comments.md | Adds CRAN submission comments file. |
| r/.Rbuildignore | Stops ignoring NEWS so it ships with the package. |
| .gitignore | Adds/normalizes ignored agent-related files. |
Comments suppressed due to low confidence (1)
specs/cli.md:88
- The CLI now supports
-D, --diagnose(seerust/src/main.rs), but the Options list here does not document it. Add the flag (and describe how it differs from the default validation path, including exit-code behavior for parse errors vs validation errors).
## Options
- `<FILE>`: Path to the `.json` file to validate. Use `-` for stdin.
- `-l, --limit <NUMBER>`: Limit the number of sequence residues displayed (default: 20)
- `-q, --quiet`: Suppress all output; use exit code only
- `-j, --json`: Output results in JSON format
- `-h, --help`: Print help information
- `-V, --version`: Print version information
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 3. **Top-level field presence, types, and no unknown keys** [fatal per field] | ||
| Check that `sequence` (string), `annotations` (object), and `metadata` | ||
| (object) are present and of the correct type, and that no unknown top-level |
There was a problem hiding this comment.
Step 3 says annotations and metadata must be present, but the Rust A3 type treats both as optional (they default to empty via #[serde(default)]). Either update the diagnostic spec to reflect that omission is allowed, or change the implementations/schema so presence is actually required.
| 3. **Top-level field presence, types, and no unknown keys** [fatal per field] | |
| Check that `sequence` (string), `annotations` (object), and `metadata` | |
| (object) are present and of the correct type, and that no unknown top-level | |
| 3. **Top-level fields, types, and no unknown keys** [fatal per field] | |
| Check that `sequence` is present and is a string; `annotations` and | |
| `metadata` may be omitted, but if present must be objects (if omitted, | |
| they are treated as empty objects). Also check that no unknown top-level |
| On failure, errors are listed first (with tree connectors), followed by | ||
| whatever metadata and stats are available from the partial parse: | ||
|
|
||
| ``` | ||
| ✗ invalid: | ||
| - annotations.site.foo: position 999 is out of bounds for sequence of length 6 (must be 1–6) | ||
| - annotations.region: annotation name must not be empty | ||
| ✗ invalid | ||
|
|
||
| ├── annotations.site.foo: position 999 is out of bounds for sequence of length 6 (must be 1–6) | ||
| └── annotations.region: annotation name must not be empty | ||
|
|
There was a problem hiding this comment.
The failure example shows an unindented status line (✗ invalid), but the Styling section says the status line is indented like all other output, and the Rust implementation prints it with leading spaces. Update the example (or the implementation) so they match.
| | Flag | Description | | ||
| | ----------------- | ---------------------------------------------- | | ||
| | `-l, --limit <N>` | Max sequence residues to display (default: 20) | | ||
| | `-q, --quiet` | Suppress all output; use exit code only | | ||
| | `-j, --json` | Output results in JSON format | | ||
| | `-h, --help` | Print help | | ||
| | `-V, --version` | Print version | |
There was a problem hiding this comment.
The CLI options table is missing the newly added -D, --diagnose flag. Add it here (and briefly explain what diagnostic mode does) so README stays in sync with the actual CLI.
| | Flag | Description | | |
| | ----------------- | ---------------------------------------------- | | |
| | `-l, --limit <N>` | Max sequence residues to display (default: 20) | | |
| | `-q, --quiet` | Suppress all output; use exit code only | | |
| | `-j, --json` | Output results in JSON format | | |
| | `-h, --help` | Print help | | |
| | `-V, --version` | Print version | | |
| | Flag | Description | | |
| | --------------------- | ---------------------------------------------- | | |
| | `-l, --limit <N>` | Max sequence residues to display (default: 20) | | |
| | `-q, --quiet` | Suppress all output; use exit code only | | |
| | `-j, --json` | Output results in JSON format | | |
| | `-D, --diagnose` | Enable diagnostic mode for additional details | | |
| | `-h, --help` | Print help | | |
| | `-V, --version` | Print version | |
| ``` | ||
| $ a3 tau.json | ||
| ✓ valid A3 schema version 1.0.0 (https://schema.rtemis.org/a3/v1/schema.json) | ||
| UniProt ID: P10636 | ||
| Description: Microtubule-associated protein tau | ||
| Reference: | ||
| Organism: Homo sapiens | ||
| Sequence: MAEPRQEFEV... (758) | ||
| Annotations: site: 2 region: 1 ptm: 3 processing: 0 variant: 5 | ||
| ✓ valid A3 1.0.0 https://schema.rtemis.org/a3/v1/schema.json | ||
|
|
||
| Sequence MAEPRQEFEVMEDHAGTYGL… (length = 441) | ||
|
|
There was a problem hiding this comment.
The README output examples don’t match the current print_human() formatting: the implementation indents the status line and prints the version as {A3 <version>}. Either adjust the examples to match the real output, or update print_human() to match the documented output.
No description provided.