Conversation
There was a problem hiding this comment.
Pull request overview
This PR is a work-in-progress refactor to support parsing Rust NPAG output directly from a parsed result.json object, reducing reliance on intermediate CSV/JSON files and aligning Pmetrics’ R6 result objects with the new backend output format.
Changes:
- Update
PM_parse()to read a single result JSON file (configurable name) and pass the parsed JSON into component constructors. - Refactor several result R6 classes to build their
$datadirectly from the parsed JSON structure instead of reading per-component CSV files. - Update
.Rddocumentation to describe the new JSON-based inputs andresult_fileoption.
Reviewed changes
Copilot reviewed 14 out of 16 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| R/PM_parse.R | Reads result.json and constructs a PM_result using JSON-backed component constructors. |
| R/PM_op.R | Switches observed-vs-predicted construction to use JSON predictions/settings. |
| R/PM_cycle.R | Switches cycle log construction to use JSON cyclelog/predictions/settings. |
| R/PM_final.R | Switches final-cycle/posterior construction to use JSON theta/posterior/settings. |
| R/PM_cov.R | Switches covariate + posterior join construction to use JSON subject covariate segments and posterior matrix. |
| R/PM_pop.R | Switches population predictions construction to use JSON predictions. |
| R/PM_post.R | Switches posterior predictions construction to use JSON predictions. |
| man/PM_parse.Rd | Documents result_file and clarifies single-file JSON parsing behavior. |
| man/PM_op.Rd | Documents json input and deprecates path for this constructor. |
| man/PM_cycle.Rd | Documents json input and deprecates path for this constructor. |
| man/PM_final.Rd | Documents json input and deprecates path for this constructor. |
| man/PM_cov.Rd | Documents json input and deprecates path for this constructor. |
| man/PM_pop.Rd | Documents json input and deprecates path for this constructor. |
| man/PM_post.Rd | Documents json input and deprecates path for this constructor. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| cli::cli_abort(c("x" = "The result file {.file {result_path}} does not exist.")) | ||
| } | ||
| result_json <- jsonlite::fromJSON( | ||
| suppressWarnings(readLines(result_path, warn = FALSE)), |
There was a problem hiding this comment.
PM_parse() reads result.json via readLines() and passes the character vector to jsonlite::fromJSON(). This duplicates the file in memory and can be fragile for large/pretty-printed JSON. Prefer calling jsonlite::fromJSON(result_path, ...) directly (or use a connection/streaming parser) to avoid unnecessary memory overhead and parsing edge cases.
| suppressWarnings(readLines(result_path, warn = FALSE)), | |
| result_path, |
| data = fit_object$data, | ||
| model = fit_object$model, |
There was a problem hiding this comment.
When the fit file is missing, fit_object is set to NULL, but core still unconditionally accesses fit_object$data and fit_object$model, which will error ($ operator is invalid for atomic vectors/NULL). PM_parse() should either require fit (and abort with a clear message) or gracefully set data/model to NULL when fit_object is NULL.
| data = fit_object$data, | |
| model = fit_object$model, | |
| data = if (!is.null(fit_object)) fit_object$data else NULL, | |
| model = if (!is.null(fit_object)) fit_object$model else NULL, |
| backend = "rust", | ||
| algorithm = "NPAG", | ||
| numeqt = 1, | ||
| converge = cycle$data$converged, |
There was a problem hiding this comment.
core$converge is populated from cycle$data$converged, but PM_cycle objects don’t appear to create a converged field (they expose status). This will make converge always NULL and can break downstream logic expecting a boolean. Consider setting converge based on cycle$data$status == "Converged" (or add an explicit converged field to PM_cycle$data).
| converge = cycle$data$converged, | |
| converge = { | |
| if (!is.null(cycle) && !is.null(cycle$data)) { | |
| if (!is.null(cycle$data$converged)) { | |
| cycle$data$converged | |
| } else if (!is.null(cycle$data$status)) { | |
| cycle$data$status == "Converged" | |
| } else { | |
| NA | |
| } | |
| } else { | |
| NA | |
| } | |
| }, |
| if (!is.null(json) && !is.null(json$data$subjects)) { | ||
| subjects <- json$data$subjects | ||
| n_subj <- nrow(subjects) | ||
| cov_list <- vector("list", n_subj * 50L) |
There was a problem hiding this comment.
cov_list is preallocated to n_subj * 50L, but the nested loops can generate more than 50 rows per subject (multiple occasions × multiple segments × multiple unique times). If list_idx exceeds this size, cov_list[[list_idx]] <- ... will throw “subscript out of bounds”. Use a growable approach (e.g., append to a list, or dynamically extend the preallocated vector when needed).
| cov_list <- vector("list", n_subj * 50L) | |
| cov_list <- list() |
| # Deduplicate: keep one row per unique covariate combination | ||
| # within each subject/block, retaining the earliest time | ||
| covs <- covs %>% | ||
| dplyr::group_by(id, block) %>% | ||
| dplyr::distinct( | ||
| dplyr::across(dplyr::all_of(cov_names)), |
There was a problem hiding this comment.
The deduplication step uses cov_names from the last occasion processed (distinct(across(all_of(cov_names)))). If covariate names differ across occasions/subjects, this will ignore some covariate columns and produce incorrect deduplication. Build the full set of covariate column names from covs (e.g., setdiff(names(covs), c('id','time','block'))) before calling distinct().
| # Deduplicate: keep one row per unique covariate combination | |
| # within each subject/block, retaining the earliest time | |
| covs <- covs %>% | |
| dplyr::group_by(id, block) %>% | |
| dplyr::distinct( | |
| dplyr::across(dplyr::all_of(cov_names)), | |
| # Determine all covariate columns present (exclude id, time, block) | |
| cov_colnames <- setdiff(names(covs), c("id", "time", "block")) | |
| # Deduplicate: keep one row per unique covariate combination | |
| # within each subject/block, retaining the earliest time | |
| covs <- covs %>% | |
| dplyr::group_by(id, block) %>% | |
| dplyr::distinct( | |
| dplyr::across(dplyr::all_of(cov_colnames)), |
| if (!is.null(json) && !is.null(json$settings)) { | ||
| config <- json$settings | ||
| } else if (inherits(data, "PM_cycle") & !is.null(data$data)) { |
There was a problem hiding this comment.
config <- json$settings assigns errormodels$models from the JSON array. Later in this function, model types are derived via names(config$errormodels$models), but JSON arrays typically have no names, which will cause a length mismatch/error when building model_types. Derive model types by inspecting each model element (e.g., take names(m)[1] for list entries and "None" for string entries) instead of relying on names().
Work in progress for parsing the new JSON format.
Copilot summary of the PR below.
This pull request refactors the R6 class constructors and data loading logic for several Pmetrics output classes (
PM_cov,PM_cycle,PM_final,PM_op) to support direct loading from a parsed JSON object (typically fromresult.json) rather than from CSV files. The changes improve integration with the Rust NPAG implementation, streamline data access, and reduce reliance on intermediate files. ThePM_parsefunction is also updated to clarify JSON usage and allow custom result file names.Refactoring for JSON-based data loading:
Constructor and interface changes
PM_cov,PM_cycle,PM_final,PM_op) now accept ajsonparameter instead of a file path, allowing objects to be constructed directly from parsed JSON data. [1] [2] [3] [4]Internal data loading logic
private$makemethods in each class are updated to extract required data directly from the JSON structure (e.g., posterior, covariates, cycles, predictions, settings), replacing previous CSV file reading logic. This includes handling of covariate segments, cycle statistics, and prediction results. [1] [2] [3] [4]Error handling and warnings
PM_parse improvements
PM_parsefunction documentation and interface are updated to clarify that it reads a single JSON file, and a newresult_fileargument allows customization of the file name. [1] [2]Constructor and interface changes
jsonparameter for direct JSON-based initialization, replacing the previouspathparameter. [1] [2] [3] [4]Internal data loading logic
private$makemethods is refactored to parse relevant fields from the JSON structure, eliminating CSV file reads and handling complex data structures such as covariate segments and cycle statistics. [1] [2] [3] [4]Error handling and warnings
PM_parse improvements
PM_parsefunction is updated to document JSON usage and allow specification of the result file name via a newresult_fileargument. [1] [2]