Conversation
This commit adds two related features: ## NormalizedRow API (parser/) - New struct for format-agnostic data parsing - Decouples column mapping from event creation logic - Full ADDL/II expansion support (both positive and negative directions) - Refactors pmetrics.rs to use NormalizedRow internally - Enables external tools (like vial) to reuse parsing logic without reimplementing ADDL expansion ## ResidualErrorModel (data/) - New for parametric algorithms (SAEM, FOCE) - Uses prediction-based sigma (vs observation-based in ErrorModel) - Adds and functions - Documentation clarifying ErrorModel vs ResidualErrorModel usage Both features are independent but included together to avoid merge conflicts.
* nca * wip: current version * feat: nca * clenup * chore: documentation * chore: cleanup * chore: cleanup * chore: deprecating ErrorModel in favor of AssayErrorModel, subdividing the likelihood module and deprecating linear space likelihood calculation functions * feat: the Data parsing is centraliced to NormalizedRow * feat: the ErrorModel -> AssayErrorModel * feat: validation * chore: cleanup * chore: cleanup
Co-authored-by: Markus Hovd <markushh@uio.no>
… require extra size
* chore: Rename modules and structures * Update src/error/mod.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/data/parser/mod.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Name changes --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
| Branch | without-nca |
| Testbed | mhovd-pgx |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %) |
|---|---|---|---|
| Analytical vs ODE/One-compartment IV/Analytical | 📈 view plot 🚷 view threshold | 832.28 ns(-4.10%)Baseline: 867.89 ns | 2,001.18 ns (41.59%) |
| Analytical vs ODE/One-compartment IV/ODE | 📈 view plot 🚷 view threshold | 17,837.00 ns(+0.58%)Baseline: 17,733.50 ns | 21,026.92 ns (84.83%) |
| Analytical vs ODE/One-compartment oral/Analytical | 📈 view plot 🚷 view threshold | 852.12 ns(-3.21%)Baseline: 880.37 ns | 1,779.14 ns (47.90%) |
| Analytical vs ODE/One-compartment oral/ODE | 📈 view plot 🚷 view threshold | 25,899.00 ns(+0.47%)Baseline: 25,777.50 ns | 29,643.69 ns (87.37%) |
| Analytical vs ODE/Two-compartment IV/Analytical | 📈 view plot 🚷 view threshold | 921.80 ns(-3.22%)Baseline: 952.45 ns | 1,927.75 ns (47.82%) |
| Analytical vs ODE/Two-compartment IV/ODE | 📈 view plot 🚷 view threshold | 26,329.00 ns(-0.03%)Baseline: 26,336.50 ns | 26,575.15 ns (99.07%) |
| Analytical vs ODE/Two-compartment oral/Analytical | 📈 view plot 🚷 view threshold | 976.92 ns(-3.53%)Baseline: 1,012.71 ns | 2,151.57 ns (45.41%) |
| Analytical vs ODE/Two-compartment oral/ODE | 📈 view plot 🚷 view threshold | 29,311.00 ns(+0.43%)Baseline: 29,186.50 ns | 33,148.15 ns (88.42%) |
| Conditional dose modification | 📈 view plot 🚷 view threshold | 1,235.70 ns(+1.57%)Baseline: 1,216.65 ns | 1,822.83 ns (67.79%) |
| Create large dataset (100 subjects) | 📈 view plot 🚷 view threshold | 57,646.00 ns(+3.27%)Baseline: 55,819.50 ns | 113,939.67 ns (50.59%) |
| Data expand complex (1h intervals) | 📈 view plot 🚷 view threshold | 27,632.00 ns(+1.14%)Baseline: 27,320.50 ns | 37,232.59 ns (74.21%) |
| Data expand simple (1h intervals) | 📈 view plot 🚷 view threshold | 495.96 ns(+1.49%)Baseline: 488.65 ns | 721.10 ns (68.78%) |
| Data expand with additional time | 📈 view plot 🚷 view threshold | 37,706.00 ns(-0.02%)Baseline: 37,714.50 ns | 37,984.97 ns (99.27%) |
| Filter exclude subjects | 📈 view plot 🚷 view threshold | 30,742.00 ns(-0.56%)Baseline: 30,915.00 ns | 36,419.95 ns (84.41%) |
| Filter include subjects | 📈 view plot 🚷 view threshold | 7,892.80 ns(-1.28%)Baseline: 7,994.80 ns | 11,240.49 ns (70.22%) |
| Modify all bolus doses | 📈 view plot 🚷 view threshold | 1,208.50 ns(+1.97%)Baseline: 1,185.20 ns | 1,926.62 ns (62.73%) |
| Modify all infusion doses | 📈 view plot 🚷 view threshold | 1,248.30 ns(+1.71%)Baseline: 1,227.35 ns | 1,893.99 ns (65.91%) |
| SubjectBuilder multi-occasion | 📈 view plot 🚷 view threshold | 264.94 ns(+0.60%)Baseline: 263.36 ns | 313.79 ns (84.43%) |
| SubjectBuilder simple | 📈 view plot 🚷 view threshold | 105.12 ns(+0.79%)Baseline: 104.30 ns | 130.55 ns (80.52%) |
| SubjectBuilder with covariates | 📈 view plot 🚷 view threshold | 268.25 ns(-0.54%)Baseline: 269.71 ns | 316.17 ns (84.84%) |
| one_compartment | 📈 view plot 🚷 view threshold | 19,804.00 ns(+0.70%)Baseline: 19,666.00 ns | 24,057.23 ns (82.32%) |
| one_compartment_covariates | 📈 view plot 🚷 view threshold | 26,072.00 ns(+0.57%)Baseline: 25,923.00 ns | 30,664.26 ns (85.02%) |
| readme 20 | 📈 view plot 🚷 view threshold | 309,190.00 ns(+1.68%)Baseline: 304,080.00 ns | 466,682.84 ns (66.25%) |
| two_compartment | 📈 view plot 🚷 view threshold | 21,811.00 ns(+0.28%)Baseline: 21,751.00 ns | 23,660.23 ns (92.18%) |
There was a problem hiding this comment.
Pull request overview
This PR refactors the error model and likelihood calculation system, introducing:
- Renamed types:
ErrorModel→AssayErrorModel,ErrorModels→AssayErrorModels - New
ResidualErrorModelsfor parametric algorithms (SAEM, FOCE) - New
DataRowabstraction for flexible data parsing - Reorganized likelihood calculation module with improved numerical stability
- Shift from
estimate_likelihood()toestimate_log_likelihood()for better numerical stability
Changes:
- Refactored error models to distinguish assay error (observation-based) from residual error (prediction-based)
- Introduced new DataRow parsing layer with ADDL expansion and format-agnostic design
- Reorganized likelihood module into separate submodules (distributions, prediction, subject, matrix)
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/data/row.rs |
New DataRow abstraction for parsing with ADDL/II support (NEW FILE - 876 lines) |
src/data/residual_error.rs |
New residual error models for parametric algorithms (NEW FILE - 519 lines) |
src/data/error_model.rs |
Renamed ErrorModel → AssayErrorModel with deprecation aliases |
src/simulator/likelihood/*.rs |
Refactored into modular structure (distributions, prediction, subject, matrix) |
src/data/parser/pmetrics.rs |
Simplified to use new DataRow parsing logic |
tests/*.rs |
Updated to use new API (log_likelihood, AssayErrorModel) |
examples/one_compartment.rs |
Updated to use new API |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| /// Get a mutable reference to the compartment number that receives the bolus | ||
| /// Get a mutable reference to the compartment number (1-indexed) that receives the bolus |
There was a problem hiding this comment.
The comment on line 135 incorrectly states "Get a mutable reference to the compartment number (1-indexed)" but this conflicts with how the value is actually used. Throughout the codebase, bolus.input() is used directly as an array index (see src/simulator/equation/ode/mod.rs:274 where bolus_v[bolus.input()] = bolus.amount()). This comment should either be corrected to remove "(1-indexed)" or the entire indexing scheme needs to be fixed.
| /// Get a mutable reference to the compartment number (1-indexed) that receives the bolus | |
| /// Get a mutable reference to the compartment number that receives the bolus |
| } | ||
|
|
||
| /// Set the compartment number (zero-indexed) that receives the infusion | ||
| /// Get a mutable reference to the compartment number (1-indexed) that receives the infusion |
There was a problem hiding this comment.
Similar to the Bolus comment issue, line 233 incorrectly states "(1-indexed)" for the infusion input compartment. The value is used directly as an array index throughout the codebase and should be 0-indexed. This documentation is inconsistent with the actual usage.
| /// Get a mutable reference to the compartment number (1-indexed) that receives the infusion | |
| /// Get a mutable reference to the compartment number (0-indexed) that receives the infusion |
| .ok_or_else(|| DataError::MissingObservationOuteq { | ||
| id: self.id.clone(), | ||
| time: self.time, | ||
| })?, // Keep 1-indexed as provided by Pmetrics |
There was a problem hiding this comment.
Critical indexing inconsistency detected. The new documentation states that input and outeq are kept as 1-indexed values (comment on line 190: "Keep 1-indexed as provided by Pmetrics"), but the code at src/simulator/equation/ode/mod.rs:274 uses bolus.input() directly as an array index: bolus_v[bolus.input()] = bolus.amount().
If input is truly 1-indexed, then:
- input=1 would index bolus_v[1] (second element) instead of bolus_v[0] (first element)
- This would skip the first compartment and cause an off-by-one error
Similarly, at src/simulator/equation/mod.rs:134: x.add_bolus(bolus.input(), bolus.amount()) would be affected.
The old Pmetrics parser converted these to 0-indexed (subtracting 1), but the new DataRow implementation keeps them as-is. This breaks the internal contract that these indices are 0-indexed for array access.
Either:
- The documentation comments are wrong and values should remain 0-indexed internally, OR
- All array indexing code needs to subtract 1 before use
This is a breaking change that will cause incorrect dosing/observation assignment.
| })?, // Keep 1-indexed as provided by Pmetrics | ||
| self.get_errorpoly(), | ||
| 0, // occasion set later | ||
| self.cens.unwrap_or(Censor::None), | ||
| ))); | ||
| } | ||
| 1 | 4 => { | ||
| // Dosing event (1) or reset with dose (4) | ||
|
|
||
| let input = self.input.ok_or_else(|| DataError::MissingBolusInput { | ||
| id: self.id.clone(), | ||
| time: self.time, | ||
| })?; // Keep 1-indexed as provided by Pmetrics |
There was a problem hiding this comment.
The same indexing inconsistency affects outeq. The code uses observation.outeq() directly as an array index (e.g., src/simulator/equation/ode/mod.rs:324: let pred = y_out[observation.outeq()];), but the new documentation states that outeq is kept as 1-indexed.
This will cause observations to be matched with the wrong output equation:
- outeq=1 would index y_out[1] (second element) instead of y_out[0] (first element)
The old Pmetrics parser subtracted 1 to convert to 0-indexed, but the new DataRow implementation keeps them 1-indexed, breaking array access throughout the codebase.
No description provided.