45 handling of relative errors #46
archaeothommy
left a comment
@Abagna123 Maybe it is easier this way to track down what has to be changed.
In addition to the comments:
- Look into `sapply()` for e.g. retrieving the column attribute for each column. For example, `any(sapply(colnames(df), is_err_percent))` tells you if at least one column in an ASTR dataset is a relative error.
- Try to find a way to avoid the loop. The functions we worked on before should provide some good ideas.
- Provide tests and examples.
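
A minimal sketch of the `sapply()` idea from the first point, using only base R. `is_err_percent_demo()` is a hypothetical stand-in written for this example (it only checks whether a column name contains `_err` and ends in `%`); the package's own `is_err_percent()` may apply different rules. `vapply()` is used here as the type-stable sibling of `sapply()`.

```r
# Hypothetical stand-in for is_err_percent(); assumption: relative error
# columns are named "<analyte>_err<type>%".
is_err_percent_demo <- function(col_name) {
  grepl("_err.*%$", col_name)
}

# Made-up toy data; check.names = FALSE keeps the "%" in the column name
df <- data.frame(
  ID = c("a", "b"),
  SiO2 = c(52.1, 48.7),
  `SiO2_errSD%` = c(4.3, 3.9),
  check.names = FALSE
)

# One logical per column name, without an explicit loop
rel_flags <- vapply(colnames(df), is_err_percent_demo, logical(1))

# Is at least one column a relative error?
any(rel_flags)

# Which columns are relative errors?
names(rel_flags)[rel_flags]
```

The logical vector can be reused for subsetting, so the same call answers both "is there anything to convert?" and "which columns?".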
    # Find all error columns
    error_cols <- get_error_columns(df)

    if (length(error_cols) == 0) {
@Abagna123 This test will never fail because error_cols will always include the ID columns.
Before this check, remove columns with absolute errors. Look into the functions "is_err_abs()" and "is_err_percent()" for this.
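
The filtering suggested above could look roughly like this. Both predicates are hypothetical stand-ins defined only for the sketch (the real `is_err_abs()` and `is_err_percent()` may inspect units or attributes rather than names), and `error_cols` is a made-up vector of column names.

```r
# Hypothetical stand-ins; assumed naming scheme: "<analyte>_err<type>" with a
# trailing "%" marking relative errors.
is_err_percent_demo <- function(col_name) grepl("_err.*%$", col_name)
is_err_abs_demo <- function(col_name) {
  grepl("_err", col_name, fixed = TRUE) && !is_err_percent_demo(col_name)
}

# Made-up candidate columns, including an ID column that is not an error
error_cols <- c("ID", "SiO2_errSD", "SiO2_errSD%", "FeO_err2SD%")

# Split the candidates before testing for emptiness
rel_cols <- Filter(is_err_percent_demo, error_cols)
abs_cols <- Filter(is_err_abs_demo, error_cols)

# The emptiness check is now meaningful: it only sees relative errors
if (length(rel_cols) == 0) {
  message("No relative error columns found; nothing to convert.")
}
```

With the ID and absolute error columns filtered out first, the `length(...) == 0` check can actually trigger.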
    # Check if error is relative (has % units)
    err_unit <- units::deparse_unit(df[[err_col]])

    if (!err_unit %in% c("%", "atP", "wtP")) {
@Abagna123: This is not necessary, see comment above.
    # Set units to match the concentration column
    units(df[[err_col]]) <- units(df[[base_name]])

    # Update ASTR_class to mark as absolute error
@Abagna123 We won't introduce additional classes for this distinction. There are already functions for testing this, see comment above.
    # Find all error columns (both regular and absolute-marked)
    error_cols <- get_error_columns(df, "ASTR_error")
    abs_error_cols <- get_cols_with_ac_class(df, "ASTR_error_absolute")

      next
    }

    # Check if error is absolute (same unit as concentration)
@Abagna123 You pass only absolute columns into the loop. Therefore, is this test really necessary?
@archaeothommy I removed the /100 and *100 factors from rel_to_abs() and abs_to_rel() because the units package in R already handles percentage scaling automatically via the UDUNITS database. When read_ASTR() reads a column with % in the name (e.g., SiO2_errSD%), it stores the numeric value (e.g., 4.3) with a % unit attached. The UDUNITS database defines % as 0.01. As a result, when I call as.numeric() on that column, it returns 0.043 (already divided by 100).
* handling edge cases
* inclusion in parsing workflow
* update implementation documentation and vignette
Updated the functions to
Found and fixed a bug that
Nice - good job, folks!