Skip to content

Mitokic 02202026 unit tests#205

Open
mitokic wants to merge 8 commits intomainfrom
mitokic-02202026-unit-tests
Open

Mitokic 02202026 unit tests#205
mitokic wants to merge 8 commits intomainfrom
mitokic-02202026-unit-tests

Conversation

@mitokic
Copy link
Collaborator

@mitokic mitokic commented Feb 20, 2026

This pull request introduces improvements to hyperparameter grid generation and adds comprehensive unit tests for helper functions across multiple files. The most significant changes are the switch from Latin hypercube to space-filling grids for hyperparameter tuning, and the addition of robust tests for data type alignment, agent helper functions, and summarization helpers. These changes enhance both the reliability and maintainability of the codebase.

Hyperparameter grid generation improvements:

  • Replaced dials::grid_latin_hypercube with dials::grid_space_filling in ensemble_models and grid helper functions to generate more diverse and representative hyperparameter grids. [1] [2]

Testing for agent helper functions:

  • Added extensive unit tests in test-agent_helpers.R for functions such as collapse_or_na, apply_column_types, does_param_set_exist, extract_json_object, and null_converter, covering various edge cases and type coercion scenarios.

Testing for summarization helpers:

  • Introduced tests in test-agent_summarize_helpers.R for summarization-related helper functions, including .chr1, .kv, .unquote, .signif_chr, .extract_predictors, .extract_outcomes, .extract_recipe_steps, .infer_period_from_dates, .find_obj, and .assemble_output, ensuring correct handling of inputs and outputs.

Testing for data preparation and alignment:

  • Added tests in test-align_types_prep_data.R for align_types, get_xregs_future_values_tbl, clean_outliers_missing_values, and make_stationary, verifying correct data type alignment, handling of external regressors, missing values, outliers, and stationarity transformations.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the hyperparameter grid generation strategy (moving to space-filling grids) and adds a broad set of unit tests across core helper modules to improve correctness and regression detection across the forecasting workflow.

Changes:

  • Switches ensemble hyperparameter grid generation from Latin hypercube to space-filling grids.
  • Adds a space-filling grid helper in models.R.
  • Introduces comprehensive testthat coverage for utility, I/O, project/run info, parallel utilities, model workflows, feature selection, hierarchy helpers, and agent helper functions.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
R/ensemble_models.R Uses dials::grid_space_filling() for ensemble hyperparameter grid creation.
R/models.R Replaces the internal Latin-hypercube grid helper with a space-filling grid helper.
tests/testthat/test-utility.R Adds unit tests for get_timestamp().
tests/testthat/test-run_info.R Adds tests for set_run_info() / get_run_info() defaults and validation.
tests/testthat/test-read_write_data.R Adds tests for hashing, read/write helpers, and file listing utilities.
tests/testthat/test-project_info.R Adds tests for set_project_info() defaults, validation, and overwrite behavior.
tests/testthat/test-prep_models_helpers.R Adds tests for prep-model helper functions (spacing/frequency/seasonality/backtest periods).
tests/testthat/test-prep_data_helpers.R Adds tests for prep-data helper functions (frequency, Fourier/lag/rolling/recipes/date regex, transforms).
tests/testthat/test-parallel_util.R Adds tests for core parallel setup/teardown helpers.
tests/testthat/test-multistep_models.R Adds tests for multistep parsnip model specs (print/update/translate).
tests/testthat/test-multistep_helper.R Adds tests for multistep horizon helper functions (xreg checks, lag selection, feature selection).
tests/testthat/test-models.R Adds extensive tests for model listing, recipes/workflows, resampling, tuning helpers, and train-model utilities.
tests/testthat/test-input_checks.R Adds tests for input validation helpers and error messaging.
tests/testthat/test-hierarchical.R Extends hierarchy tests (summaries, HTS objects/nodes, adjust_df behavior).
tests/testthat/test-final_models_helpers.R Adds tests for prediction intervals, weekly-to-daily conversion, and output cleanups.
tests/testthat/test-feature_selection.R Adds tests for correlation and variable-importance feature selection helpers.
tests/testthat/test-align_types_prep_data.R Adds tests for type alignment and several prep-data cleaning/stationarity helpers.
tests/testthat/test-agent_summarize_helpers.R Adds tests for agent model-summarization helper utilities.
tests/testthat/test-agent_helpers.R Adds tests for agent iteration helpers (type coercion, param matching, JSON extraction, NULL handling).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 130 to 147
test_that("extract_json_object parses simple JSON", {
raw <- '{"key": "value", "num": 42}'
result <- extract_json_object(raw)
expect_equal(result$key, "value")
expect_equal(result$num, 42)
})

test_that("extract_json_object strips code fences", {
raw <- '```json\n{"key": "value"}\n```'
result <- extract_json_object(raw)
expect_equal(result$key, "value")
})

test_that("extract_json_object handles surrounding text", {
raw <- 'Here is the JSON: {"answer": "yes"} and some more text.'
result <- extract_json_object(raw)
expect_equal(result$answer, "yes")
})
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These extract_json_object() tests implicitly require jsonlite to be installed (the implementation calls jsonlite::fromJSON()), but jsonlite is only in Suggests. To avoid check/CI failures when Suggests aren’t installed, add skip_if_not_installed("jsonlite") for this block (or move jsonlite to Imports if it’s required at runtime).

Copilot uses AI. Check for mistakes.
Comment on lines 65 to 85
test_that("par_start with local_machine creates cluster", {
run_info <- list(
storage_object = NULL,
path = tempdir(),
data_output = "csv",
object_output = "rds"
)

par_info <- par_start(
run_info = run_info,
parallel_processing = "local_machine",
num_cores = 2,
task_length = 5
)

expect_false(is.null(par_info$cl))
expect_true(is.function(par_info$foreach_operator))

# clean up
par_end(par_info$cl)
})
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

par_start(parallel_processing = "local_machine") will call parallel::makeCluster(min(cores, task_length)) where cores comes from detectCores() - 1. On single-core environments this becomes 0 and makeCluster(0) errors, making this test (and the code path) unreliable. Consider guarding the test with skip_if(parallel::detectCores() <= 1) or updating get_cores()/par_start() to ensure at least 1 worker.

Copilot uses AI. Check for mistakes.
Comment on lines 14 to 16
# should be within 5 seconds of now

expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion compares get_timestamp() to the current system time with a 5-second window. On slow or heavily loaded CI runners, this can be flaky. Consider freezing time for the test (e.g., via withr) or widening the tolerance to reduce intermittent failures.

Suggested change
# should be within 5 seconds of now
expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)
# should be within 60 seconds of now to avoid flaky failures on slow CI
expect_true(abs(difftime(ts, now_utc, units = "secs")) < 60)

Copilot uses AI. Check for mistakes.
Comment on lines 6 to 23
test_that("target_corr_fn returns correlated features", {
set.seed(123)
n <- 50
x1 <- rnorm(n)
data <- tibble::tibble(
Combo = rep("A", n),
Date = seq.Date(as.Date("2020-01-01"), by = "month", length.out = n),
Target = x1 * 2 + rnorm(n, sd = 0.1),
Feature_A = x1,
Feature_B = rnorm(n)
)

result <- target_corr_fn(data, threshold = 0.5)

expect_true("term" %in% colnames(result))
expect_true("Target" %in% colnames(result))
expect_true("Feature_A" %in% result$term)
})
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

target_corr_fn() depends on the corrr package, which is listed under Suggests. If Suggests aren’t installed in the test environment, these tests will fail. Consider adding skip_if_not_installed("corrr") for the target_corr_fn test blocks (or moving corrr to Imports if it’s required at runtime).

Copilot uses AI. Check for mistakes.
Comment on lines 74 to 90
test_that("vip_rf_fn produces variable importance scores", {
set.seed(42)
data <- tibble::tibble(
Date = seq(as.Date("2020-01-01"), by = "day", length.out = 100),
Target = rnorm(100, mean = 50, sd = 10),
x1 = rnorm(100),
x2 = rnorm(100),
x3 = rnorm(100)
)
data$Target <- data$Target + 2 * data$x1

result <- vip_rf_fn(data, seed = 42)
expect_s3_class(result, "tbl_df")
expect_true("Variable" %in% names(result))
expect_true("Importance" %in% names(result))
expect_true(all(result$Importance > 0))
})
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vip_rf_fn() fits a ranger model (set_engine("ranger")) and calls vip::vi(), but ranger is not listed in this package’s Imports/Suggests in DESCRIPTION. This will make the test suite fail in environments where ranger isn’t installed. Either add ranger to Suggests/Imports, or skip these tests when ranger (and vip) aren’t installed (e.g., skip_if_not_installed()).

Copilot uses AI. Check for mistakes.
@mitokic
Copy link
Collaborator Author

mitokic commented Feb 21, 2026

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link
Contributor

Copilot AI commented Feb 21, 2026

@mitokic I've opened a new pull request, #206, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits February 21, 2026 01:11
Co-authored-by: mitokic <16709433+mitokic@users.noreply.github.com>
Fix flaky/fragile tests: add skip guards for optional dependencies and widen timestamp tolerance
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +190 to +216

# -- utility.R tests --

test_that("get_timestamp returns a POSIXct object in UTC", {
ts <- get_timestamp()

expect_s3_class(ts, "POSIXct")
expect_equal(attr(ts, "tzone"), "UTC")
})

test_that("get_timestamp returns a timestamp close to current time", {
ts <- get_timestamp()
now_utc <- as.POSIXct(format(Sys.time(), tz = "UTC"), tz = "UTC")

# should be within 5 seconds of now

expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)
})

test_that("get_timestamp format is YYYYMMDDTHHMMSSZ", {
ts <- get_timestamp()
formatted <- format(ts, "%Y%m%dT%H%M%SZ")

# should be parseable back
parsed <- as.POSIXct(formatted, format = "%Y%m%dT%H%M%SZ", tz = "UTC")
expect_false(is.na(parsed))
})
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The get_timestamp tests on lines 193-216 are duplicates of tests already in test-utility.R. These should be removed from this file.

Additionally, note that there's an inconsistency: test-utility.R uses a 60-second tolerance (line 15) which is more appropriate for CI environments, while this file uses a 5-second tolerance (line 206) which may cause flaky test failures on slower systems.

Suggested change
# -- utility.R tests --
test_that("get_timestamp returns a POSIXct object in UTC", {
ts <- get_timestamp()
expect_s3_class(ts, "POSIXct")
expect_equal(attr(ts, "tzone"), "UTC")
})
test_that("get_timestamp returns a timestamp close to current time", {
ts <- get_timestamp()
now_utc <- as.POSIXct(format(Sys.time(), tz = "UTC"), tz = "UTC")
# should be within 5 seconds of now
expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)
})
test_that("get_timestamp format is YYYYMMDDTHHMMSSZ", {
ts <- get_timestamp()
formatted <- format(ts, "%Y%m%dT%H%M%SZ")
# should be parseable back
parsed <- as.POSIXct(formatted, format = "%Y%m%dT%H%M%SZ", tz = "UTC")
expect_false(is.na(parsed))
})

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants