Mitokic 02202026 unit tests by mitokic · Pull Request #205 · microsoft/finnts

mitokic · 2026-02-20T21:08:24Z

This pull request introduces improvements to hyperparameter grid generation and adds comprehensive unit tests for helper functions across multiple files. The most significant changes are the switch from Latin hypercube to space-filling grids for hyperparameter tuning, and the addition of robust tests for data type alignment, agent helper functions, and summarization helpers. These changes enhance both the reliability and maintainability of the codebase.

Hyperparameter grid generation improvements:

Replaced dials::grid_latin_hypercube with dials::grid_space_filling in ensemble_models and grid helper functions to generate more diverse and representative hyperparameter grids. [1] [2]

Testing for agent helper functions:

Added extensive unit tests in test-agent_helpers.R for functions such as collapse_or_na, apply_column_types, does_param_set_exist, extract_json_object, and null_converter, covering various edge cases and type coercion scenarios.

Testing for summarization helpers:

Introduced tests in test-agent_summarize_helpers.R for summarization-related helper functions, including .chr1, .kv, .unquote, .signif_chr, .extract_predictors, .extract_outcomes, .extract_recipe_steps, .infer_period_from_dates, .find_obj, and .assemble_output, ensuring correct handling of inputs and outputs.

Testing for data preparation and alignment:

Added tests in test-align_types_prep_data.R for align_types, get_xregs_future_values_tbl, clean_outliers_missing_values, and make_stationary, verifying correct data type alignment, handling of external regressors, missing values, outliers, and stationarity transformations.

Copilot

Pull request overview

This PR updates the hyperparameter grid generation strategy (moving to space-filling grids) and adds a broad set of unit tests across core helper modules to improve correctness and regression detection across the forecasting workflow.

Changes:

Switches ensemble hyperparameter grid generation from Latin hypercube to space-filling grids.
Adds a space-filling grid helper in models.R.
Introduces comprehensive testthat coverage for utility, I/O, project/run info, parallel utilities, model workflows, feature selection, hierarchy helpers, and agent helper functions.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
R/ensemble_models.R	Uses `dials::grid_space_filling()` for ensemble hyperparameter grid creation.
R/models.R	Replaces the internal Latin-hypercube grid helper with a space-filling grid helper.
tests/testthat/test-utility.R	Adds unit tests for `get_timestamp()`.
tests/testthat/test-run_info.R	Adds tests for `set_run_info()` / `get_run_info()` defaults and validation.
tests/testthat/test-read_write_data.R	Adds tests for hashing, read/write helpers, and file listing utilities.
tests/testthat/test-project_info.R	Adds tests for `set_project_info()` defaults, validation, and overwrite behavior.
tests/testthat/test-prep_models_helpers.R	Adds tests for prep-model helper functions (spacing/frequency/seasonality/backtest periods).
tests/testthat/test-prep_data_helpers.R	Adds tests for prep-data helper functions (frequency, Fourier/lag/rolling/recipes/date regex, transforms).
tests/testthat/test-parallel_util.R	Adds tests for core parallel setup/teardown helpers.
tests/testthat/test-multistep_models.R	Adds tests for multistep parsnip model specs (print/update/translate).
tests/testthat/test-multistep_helper.R	Adds tests for multistep horizon helper functions (xreg checks, lag selection, feature selection).
tests/testthat/test-models.R	Adds extensive tests for model listing, recipes/workflows, resampling, tuning helpers, and train-model utilities.
tests/testthat/test-input_checks.R	Adds tests for input validation helpers and error messaging.
tests/testthat/test-hierarchical.R	Extends hierarchy tests (summaries, HTS objects/nodes, adjust_df behavior).
tests/testthat/test-final_models_helpers.R	Adds tests for prediction intervals, weekly-to-daily conversion, and output cleanups.
tests/testthat/test-feature_selection.R	Adds tests for correlation and variable-importance feature selection helpers.
tests/testthat/test-align_types_prep_data.R	Adds tests for type alignment and several prep-data cleaning/stationarity helpers.
tests/testthat/test-agent_summarize_helpers.R	Adds tests for agent model-summarization helper utilities.
tests/testthat/test-agent_helpers.R	Adds tests for agent iteration helpers (type coercion, param matching, JSON extraction, NULL handling).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-20T21:17:59Z

tests/testthat/test-agent_helpers.R

+test_that("extract_json_object parses simple JSON", {
+  raw <- '{"key": "value", "num": 42}'
+  result <- extract_json_object(raw)
+  expect_equal(result$key, "value")
+  expect_equal(result$num, 42)
+})
+
+test_that("extract_json_object strips code fences", {
+  raw <- '```json\n{"key": "value"}\n```'
+  result <- extract_json_object(raw)
+  expect_equal(result$key, "value")
+})
+
+test_that("extract_json_object handles surrounding text", {
+  raw <- 'Here is the JSON: {"answer": "yes"} and some more text.'
+  result <- extract_json_object(raw)
+  expect_equal(result$answer, "yes")
+})


These extract_json_object() tests implicitly require jsonlite to be installed (the implementation calls jsonlite::fromJSON()), but jsonlite is only in Suggests. To avoid check/CI failures when Suggests aren’t installed, add skip_if_not_installed("jsonlite") for this block (or move jsonlite to Imports if it’s required at runtime).

Copilot · 2026-02-20T21:18:00Z

tests/testthat/test-parallel_util.R

+test_that("par_start with local_machine creates cluster", {
+  run_info <- list(
+    storage_object = NULL,
+    path = tempdir(),
+    data_output = "csv",
+    object_output = "rds"
+  )
+
+  par_info <- par_start(
+    run_info = run_info,
+    parallel_processing = "local_machine",
+    num_cores = 2,
+    task_length = 5
+  )
+
+  expect_false(is.null(par_info$cl))
+  expect_true(is.function(par_info$foreach_operator))
+
+  # clean up
+  par_end(par_info$cl)
+})


par_start(parallel_processing = "local_machine") will call parallel::makeCluster(min(cores, task_length)) where cores comes from detectCores() - 1. On single-core environments this becomes 0 and makeCluster(0) errors, making this test (and the code path) unreliable. Consider guarding the test with skip_if(parallel::detectCores() <= 1) or updating get_cores()/par_start() to ensure at least 1 worker.

Copilot · 2026-02-20T21:18:00Z

tests/testthat/test-utility.R

+  # should be within 5 seconds of now
+
+  expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)


This assertion compares get_timestamp() to the current system time with a 5-second window. On slow or heavily loaded CI runners, this can be flaky. Consider freezing time for the test (e.g., via withr) or widening the tolerance to reduce intermittent failures.

Suggested change

# should be within 5 seconds of now

expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)

# should be within 60 seconds of now to avoid flaky failures on slow CI

expect_true(abs(difftime(ts, now_utc, units = "secs")) < 60)

Copilot · 2026-02-20T21:18:00Z

tests/testthat/test-feature_selection.R

+test_that("target_corr_fn returns correlated features", {
+  set.seed(123)
+  n <- 50
+  x1 <- rnorm(n)
+  data <- tibble::tibble(
+    Combo = rep("A", n),
+    Date = seq.Date(as.Date("2020-01-01"), by = "month", length.out = n),
+    Target = x1 * 2 + rnorm(n, sd = 0.1),
+    Feature_A = x1,
+    Feature_B = rnorm(n)
+  )
+
+  result <- target_corr_fn(data, threshold = 0.5)
+
+  expect_true("term" %in% colnames(result))
+  expect_true("Target" %in% colnames(result))
+  expect_true("Feature_A" %in% result$term)
+})


target_corr_fn() depends on the corrr package, which is listed under Suggests. If Suggests aren’t installed in the test environment, these tests will fail. Consider adding skip_if_not_installed("corrr") for the target_corr_fn test blocks (or moving corrr to Imports if it’s required at runtime).

Copilot · 2026-02-20T21:18:00Z

tests/testthat/test-feature_selection.R

+test_that("vip_rf_fn produces variable importance scores", {
+  set.seed(42)
+  data <- tibble::tibble(
+    Date = seq(as.Date("2020-01-01"), by = "day", length.out = 100),
+    Target = rnorm(100, mean = 50, sd = 10),
+    x1 = rnorm(100),
+    x2 = rnorm(100),
+    x3 = rnorm(100)
+  )
+  data$Target <- data$Target + 2 * data$x1
+
+  result <- vip_rf_fn(data, seed = 42)
+  expect_s3_class(result, "tbl_df")
+  expect_true("Variable" %in% names(result))
+  expect_true("Importance" %in% names(result))
+  expect_true(all(result$Importance > 0))
+})


vip_rf_fn() fits a ranger model (set_engine("ranger")) and calls vip::vi(), but ranger is not listed in this package’s Imports/Suggests in DESCRIPTION. This will make the test suite fail in environments where ranger isn’t installed. Either add ranger to Suggests/Imports, or skip these tests when ranger (and vip) aren’t installed (e.g., skip_if_not_installed()).

mitokic · 2026-02-21T01:08:25Z

@copilot open a new pull request to apply changes based on the comments in this thread

Copilot · 2026-02-21T01:08:33Z

@mitokic I've opened a new pull request, #206, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: mitokic <16709433+mitokic@users.noreply.github.com>

Fix flaky/fragile tests: add skip guards for optional dependencies and widen timestamp tolerance

Copilot

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-21T01:17:19Z

tests/testthat/test-run_info.R

+
+# -- utility.R tests --
+
+test_that("get_timestamp returns a POSIXct object in UTC", {
+  ts <- get_timestamp()
+
+  expect_s3_class(ts, "POSIXct")
+  expect_equal(attr(ts, "tzone"), "UTC")
+})
+
+test_that("get_timestamp returns a timestamp close to current time", {
+  ts <- get_timestamp()
+  now_utc <- as.POSIXct(format(Sys.time(), tz = "UTC"), tz = "UTC")
+
+  # should be within 5 seconds of now
+
+  expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)
+})
+
+test_that("get_timestamp format is YYYYMMDDTHHMMSSZ", {
+  ts <- get_timestamp()
+  formatted <- format(ts, "%Y%m%dT%H%M%SZ")
+
+  # should be parseable back
+  parsed <- as.POSIXct(formatted, format = "%Y%m%dT%H%M%SZ", tz = "UTC")
+  expect_false(is.na(parsed))
+})


The get_timestamp tests on lines 193-216 are duplicates of tests already in test-utility.R. These should be removed from this file.

Additionally, note that there's an inconsistency: test-utility.R uses a 60-second tolerance (line 15) which is more appropriate for CI environments, while this file uses a 5-second tolerance (line 206) which may cause flaky test failures on slower systems.

Suggested change

# -- utility.R tests --

test_that("get_timestamp returns a POSIXct object in UTC", {

ts <- get_timestamp()

expect_s3_class(ts, "POSIXct")

expect_equal(attr(ts, "tzone"), "UTC")

})

test_that("get_timestamp returns a timestamp close to current time", {

ts <- get_timestamp()

now_utc <- as.POSIXct(format(Sys.time(), tz = "UTC"), tz = "UTC")

# should be within 5 seconds of now

expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)

})

test_that("get_timestamp format is YYYYMMDDTHHMMSSZ", {

ts <- get_timestamp()

formatted <- format(ts, "%Y%m%dT%H%M%SZ")

# should be parseable back

parsed <- as.POSIXct(formatted, format = "%Y%m%dT%H%M%SZ", tz = "UTC")

expect_false(is.na(parsed))

})

mitokic added 2 commits February 20, 2026 12:20

add new unit tests

4e70231

replace dials::grid_latin_hypercube() -> dials::grid_space_filling()

b4c68e5

mitokic requested a review from Copilot February 20, 2026 21:08

Copilot started reviewing on behalf of mitokic February 20, 2026 21:08 View session

bump version

e43c3b7

Copilot AI reviewed Feb 20, 2026

View reviewed changes

mitokic added 2 commits February 20, 2026 15:57

add unit tests

27a302a

update docs and fix issue when trying to pull hyperparams

4bf7179

Initial plan

93a3d71

Copilot AI mentioned this pull request Feb 21, 2026

Fix flaky/fragile tests: add skip guards for optional dependencies and widen timestamp tolerance #206

Merged

Copilot AI and others added 2 commits February 21, 2026 01:11

Apply reviewer feedback: add skip guards and fix flaky timestamp test

f13abca

Co-authored-by: mitokic <16709433+mitokic@users.noreply.github.com>

Merge pull request #206 from microsoft/copilot/sub-pr-205

47c722b

Fix flaky/fragile tests: add skip guards for optional dependencies and widen timestamp tolerance

mitokic requested a review from Copilot February 21, 2026 01:12

Copilot started reviewing on behalf of mitokic February 21, 2026 01:13 View session

Copilot AI reviewed Feb 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mitokic 02202026 unit tests#205

Mitokic 02202026 unit tests#205
mitokic wants to merge 8 commits intomainfrom
mitokic-02202026-unit-tests

mitokic commented Feb 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 20, 2026

Uh oh!

Copilot AI Feb 20, 2026

Uh oh!

Copilot AI Feb 20, 2026

Uh oh!

Copilot AI Feb 20, 2026

Uh oh!

Copilot AI Feb 20, 2026

Uh oh!

mitokic commented Feb 21, 2026

Uh oh!

Copilot AI commented Feb 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		# should be within 5 seconds of now

		expect_true(abs(difftime(ts, now_utc, units = "secs")) < 5)

Conversation

mitokic commented Feb 20, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

mitokic commented Feb 21, 2026

Uh oh!

Copilot AI commented Feb 21, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants