Closed
64 commits
95628c4
packyears bug fix
StaceyFisher Apr 1, 2025
309db6d
Added test for "former daily current occasional smoker" for pack years
yulric Apr 19, 2025
0d70d01
Fix logo font rendering issues and regenerate favicons
DougManuel Jun 4, 2025
2d38302
Merge pull request #136 from Big-Life-Lab/pkgdown-fix
DougManuel Jun 4, 2025
9e27eea
revert to updated logo
DougManuel Jun 4, 2025
13eb0b5
update logo -- again
DougManuel Jun 4, 2025
dbb21f1
build site
DougManuel Jun 4, 2025
31b71e1
Merge pull request #134 from Big-Life-Lab/packyears-bug
DougManuel Jun 4, 2025
b71b313
Add github actions R-CMD-check.yaml
DougManuel Jun 5, 2025
bb903c6
Merge remote-tracking branch 'origin/main'
DougManuel Jun 5, 2025
f8b5f2d
Fix: Add Author and Maintainer to DESCRIPTION
DougManuel Jul 1, 2025
6a7bc8e
fix: Comment out failing tests for legacy functions
DougManuel Jul 2, 2025
3fe65fd
Fix legacy function tests and syntax errors
DougManuel Jul 2, 2025
8f639bd
Fix legacy function tests and syntax errors
DougManuel Jul 2, 2025
559af3a
Merge remote-tracking branch 'origin/main'
DougManuel Jul 2, 2025
9ac4080
feat(infrastructure): Add 3-step derived variable architecture
DougManuel Jan 20, 2026
58033fb
feat(smoking): Add unified smoking derived variable functions
DougManuel Jan 20, 2026
c067c33
feat(worksheets): Add smoking harmonization worksheets
DougManuel Jan 20, 2026
7912bf4
test(smoking): Add unit tests for smoking derived variables
DougManuel Jan 20, 2026
99f4333
docs(smoking): Add CEP-002 smoking harmonization specification
DougManuel Jan 21, 2026
d9adb0d
feat(worksheets): Merge smoking variables into main worksheets
DougManuel Jan 21, 2026
164577f
fix(worksheets): Add explicit 2015+ mappings for SMK_06C and SMK_09C
DougManuel Jan 26, 2026
62fa609
Fix CSV formatting for GitHub Actions
DougManuel Jan 27, 2026
1c96a14
fix: Restore table-generators.R lost during stash merge
DougManuel Feb 11, 2026
2f82fe8
refactor(smoking): Harmonisation improvements, semantic parameters, w…
DougManuel Feb 24, 2026
ab9f333
refactor: Consolidate utility infrastructure into clean-variables.R
DougManuel Feb 24, 2026
48a5af7
fix: Remove library() calls and here:: dependency at package load time
DougManuel Mar 11, 2026
2000cba
fix(worksheets): Align databaseStart coverage for smoking variables
DougManuel Mar 11, 2026
490bf80
feat(smoking): Add SMK_01B (ever smoked a whole cigarette) to v3
DougManuel Mar 11, 2026
75b59a0
fix(smoking): Correct database coverage for SMK_204, SMK_208, and pac…
DougManuel Mar 12, 2026
109f72d
feat(validation): Add check_recode_blocks() and fix SMK_09A_cont
DougManuel Mar 13, 2026
437fe36
feat(validation): Add check_invalid_databases() and update review skill
DougManuel Mar 13, 2026
c4718a9
refactor(smoking): rename SMK_09A family and add time_quit_smoking_co…
DougManuel Mar 27, 2026
0590468
refactor(skill): Split cchsflow-review SKILL.md into orchestrator + docs
DougManuel Mar 29, 2026
7951fca
Revert "refactor(skill): Split cchsflow-review SKILL.md into orchestr…
DougManuel Mar 29, 2026
5637d82
Rewrite cessation functions and add v3 smoking aliases
DougManuel Mar 30, 2026
ac4b3ab
Fix smoking worksheet coverage and trailing spaces
DougManuel Mar 30, 2026
e7717b4
Update cessation tests for rewritten functions
DougManuel Mar 30, 2026
38e7d54
fix(smoking): Audit fixes, DHH_AGE Master support, and clean-variable…
DougManuel Mar 30, 2026
2be4c58
Delete calculate_SMK_06A_cont() — worksheet-first principle
DougManuel Apr 3, 2026
4eb159a
fix: handle recStart "copy" as valid range in is_value_in_range()
DougManuel Apr 6, 2026
4f8c531
refactor(smoking): Rename _A/_B era-split variables to year-based names
DougManuel Apr 7, 2026
8d77adf
docs(cep-002): Update stale references and address review comments
DougManuel Apr 7, 2026
889fa6b
fix(smoking): Remove invalid cchs2022_p and cchs2023_p from SMK_06A_2…
DougManuel Apr 8, 2026
b81f872
Corrected databaseStart and variableStart for raw SMK variables in va…
rafdoodle Apr 14, 2026
c1746f5
Corrected databaseStart for derived SMK variables in variables.csv (s…
rafdoodle Apr 14, 2026
138b241
Bringing in updated Claude skills from skills branch
rafdoodle Apr 16, 2026
2a17c52
Used updated Claude skills to correct databaseStart and variableStart…
rafdoodle Apr 17, 2026
05e3446
fix(smoking): correct era blocks and _m cycle mappings for 12 SMK var…
rafdoodle Apr 17, 2026
42f8540
Corrected databaseStart and variableStart for SMKD and SMKG variables…
rafdoodle Apr 17, 2026
9ab2b13
Fixed variable name labels and dummyVariable column for smoking varia…
rafdoodle Apr 18, 2026
b4cc855
Fixed category labels and units and for select smoking variable rows …
rafdoodle Apr 18, 2026
d184a9c
Added tests for calculate_SMKG functions and ensured all smoking test…
rafdoodle Apr 18, 2026
9b6b88a
Refactored variableStart for SMK_040, SMK_203, SMK_207, SMKG040_cont,…
rafdoodle Apr 20, 2026
102d084
Brought in DHHGAGE variant changes from v3 branch
rafdoodle Apr 20, 2026
18fb319
Updated worksheet conventions skill; used it to make smoking NA row l…
rafdoodle Apr 20, 2026
faff881
Capitalized "Smoking" subject and corrected notes on "PUMF only" for …
rafdoodle Apr 20, 2026
8b82f30
docs(cep-002): Address stale review comments on derived-functions.qmd
DougManuel Apr 22, 2026
f5ddedd
fix: R CMD check cleanup for merge readiness
DougManuel Apr 22, 2026
8cd20be
Merge remote-tracking branch 'origin/main' into v3-smoking
DougManuel Apr 22, 2026
7944b52
Eliminated SMKDVSTP from time_quit_smoking variants
rafdoodle Apr 24, 2026
e92e18a
Added Master coverage for SMK_10A_cont for time_quit_smoking_complete
rafdoodle Apr 24, 2026
c1739ba
Ensured full coverage for all derived smoking variables, as well as p…
rafdoodle Apr 25, 2026
f934ebf
Improved SMKDSTY_2009plus variableStart, rebuilt .RData files from up…
rafdoodle Apr 25, 2026
3 changes: 3 additions & 0 deletions .Rbuildignore
@@ -1,3 +1,5 @@
^renv$
^renv\.lock$
^cchsflow\.Rproj$
^\.Rproj\.user$
^_pkgdown\.yml$
@@ -10,3 +12,4 @@
^CRAN-RELEASE$
^CODE_OF_CONDUCT.md
^.github
^\.github$
182 changes: 182 additions & 0 deletions .claude/skills/cchsflow-derive/SKILL.md
@@ -0,0 +1,182 @@
---
name: cchsflow-derive
description: Write and review derived variable functions for cchsflow. Use when implementing new DV functions (calculate_*, assess_*, categorize_*), upgrading existing functions to v3 architecture, reviewing DV code for correctness, or preparing DV changes for commit. Covers the 3-step architecture, source-agnostic design, quality tiers, patterns, testing, and package-level validation.
allowed-tools: Bash(Rscript:*), Bash(R:*), Bash(git:*), Read, Glob, Grep
---

# cchsflow derived variable development

Write, review, and validate derived variable functions using the v3 3-step architecture.

## Usage

```
/cchsflow-derive # general guidance (reads foundations)
/cchsflow-derive calculate_bmi # review/write a specific function
/cchsflow-derive --check # run done criteria checks
```

## Before you start

### Required reading

Before writing or reviewing a DV function, read these docs (in this skill's `docs/` folder):

1. **[foundations.md](docs/foundations.md)** — 3-step architecture, missing data handling, quality tiers, coding standards, anti-patterns. Read this first.
2. **The pattern doc** that matches your function (see "Choose a pattern" below)

### Choose a pattern

Identify which pattern your function follows, then read the corresponding doc:

| Pattern | When to use | Doc |
|---------|-------------|-----|
| **Formula calculation** | Compute a value from inputs (BMI, pack-years) | [formula-calculation.md](docs/patterns/formula-calculation.md) |
| **Category grouping** | Map values to categories (BMI categories, smoking status) | [category-grouping.md](docs/patterns/category-grouping.md) |
| **Pass-through** | Clean and forward a single variable | [pass-through.md](docs/patterns/pass-through.md) |
| **Cat-to-continuous** | Midpoint imputation from categorical ranges | [cat-to-continuous.md](docs/patterns/cat-to-continuous.md) |
| **Multi-source routing** | Choose best source with priority chain | [multi-source-routing.md](docs/patterns/multi-source-routing.md) |
| **Pathway branching** | Complex decision tree with gate variables | [pathway-branching.md](docs/patterns/pathway-branching.md) |
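
As a rough illustration of the cat-to-continuous row, the sketch below maps category codes to range midpoints with a named lookup vector. The codes and ranges are hypothetical, chosen only for easy arithmetic; real CCHS categories differ by variable, and the pattern doc may treat open-ended categories differently.

```r
# Hypothetical category codes and ranges (not taken from any CCHS variable):
#   1 = 1 to 5, 2 = 6 to 10, 3 = 11 to 15
midpoints <- c(`1` = (1 + 5) / 2, `2` = (6 + 10) / 2, `3` = (11 + 15) / 2)

# Categorical input, including one missing value
category <- c(1, 3, 2, NA)

# Look up each category's midpoint; a missing category stays missing
continuous <- unname(midpoints[as.character(category)])
continuous
# 3 13 8 NA
```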

### Reference material

- **[7-levels.md](docs/7-levels.md)** — function complexity taxonomy (L1-L7)
- **[function-inventory.md](docs/function-inventory.md)** — all existing DV functions with pattern, level, and tier
- **[testing.md](docs/testing.md)** — unit test and golden fixture patterns, common failure diagnostics

## Development workflow

### 1. Write tests

Follow the test tier matching your function's quality tier (see [testing.md](docs/testing.md)):

- **Bronze**: Happy path + one missing input
- **Silver**: + out-of-range, vectors, dataframe via `mutate()`
- **Gold**: + every `case_when()` branch, tagged NA type verification, `output_format` parameter
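
As a rough illustration of the Bronze tier plus the vector case from Silver, the sketch below tests a hypothetical stand-in function, `bmi_sketch()`. It is not the package's `bmi_fun()`; the function body and test values are assumptions chosen so the arithmetic is easy to check.

```r
library(testthat)

# Hypothetical stand-in for a DV function; not part of cchsflow.
bmi_sketch <- function(height_m, weight_kg) {
  weight_kg / height_m^2
}

# Bronze: happy path plus one missing input
test_that("bmi_sketch handles a valid scalar and a missing input", {
  expect_equal(bmi_sketch(2, 80), 20)
  expect_true(is.na(bmi_sketch(NA_real_, 80)))
})

# Silver adds vector input (out-of-range and mutate() cases are omitted here)
test_that("bmi_sketch is vectorised", {
  expect_equal(bmi_sketch(c(2, 1), c(80, 20)), c(20, 20))
})
```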

### 2. Write the function

Follow the pattern template from the appropriate pattern doc. Key principles:

- **Source-agnostic**: Semantic parameter names (`height_m`, `weight_kg`), not CCHS variable names. ONE function for both PUMF and Master; the worksheet routes different source variables to the same parameters.
- **3-step**: `clean_variables(output_format = "tagged_na")` → `case_when()` logic → `clean_variables(output_format = output_format)`
- **Step 1 always uses `"tagged_na"`**: Never pass the user's `output_format` to Step 1. With `"original"`, inputs keep their numeric missing codes, which `any_missing()` in Step 2 won't detect.
- **Namespace-qualify**: `dplyr::case_when()`, `haven::tagged_na()` — functions must work standalone.
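
The Step 1 rule can be illustrated with a minimal sketch. It assumes a hypothetical numeric missing code of 996 and uses base `is.na()` as a stand-in for `any_missing()`, whose exact behaviour is not shown here; only `haven::tagged_na()` and `haven::is_tagged_na()` are real package calls.

```r
library(haven)

# Hypothetical raw input: 996 stands in for a numeric "not applicable" code.
raw_height <- c(1.75, 996, 1.60)

# With numeric codes left in place, an is.na()-style check finds nothing missing.
any(is.na(raw_height))
# FALSE

# The same data after a Step 1-style conversion to tagged NA.
clean_height <- c(1.75, tagged_na("a"), 1.60)

# The missing value is now visible to the check...
any(is.na(clean_height))
# TRUE

# ...and the reason for missingness is still recoverable from the tag.
is_tagged_na(clean_height, "a")
# FALSE TRUE FALSE
```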

### 3. Write roxygen documentation

Silver and gold tier require the full template (see foundations.md § Documentation):

```r
#' @title [verb phrase]
#' @description [1-2 sentences]
#' @details [implementation notes, PUMF vs Master table if source-agnostic]
#' @param var1 [description]
#' @param output_format Output missing data format: "tagged_na" (default) or "original".
#' @param ... Arguments passed from deprecated aliases.
#' @return [type and range]
#' @examples
#' # Scalar
#' # Vector
#' # Dataframe
#' # Standalone with rec_with_table (in \dontrun{})
#' @references
#' @seealso
#' @export
```

**`@param ...` rule**: If deprecated aliases use `@rdname` pointing to your function and their signature is `function(...)`, you MUST add `@param ... Arguments passed from deprecated aliases.` to your roxygen. Otherwise R CMD check will report "Undocumented arguments in Rd file: '...'".

### 4. Write deprecated aliases (if renaming)

If the function replaces an older function name, add aliases in `R/deprecated-aliases.R`:

```r
#' @rdname new_function_name
#' @export
old_function_name <- function(...) {
  .Deprecated("new_function_name",
    msg = "old_function_name() is deprecated. Use new_function_name() instead.")
  new_function_name(...)
}
```

### 5. Update worksheets (if needed)

If the function is referenced from `variable_details.csv` via `Func::`:

- Update `recEnd` to point to the new function name
- Update `dummyVariable` if function name changed
- Run `Rscript exec/fix-worksheets.R` after any CSV modification
- Rebuild RData if worksheet structure changed (see cchsflow-worksheets skill)

## Done criteria

**Before committing DV function changes, ALL of these must pass.** Run them in order — earlier checks are faster and catch different issues.

### Check 1: Unit tests pass

```bash
# From the project root (or worktree root)
Rscript -e 'devtools::load_all(); testthat::test_file("tests/testthat/test-<domain>.R")'
```

Verify: 0 failures for in-scope tests. Pre-existing failures in other test files are acceptable (note them but don't block on them).

### Check 2: R CMD check passes

```bash
# Quick check — catches NAMESPACE, roxygen, imports (skips tests/examples)
Rscript -e 'devtools::check(document = FALSE, args = "--no-tests --no-examples --no-vignettes --no-manual")'

# Full check — recommended before PR
Rscript -e 'devtools::check()'
```

Verify: 0 **new** errors/warnings/notes compared to the branch baseline. Common issues caught only here:

- Undocumented `...` from `@rdname` aliases
- Missing NAMESPACE exports
- Broken `@examples`
- Undeclared imports in DESCRIPTION

### Check 3: Worksheet validation (if worksheets changed)

Invoke the `cchsflow-validation` skill, or run manually:

```bash
Rscript exec/fix-worksheets.R
```

### Check 4: Roxygen checklist

Verify manually against the template in Step 3 above:

- [ ] `@title`, `@description`, `@details` present
- [ ] All `@param` documented (including `...` if aliases exist)
- [ ] `@examples` includes scalar, vector, dataframe, and `rec_with_table()`
- [ ] `@return` describes type and range
- [ ] `@export` present
- [ ] `@seealso` links related functions

### Check 5: Test coverage checklist

- [ ] Every `case_when()` branch has a test
- [ ] Scalar, vector, and dataframe inputs tested
- [ ] Missing inputs tested (NA, tagged_na("a"), tagged_na("b"))
- [ ] Boundary values tested (for categorization functions)
- [ ] Deprecated aliases tested (expect deprecation warning + correct delegation)
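
The deprecated-alias item can be checked with base R alone. The sketch below uses a hypothetical function pair that mirrors the alias template in Step 4; the function body (`x * 2`) is an assumption for illustration only.

```r
# Hypothetical function pair, following the alias template in Step 4.
new_function_name <- function(x) x * 2

old_function_name <- function(...) {
  .Deprecated("new_function_name",
    msg = "old_function_name() is deprecated. Use new_function_name() instead.")
  new_function_name(...)
}

# Capture the warning raised by the alias instead of letting it print.
w <- tryCatch(old_function_name(21), warning = function(w) w)
inherits(w, "warning")
grepl("deprecated", conditionMessage(w))

# Confirm the alias delegates to the new function.
identical(suppressWarnings(old_function_name(21)), new_function_name(21))
```

In a testthat file, the same two checks would usually be written with `expect_warning()` and `expect_equal()`.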

## Cross-references

### Related cchsflow skills

- **cchsflow-review** — PR review of worksheet changes (L0-L6 process). Lives on `skills/review-validation` branch.
- **cchsflow-validation** — programmatic worksheet validation. Lives on `skills/review-validation` branch.
- **cchsflow-worksheets** — worksheet authoring guidance. Lives on `skills/review-validation` branch.

### External references

- R CMD check guidance: `~/github/ai-infrastructure/context/domains/r_packages.md` § "Local verification before committing"
- V3 coding standards: project memory `project_derive_function_standards.md`
- Reference implementations: `bmi_fun()` in `R/bmi.R` (formula), `pack_years_fun()` in `R/smoking.R` (complex)