Skip to content

sleep#161

Closed
caitlink12 wants to merge 16 commits into
feature/v3.0.0-validation-infrastructurefrom
sleep
Closed

sleep#161
caitlink12 wants to merge 16 commits into
feature/v3.0.0-validation-infrastructurefrom
sleep

Conversation

@caitlink12
Copy link
Copy Markdown

sleep variables SLP_02, SLP_02_A, SLP_03, SLP_03_A, SLP_04, SLP_04_A, SLPG01_A, SLPG01_B, SLPG01_cont were harmonized into the variable and variable_details sheets for master survey cycles 2001, 2011-2012, 2013-2014, and 2015-2016.
master survey cycles 2011-2012, 2013-2014, and 2015-2016 were added to existing variables SLP_02, SLP_3, SLP_04.
master survey cycle 2001 was added to existing variables SLP_02_A, SLP_03_A, SLP_04_A.
master survey cycles 2001 and 2015-2016 were added to existing variable SLPG01_A
SLPG01_B was created for master survey cycles 2011-2012 and 2013-2014
master survey cycles 2001, 2011-2012, 2013-2014, and 2015-2016 were added to existing variable SLPG01_cont

added master survey cycle for SLP_02_A
added master survey cycles for SLP_04
added master survey cycles for SLP_04_A
updated master survey cycle suffix from _i to _m
harmonized SLPG01 across 2001, 2011-2012, 2013-2014,2015-2016 master survey cycles and updated master survey suffix from _i to _m
SLPG01_A and SLPG01_C had same variable categories so these surveys were merged into SLPG01_A. Master survey cycles added to SLPG01_cont.
added master survey cycles from 2001 and 2015-2016 into SLPG01_cont
SLPG01_A and SLPG01_C had same variable categories so these surveys were merged into SLPG01_A.
added master survey cycles from 2001 and 2015-2016 into SLPG01_cont
added SLPG01_cont values for 2011-2012 and 2013-2014 master survey cycles.
Copy link
Copy Markdown
Contributor

@DougManuel DougManuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR correctly adds Master file cycles. Passes all tests, including integration testing.

Sleeping variables are optional content, which means the availability varies from cycle to cycle and from province to province.

This PR changes:

  • Adds Master file cycles (2001, 2011-2012, 2013-2014, 2015-2016) to sleep variables
  • Creates SLPG01_B for Master cycles 2011-2014
  • Correctly maps source variables across eras

Key findings

Optional content warning: Sleep module was Optional Content from 2007-2018. Provincial availability varies by cycle - not all provinces have sleep data in all cycles. Added description notes to all 11 sleep variables to alert users.

Scale incompatibility:

  • 2001 uses 3-point scale (Most/Sometimes/Never) with inverted direction (lower = worse)
  • 2007+ uses 5-point scale (None to All) where higher = worse
  • Correctly handled via _A suffix variables for 2001

Cycles without sleep data (Canada-wide PUMF):

  • 2003, 2005, 2009-2010 - correctly excluded from worksheets

Verification

  • ✓ Variable naming evolution verified across all eras
  • ✓ Missing cycles correctly excluded
  • ✓ Scale incompatibility handled with separate _A variables
  • ✓ Added optional content notes to variables.csv

Documentation

Full review documentation: https://dmanuel.quarto.pub/cep-005-sleep-variables

Notes

Added notes about data availability

Add documentation for PR #161 sleep variable review:
- CEP-005 review documents with PUMF availability analysis
- Integration tests for rec_with_table() validation
- Scale incompatibility documentation (2001 vs 2007+)

Add description notes to 11 sleep variables in variables.csv:
- Sleep module was Optional Content from 2007-2018
- Provincial availability varies by cycle
- 2001 uses incompatible 3-point scale with inverted direction

Worksheets verified correct - no fixes required.
@DougManuel DougManuel deleted the branch feature/v3.0.0-validation-infrastructure January 27, 2026 16:02
@DougManuel DougManuel closed this Jan 27, 2026
Removed exports for functions that don't exist:
- assess_binge_drinking
- assess_drinking_risk_long
- assess_drinking_risk_short
- categorize_bmi

These exports were causing R CMD check to fail with
"undefined exports" errors.
@rafdoodle rafdoodle reopened this Apr 1, 2026
@DougManuel
Copy link
Copy Markdown
Contributor

Code review

Reviewed 10 sleep variables (SLP_02, SLP_02_A, SLP_03, SLP_03_A, SLP_04, SLP_04_A, SLPG01, SLPG01_A, SLPG01_B, SLPG01_cont) for PUMF and Master across 2001 through 2017-2018.

Note: This PR was previously approved (2026-01-27) but 2 commits were pushed after approval, triggering a re-review.

L6 integration test: cross-cycle prevalence

rec_with_table() ran successfully for all PUMF cycles. Sleep is optional content so valid percentages vary by cycle (provincial availability).

Variable cchs2001_p cchs2007_2008_p cchs2011_2012_p cchs2013_2014_p cchs2015_2016_p cchs2017_2018_p
SLP_02 - 3.0% 33.5% 3.0% 40.5% 33.0%
SLP_02_A 98.5% - - - - -
SLP_03 - 3.0% 33.5% 3.0% 40.5% 33.0%
SLP_03_A 98.5% - - - - -
SLP_04 - 3.0% 33.5% 3.0% 40.5% 33.0%
SLP_04_A 98.5% - - - - -
SLPG01 - - 33.5% 3.0% 40.5% 33.0%
SLPG01_A 98.5% 3.0% - - - -
SLPG01_cont 98.5% 3.0% 31.5% 7.0% 39.5% 46.0%

Master (_m) mappings validated by worksheet checks only -- no runtime data available for L6 testing. SLPG01_B is Master-only (cchs2011_2012_m, cchs2013_2014_m).

Issues found

3 PR-introduced issues. 6 pre-existing issues noted but not blocking.

PR-introduced:

  1. SLP_04 else row missing Master databases (L3, confidence 90)
    The else catch-all row for SLP_04 has databaseStart = PUMF + cchs2012_s only, but all other rows (recStart 1-5, 6, [7,9]) include cchs2011_2012_m, cchs2013_2014_m, cchs2015_2016_m. This means unexpected values on Master databases won't be caught by the else->missing mapping.
    Fix: add cchs2011_2012_m, cchs2013_2014_m, cchs2015_2016_m to the else row's databaseStart.

  2. variables.csv missing _m databases for 4 variables (L3, confidence 95)
    Master databases were added in variable_details.csv but not reflected in variables.csv for:

    • SLP_04_A: missing cchs2001_m
    • SLP_04: missing cchs2011_2012_m, cchs2013_2014_m, cchs2015_2016_m
    • SLPG01_A: missing cchs2001_m, cchs2015_2016_m
    • SLPG01_cont: missing cchs2001_m, cchs2011_2012_m, cchs2013_2014_m, cchs2015_2016_m
      This will prevent rec_with_table() from finding these variables when processing Master databases.
  3. SLP_02/SLP_03: cchs2012_s vs cchs2012_m inconsistency (L3, confidence 85)
    variables.csv lists cchs2012_m but variable_details.csv still has cchs2012_s for SLP_02 and SLP_03. The two files should agree. If the intent is Master, both should use cchs2012_m.

Pre-existing (not blocking, for information):

  • SPLG01_C retains deprecated _i databases (cchs2001_i, cchs2015_2016_i)
  • SLPG01_B dummyVariable for [97,99] row uses NA::a instead of expected NA::b
  • SLPG01_cont has typo "convereted" in catStartLabel
  • SLPG01_C listed in variables.csv but has 0 rows in variable_details.csv (orphan entry)
  • cchs2012_s (deprecated _s suffix) present across SLP_02, SLP_03, SLP_04
  • variable_details.csv has 19 trailing empty columns from Excel editing (41 columns vs 22 expected)

GHA status

R-CMD-check failed with 2 ERRORs and 6 WARNINGs. These appear to be pre-existing package-level issues (undefined exports, parse errors) not caused by the sleep worksheet changes. The NAMESPACE commit (7cb75eb) partially addressed this but issues remain.

Checked

Era boundary defaults, databaseStart consistency (variables.csv vs variable_details.csv), PUMF vs Master naming, deprecated _s/_i suffixes, dummyVariable naming, label consistency, and PUMF integration testing.

CEP: ceps/cep-005-sleep/

1. SLP_04 else row: add cchs2011_2012_m, cchs2013_2014_m,
   cchs2015_2016_m (Master dbs were missing from catch-all row)
2. variables.csv: add missing _m databases for SLP_04, SLP_04_A,
   SLPG01_A, SLPG01_cont to match variable_details.csv
3. cchs2012_s -> cchs2012_m in SLP_02, SLP_03, SLP_04
   (deprecated _s suffix replaced in both files)
@DougManuel
Copy link
Copy Markdown
Contributor

All 3 PR-introduced issues from the review have been fixed in daaec8b:

  1. SLP_04 else row: added cchs2011_2012_m, cchs2013_2014_m, cchs2015_2016_m
  2. variables.csv: added missing _m databases for SLP_04, SLP_04_A, SLPG01_A, SLPG01_cont
  3. cchs2012_s → cchs2012_m: replaced deprecated _s suffix in SLP_02, SLP_03, SLP_04 (both files)

Ready to merge. @rafidoole for final checks.

Copy link
Copy Markdown
Collaborator

@rafdoodle rafdoodle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final SLP Variable Changes

Overview

Action Variable Notes
Renamed SLP_02_ASLP_02_cat3 Name now reflects recoding type
Renamed SLP_03_ASLP_03_cat3 Name now reflects recoding type
Renamed SLP_04_ASLP_04_cat3 Name now reflects recoding type
Renamed SLPG01SLPG01_pre2015 Scoped to pre-2015 PUMF cycles
Renamed SLPG01_ASLPG01_cat12 Name now reflects 12-category scale
Added SLPG01_2015plus New variable for 2015+ PUMF cycles
Deleted SLPG01_B Superseded by SLPG01_pre2015
Deleted SLPG01_C Orphan variable (no rows in variable_details.csv); removed
Modified SLP_02, SLP_03, SLP_04 databaseStart/variableStart expanded
Modified SLPG01_cont databaseStart/variableStart expanded; label fixes

variables.csv changes

Variable Field Before After
SLP_02 databaseStart PUMF cycles only Added master file cycles
SLP_03 databaseStart PUMF cycles only Added master file cycles
SLP_04 databaseStart PUMF cycles only Added master file cycles
SLPG01_pre2015 (was SLPG01) databaseStart All cycles Pre-2015 PUMF cycles only
SLPG01_cat12 (was SLPG01_A) databaseStart 2015+ cycles All cycles including cchs2001_p/m via GENA_03
SLPG01_cont databaseStart Subset of cycles Expanded to full set of supported cycles
SLPG01_2015plus (did not exist) New; covers 2015+ PUMF cycles, 11-category scale
SLPG01_B Present Deleted
SLPG01_C Present Deleted (orphan — no corresponding rows in variable_details.csv)

variable_details.csv changes

Renames & dummyVariable fixes

Variable Rows dummyVariable pattern — Before dummyVariable pattern — After
SLP_02_cat3 (was SLP_02_A) 6 SLP_02_A_cat3_* SLP_02_cat3_*
SLP_03_cat3 (was SLP_03_A) 6 SLP_03_A_cat3_* SLP_03_cat3_*
SLP_04_cat3 (was SLP_04_A) 6 SLP_04_A_cat3_* SLP_04_cat3_*
SLPG01_pre2015 (was SLPG01) 14 SLPG01_cat11_* SLPG01_pre2015_cat11_*
SLPG01_cat12 (was SLPG01_A) 15 SLPG01_A_cat12_* SLPG01_cat12_cat12_*

New rows added

Variable Rows Description
SLPG01_2015plus 14 11-category structure mirroring SLPG01_pre2015; covers cchs2015_2016_p and cchs2017_2018_p via [SLPG005]

Rows deleted

Variable Rows Reason
SLPG01_B 14 Superseded by SLPG01_pre2015
SLPG01_C 0 Orphan variable; existed only in variables.csv with no recoding rows
SLPG01_cont block 3 14 Duplicate master-file recoding block; removed

databaseStart / variableStart updates

Variable Change
SLP_02, SLP_03, SLP_04 Updated to match expanded variables.csv cycles
SLPG01_pre2015 Set to pre-2015 PUMF cycles
SLPG01_cat12 Set to all supported cycles including cchs2001_p/m
SLPG01_cont block 1 Updated to match SLPG01_cat12 cycles

Label corrections

Variable Field Before After
SLPG01_pre2015, recStart = 1 catLabel, catLabelLong, catStartLabel <3 hours <2 hours
SLPG01_cont block 1, all 12 category rows catLabel Hours Proper hour ranges (e.g., <2 hours, 2-<3 hours, …, >= 12 hours)
SLPG01_cont block 2, recStart = 1 catLabel <3 hours <2 hours
SLP_03 NA rows dummyVariable SLP_03_cat3_NA SLP_03_cat5_NA

rafdoodle added a commit that referenced this pull request Apr 10, 2026
@rafdoodle
Copy link
Copy Markdown
Collaborator

Changed manually merged to v3 via commit 699dbff. Will close this PR now.

@rafdoodle rafdoodle closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants