hearing and vision #160

Closed
caitlink12 wants to merge 8 commits into feature/v3.0.0-validation-infrastructure from hearing

@caitlink12

The hearing variables HUI06, HUI07, HUI07A, HUI08, and HUI09 were added to the variables and variable_details sheets. The master survey cycles for HUI06, HUI07, HUI08, and HUI09 are 2001, 2003, 2009-2010, and 2013-2014. For HUI07A, the master survey cycles are 2001, 2003, 2009-2010, 2013-2014, and 2017-2018.

Note: for survey cycle 2017-2018, the hearing variables are captured using the Washington Group Measure for disability. As such, the 2017-2018 variable WDM_101 is taken as an equivalent variable to HUI07A captured in previous cycles and harmonized to HUI07A.

changed master survey cycle suffixes from _i to _m for HUI06, HUI07, HUI07_A, HUI08, HUI09
@caitlink12 caitlink12 requested review from DougManuel and yulric and removed request for yulric January 21, 2026 18:36
Fix 6 issues in PR #160 HUI worksheets:
- I1: HUI06 databaseStart typo cchs_2009_2010_m -> cchs2009_2010_m
- I2: HUI06 variableStart typo HUAC_06 -> HUIC_06
- I3: HUI09 wrong source variable [HUI_08] -> [HUI_09]
- I4: All HUI06-09 rows had underscore typo in database names
- I5: HUI07A missing 2005 and 2007-2008 cycles
- I6: HUI09 missing 2011-2012 cycle

Add CEP-004 documentation with validation showing:
- 2003 PUMF has no Ontario data for HUI hearing/vision
- HUI 6-point scale vs Washington Group 4-point scale (2017-2018)
- Integration testing with rec_with_table()

Published: https://dmanuel.quarto.pub/cep-004-hearing-and-vision-variables
@DougManuel
Contributor

Summary of review

Hearing variables (from your PR):

HUI06, HUI07, HUI07A, HUI08, HUI09 - individual hearing items
HUIGHER, HUIGVIS - grouped/derived variables

Vision variables (added):

  • These were outside the scope of this PR, but I reviewed them at the same time

HUI01, HUI02, HUI03, HUI04, HUI05 - individual vision items

Fixes applied to worksheets

Identified and corrected 6 issues in the original PR:

  • HUI06 databaseStart typo cchs_2009_2010_m changed to cchs2009_2010_m
  • HUI06 variableStart typo HUAC_06 changed to HUIC_06
  • HUI09 uses wrong source variable [HUI_08] changed to [HUI_09]
  • All HUI06-09 rows have underscore typo in database names - fixed across all rows
  • HUI07A missing 2005 and 2007-2008 cycles - added missing cycles
  • HUI09 missing 2011-2012 cycle - added cycle
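
Several of the fixes above are naming typos in databaseStart, which a simple pattern check could catch before review. The following is a minimal sketch, assuming a naming pattern inferred from the database names quoted in this thread; the function name `invalid_database_names` is hypothetical and not part of the package.

```python
import re

# Assumed naming pattern, inferred from names quoted in this thread such as
# cchs2001_m, cchs2009_2010_m, and cchs2012_p: "cchs", a 4-digit year,
# an optional second 4-digit year, then a one-letter suffix.
DB_NAME = re.compile(r"cchs\d{4}(_\d{4})?_[a-z]")


def invalid_database_names(database_start: str) -> list[str]:
    """Return the names in a comma-separated databaseStart cell that break the pattern."""
    names = [name.strip() for name in database_start.split(",") if name.strip()]
    return [name for name in names if not DB_NAME.fullmatch(name)]


cell = "cchs2001_m, cchs2003_m, cchs_2009_2010_m, cchs2013_2014_m"
print(invalid_database_names(cell))  # ['cchs_2009_2010_m']
```

A check like this only catches malformed names; it would not detect a well-formed but missing cycle, such as the HUI07A and HUI09 gaps listed above.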

Analysis of availability by province, with a focus on older adults in Ontario, to support the study in progress.

During validation, we discovered that the 2003 PUMF has NO Ontario data for HUI hearing/vision variables (HUICGHER, HUICGVIS). The 2003 cycle only contains data from Atlantic provinces and Quebec. This is a data collection artefact, not a worksheet error, but it's important for users planning Ontario-based analyses.
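
An availability check of this kind can be sketched as a count of non-missing values per cycle and province. The records, the `province` column, and the `availability` helper below are invented for illustration; they are assumptions, not the real PUMF layout or the analysis actually run.

```python
from collections import Counter

# Toy records standing in for PUMF rows. All values are invented for
# illustration; the column names are assumptions, not the real file layout.
rows = [
    {"cycle": "2003", "province": "Quebec", "HUIGHER": 1},
    {"cycle": "2003", "province": "Nova Scotia", "HUIGHER": 2},
    {"cycle": "2003", "province": "Ontario", "HUIGHER": None},
    {"cycle": "2001", "province": "Ontario", "HUIGHER": 1},
]


def availability(rows, variable):
    """Count non-missing values of `variable` per (cycle, province)."""
    counts = Counter()
    for row in rows:
        key = (row["cycle"], row["province"])
        # Adding a bool increments the count by 1 only when a value is present.
        counts[key] += row[variable] is not None
    return dict(counts)


table = availability(rows, "HUIGHER")
print(table[("2003", "Ontario")])  # 0
```

A zero for a province and cycle combination flags a gap like the 2003 Ontario case before an analysis is planned around it.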

Documentation

Published detailed documentation at:
https://dmanuel.quarto.pub/cep-004-hearing-and-vision-variables

This includes:

  • Variable concordance across CCHS cycles
  • Semantic mapping between HUI (6-point scale) and Washington Group (4-point scale)
  • Integration testing with rec_with_table()
  • Ontario availability matrices by age group

Next steps

Ready for merge. Future iterations may add:

  • Standalone WDM_005 (vision) and WDM_010 (hearing) variables for 2017-2018
  • HUIGHER_cat4 and HUIGVIS_cat4 derived variables for cross-era harmonisation

Contributor

@DougManuel DougManuel left a comment

See comments.

OK to merge after others have reviewed. @rafdoodle

@DougManuel DougManuel deleted the branch feature/v3.0.0-validation-infrastructure January 27, 2026 16:02
@DougManuel DougManuel closed this Jan 27, 2026
Removed exports for functions that don't exist:
- assess_binge_drinking
- assess_drinking_risk_long
- assess_drinking_risk_short
- categorize_bmi

These exports were causing R CMD check to fail with
"undefined exports" errors.
@rafdoodle rafdoodle reopened this Apr 1, 2026
Gem-verified corrections to all 5 hearing variables (HUI06-HUI09, HUI07A):

Coverage expansion (25 → 74 variable_details rows):
- Add cchs2005_m (HUIE_*) and cchs2007_2008_m (HUI_*) to all variables
- Add cchs2015_2016_m and cchs2019_2020_m with renumbered HUI_030-050
- Add cchs2022_m (WDM_10) to HUI07A
- Add cchs2023_m (HUIHEAR1/HUIHEAR2) to HUI08/HUI09

Semantic fixes:
- Fix WDM_101 → WDM_010 (typo; WDM_101 does not exist)
- Fix [7,9] → [7,8,9] (code 8=Refusal was missing from all variables)
- Era-split HUI07A into 4 blocks: pre-2015 binary, post-2015 binary,
  WDM_010 (2017-2018), WDM_10 (2022) with correct 4-point→binary
  mapping (1,2,3→1 able; 4→2 unable)

Variables.csv updated with expanded databaseStart/variableStart for all 5.
Gem verification prompts (v1 and v2) and extracted hearing variable
CSVs used for three-way triangulation during PR #160 review.
@DougManuel
Contributor

Code review

Reviewed 5 hearing variables (HUI06, HUI07, HUI07A, HUI08, HUI09) for Master across all available cycles (2001–2023).

Verification method

Three-way triangulation: cchs-metadata DuckDB → Google NotebookLM Gem (against ~239 StatCan PDFs) → implementation. Two rounds of Gem verification (v1 + v2, 8 specific items). All findings confirmed before implementation.

Issues found and fixed

Coverage gaps (Gem-confirmed):

  1. cchs2005_m (HUIE_) and cchs2007_2008_m (HUI_) missing from HUI06–HUI09
  2. cchs2015_2016_m and cchs2019_2020_m missing — HUI questions renumbered to HUI_030–050 in these cycles
  3. cchs2022_m (WDM_10) missing from HUI07A
  4. cchs2023_m (HUIHEAR1/HUIHEAR2) missing from HUI08/HUI09

Semantic fixes (Gem-confirmed):

  5. WDM_101 → WDM_010: WDM_101 does not exist in any data dictionary
  6. [7,9] → [7,8,9]: code 8 (Refusal) confirmed present in all HUI hearing questions
  7. HUI07A WDM blocks needed era-split with correct 4-point→binary mapping: "some difficulty" and "a lot of difficulty" = still able to hear (→1), only "cannot do at all" = unable (→2)

Result: 25 → 74 variable_details rows. HUI07A has 4 blocks (pre-2015 binary, post-2015 binary, WDM_010, WDM_10). HUI08/HUI09 have 3 blocks each (pre-2015, post-2015, 2023).
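
The 4-point→binary mapping and the [7,8,9] missing codes described above can be written out as a small lookup. This is a sketch of the recoding logic only, not the worksheet implementation; the function name `recode_wdm_hearing` is hypothetical, and the `NA::b` tag for missing values is an assumption about how the worksheets label them.

```python
# Recoding of the 4-point Washington Group items (WDM_010, WDM_10) into the
# binary HUI07A coding described in this review: 1, 2, 3 -> 1 (able),
# 4 -> 2 (unable).
WDM_TO_BINARY = {1: 1, 2: 1, 3: 1, 4: 2}

# Codes treated as missing after the [7,9] -> [7,8,9] fix. The "NA::b" tag
# is an assumption about how missing values are labelled in the worksheets.
MISSING_CODES = {7, 8, 9}


def recode_wdm_hearing(value):
    """Map a WDM 4-point response to the binary HUI07A value."""
    if value in MISSING_CODES:
        return "NA::b"
    if value in WDM_TO_BINARY:
        return WDM_TO_BINARY[value]
    raise ValueError(f"unexpected WDM code: {value!r}")


print([recode_wdm_hearing(v) for v in (1, 2, 3, 4, 8)])  # [1, 1, 1, 2, 'NA::b']
```

Raising on unknown codes, rather than passing them through, makes an omission like the missing code 8 visible as soon as the data contains it.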

Checked: era boundary defaults, databaseStart consistency, naming conventions, known error patterns, WDM semantic mapping.

CEP: ceps/cep-004-hearing/

@DougManuel
Contributor

Reviewed and ready to merge. @rafdoodle for final checks.

Commits added:

  • Coverage expansion (25→74 rows): Added cchs2005_m, cchs2007_2008_m, cchs2015_2016_m, cchs2019_2020_m, cchs2022_m, cchs2023_m across all 5 hearing variables
  • Semantic fixes: WDM_101→WDM_010 (typo), [7,9]→[7,8,9] (missing refusal code), HUI07A era-split for correct WDM 4-point→binary mapping
  • CEP-004 artifacts: Gem verification prompts and extracted CSVs

All changes verified against StatCan PDFs via two rounds of NotebookLM Gem review.

@rafdoodle rafdoodle changed the title from "hearing" to "hearing and vision" Apr 10, 2026
Collaborator

@rafdoodle rafdoodle left a comment

Final HUI Hearing and Vision Variable Changes (v3)

This comment summarizes all changes made to Health Utility Index (HUI) variables across variables.csv and variable_details.csv.


Overview

| Variable | Type | Change |
|---|---|---|
| HUI_01–HUI_05 | Vision (individual) | New; added to both worksheets |
| HUI_06–HUI_09, HUI_07A | Hearing (individual) | Renamed + databaseStart/variableStart updated |
| HUIGHER | Hearing derived | Block 2 rebuilt; block 1 cleaned |
| HUIGVIS | Vision derived | Block 2 rebuilt; block 1 cleaned |

HUI_01–HUI_05 — Vision Individual Variables (New)

Five new vision variables added, mirroring the structure of the existing HUI_06–09 hearing variables.

| Variable | Description | Cycles (databaseStart) |
|---|---|---|
| HUI_01 | Read newsprint without glasses/contacts | cchs2001_m–cchs2019_2020_m (8 cycles excl. 2011-2012, 2017-2018, 2022-2023) |
| HUI_02 | Read newsprint with glasses/contacts | Same as HUI_01 |
| HUI_03 | Able to see at all | Same as HUI_01 + WDM era-split for 2017-2018 (WDM_005) and 2022 (WDM_05) |
| HUI_04 | Recognize friend without glasses/contacts | Same as HUI_01 + cchs2023_m (HUIVIS1) |
| HUI_05 | Recognize friend with glasses/contacts | Same as HUI_04 + cchs2011_2012_m, cchs2012_m |

Source variable mapping across cycles:

| Cycle range | Source variable prefix |
|---|---|
| cchs2001_m | HUIA_0X |
| cchs2003_m | HUIC_0X |
| cchs2005_m | HUIE_0X |
| cchs2007_2008_m – cchs2013_2014_m | HUI_0X (default) |
| cchs2009_m, cchs2010_m | HUI_0X (default) |
| cchs2015_2016_m, cchs2019_2020_m | HUI_005–HUI_025 (renumbered) |
| cchs2017_2018_m (HUI_03 only) | WDM_005 (4-pt scale → binary) |
| cchs2022_m (HUI_03 only) | WDM_05 (4-pt scale → binary) |
| cchs2023_m (HUI_04) | HUIVIS1 |
| cchs2023_m (HUI_05) | HUIVIS2 |

variable_details.csv structure:

  • HUI_01, HUI_02: 5 rows (single merged block, binary recoding)
  • HUI_03: 19 rows (main block 5 rows + WDM 2017-2018 block 7 rows + WDM 2022 block 7 rows)
  • HUI_04, HUI_05: 5 rows (single merged block including 2023)

Notes:

  • WDM_005/WDM_05 use a 4-point severity scale recoded to binary: values 1–3 → Yes (able), 4 → No (unable)
  • version = 3.0.0, lastUpdated = 2026-04-10 set on all HUI_01–05 rows
  • variableStartLabel set for all five variables

HUI_06–HUI_09, HUI_07A — Hearing Individual Variables (Updated)

Previously named HUI06, HUI07, HUI07A, HUI08, HUI09 — renamed to HUI_06, HUI_07, HUI_07A, HUI_08, HUI_09.

databaseStart changes:

| Variable | Cycles added | Cycles removed |
|---|---|---|
| HUI_06, HUI_07 | cchs2005_m, cchs2007_2008_m, cchs2015_2016_m, cchs2019_2020_m, cchs2009_m, cchs2010_m | cchs2011_2012_m |
| HUI_07A | cchs2005_m, cchs2007_2008_m, cchs2015_2016_m, cchs2019_2020_m, cchs2017_2018_m, cchs2022_m, cchs2009_m, cchs2010_m | cchs2011_2012_m |
| HUI_08 | cchs2005_m, cchs2007_2008_m, cchs2015_2016_m, cchs2019_2020_m, cchs2023_m, cchs2009_m, cchs2010_m | cchs2011_2012_m |
| HUI_09 | cchs2005_m, cchs2007_2008_m, cchs2015_2016_m, cchs2019_2020_m, cchs2023_m, cchs2009_m, cchs2010_m, cchs2012_m | none |

Coverage gaps confirmed by Doug: cchs2005_m (HUIE_ prefix) and cchs2007_2008_m were previously missing. cchs2015_2016_m and cchs2019_2020_m were missing because HUI questions were renumbered to HUI_030–HUI_050 in those cycles. cchs2011_2012_m was confirmed absent for HUI_06–08 (present only in HUI_09).

Source variable mapping (new cycles):

| Cycle | Source variable | Applies to |
|---|---|---|
| cchs2005_m | HUIE_0X | All |
| cchs2015_2016_m, cchs2019_2020_m | HUI_030–HUI_050 (renumbered) | All |
| cchs2017_2018_m | WDM_010 (4-pt → binary) | HUI_07A only |
| cchs2022_m | WDM_10 (4-pt → binary) | HUI_07A only |
| cchs2023_m | HUIHEAR1 | HUI_08 |
| cchs2023_m | HUIHEAR2 | HUI_09 |

variable_details.csv structure after consolidation:

  • HUI_06, HUI_07: 5 rows (single block — 2015-2016/2019-2020 merged into main block)
  • HUI_07A: 19 rows (main block 5 rows + WDM 2017-2018 block 7 rows + WDM 2022 block 7 rows)
  • HUI_08: 5 rows (single block including 2023)
  • HUI_09: 5 rows (single block including 2011-2012 and 2023)

Other fixes:

  • variableStartLabel corrected for HUI_07, HUI_07A, HUI_08, HUI_09 (previously all inherited HUI_06's label)
  • Labels updated to HUI Hearing ability - {description} format in variables.csv

HUIGHER — Hearing Derived Variable (Updated)

variable_details.csv:

| Block | Before | After |
|---|---|---|
| Block 1 (_p cycles) | Included cchs2013_2014_i, cchs2012_p | cchs2013_2014_i removed; cchs2012_p removed |
| Block 2 (_m cycles) | cchs2009_s, cchs2010_s, cchs2012_s, cchs2009_2010_i (old _s/_i suffix) | Replaced with full _m cycle list |

Block 2 new databaseStart: cchs2001_m, cchs2003_m, cchs2005_m, cchs2007_2008_m, cchs2009_2010_m, cchs2009_m, cchs2010_m, cchs2013_2014_m, cchs2015_2016_m, cchs2019_2020_m, cchs2023_m

Block 2 new variableStart: cchs2001_m::HUIADHER, cchs2003_m::HUICDHER, cchs2005_m::HUIEDHER, cchs2015_2016_m::HUIDVHER, cchs2019_2020_m::HUIDVHER, cchs2023_m::HUIDVHER, [HUIDHER]
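
To make the cycle::SOURCE notation concrete, here is a minimal sketch of how such a variableStart cell could be resolved for a given cycle. It assumes the bracketed entry is the default source for cycles without an explicit cycle::SOURCE pair; the helper names `parse_variable_start` and `source_for` are hypothetical and not part of the package.

```python
import re


def parse_variable_start(variable_start: str):
    """Split a variableStart cell into per-cycle sources and a bracketed default."""
    per_cycle = {}
    default = None
    for token in (part.strip() for part in variable_start.split(",")):
        if not token:
            continue
        bracketed = re.fullmatch(r"\[(\w+)\]", token)
        if bracketed:
            # Assumption: "[NAME]" is the source for cycles not listed explicitly.
            default = bracketed.group(1)
        elif "::" in token:
            cycle, source = token.split("::", 1)
            per_cycle[cycle] = source
        else:
            raise ValueError(f"unrecognised variableStart entry: {token!r}")
    return per_cycle, default


def source_for(cycle: str, variable_start: str):
    """Return the source variable name that applies to `cycle`."""
    per_cycle, default = parse_variable_start(variable_start)
    return per_cycle.get(cycle, default)


cell = (
    "cchs2001_m::HUIADHER, cchs2003_m::HUICDHER, cchs2005_m::HUIEDHER, "
    "cchs2015_2016_m::HUIDVHER, cchs2019_2020_m::HUIDVHER, "
    "cchs2023_m::HUIDVHER, [HUIDHER]"
)
print(source_for("cchs2003_m", cell))       # HUICDHER
print(source_for("cchs2013_2014_m", cell))  # HUIDHER
```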


HUIGVIS — Vision Derived Variable (Updated)

Same fixes as HUIGHER, applied symmetrically for vision.

variable_details.csv:

| Block | Before | After |
|---|---|---|
| Block 1 (_p cycles) | Included cchs2012_p | cchs2012_p removed |
| Block 2 (_m cycles) | cchs2009_s, cchs2010_s, cchs2012_s (old _s suffix) | Replaced with full _m cycle list |

Block 2 new databaseStart: cchs2001_m, cchs2003_m, cchs2005_m, cchs2007_2008_m, cchs2009_2010_m, cchs2009_m, cchs2010_m, cchs2013_2014_m, cchs2015_2016_m, cchs2019_2020_m, cchs2023_m

Block 2 new variableStart: cchs2001_m::HUIADVIS, cchs2003_m::HUICDVIS, cchs2005_m::HUIEDVIS, cchs2015_2016_m::HUIDVVIS, cchs2019_2020_m::HUIDVVIS, cchs2023_m::HUIDVVIS, [HUIDVIS]


General Convention Applied

| Rule | Action |
|---|---|
| cchs2009_2010_m present | cchs2009_m and cchs2010_m also added |
| cchs2011_2012_m present | cchs2012_m also added |
| cchs2012_p present without cchs2011_2012_p | cchs2012_p removed |
| Redundant era-split blocks with identical recoding | Collapsed into single block using cycle::SOURCE notation in variableStart |
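
The first three rules above are mechanical enough to express as a small function over a databaseStart list. This is an illustrative sketch only: the function name `apply_cycle_conventions` is hypothetical, and placing the added names directly after their parent cycle is an assumption, since the convention does not specify ordering.

```python
def apply_cycle_conventions(database_start):
    """Apply the three cycle rules from the convention table to a list of database names."""
    names = list(database_start)

    def add_after(anchor, extra):
        # Insert `extra` right after `anchor` when `anchor` is present and
        # `extra` is not. The position is an assumption made for readability.
        if anchor in names and extra not in names:
            names.insert(names.index(anchor) + 1, extra)

    # cchs2009_2010_m present: also add the single-year files.
    add_after("cchs2009_2010_m", "cchs2010_m")
    add_after("cchs2009_2010_m", "cchs2009_m")
    # cchs2011_2012_m present: also add cchs2012_m.
    add_after("cchs2011_2012_m", "cchs2012_m")
    # cchs2012_p present without cchs2011_2012_p: drop cchs2012_p.
    if "cchs2012_p" in names and "cchs2011_2012_p" not in names:
        names.remove("cchs2012_p")
    return names


print(apply_cycle_conventions(["cchs2007_2008_m", "cchs2009_2010_m", "cchs2012_p"]))
# ['cchs2007_2008_m', 'cchs2009_2010_m', 'cchs2009_m', 'cchs2010_m']
```

The fourth rule, collapsing era-split blocks with identical recoding, needs a comparison of variable_details rows and is not covered by this sketch.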

rafdoodle added a commit that referenced this pull request Apr 11, 2026
@rafdoodle
Collaborator

rafdoodle commented Apr 11, 2026

Changes manually merged to v3 via commit 1033a21. Will close this PR now.

@rafdoodle rafdoodle closed this Apr 11, 2026
@rafdoodle rafdoodle deleted the hearing branch April 11, 2026 04:44