Skip to content

feat: modernize design patterns to align with network_wrangler#11

Open
e-lo wants to merge 7 commits intomainfrom
feature/9-modernize-design-patterns
Open

feat: modernize design patterns to align with network_wrangler#11
e-lo wants to merge 7 commits intomainfrom
feature/9-modernize-design-patterns

Conversation

@e-lo
Copy link
Copy Markdown

@e-lo e-lo commented Mar 18, 2026

Closes #9

Summary

Full modernization of cube_wrangler to match network_wrangler's design patterns and requirements.

** DOES NOT change any public APIs, performance or functionality ** but this was stuff that needed to be updated before we did that.

CI/CD pipeline

  • 5 GitHub Actions workflows: push.yml, pullrequest.yml, prepare-release.yml, publish.yml, clean-docs.yml
  • 5 reusable composite actions: setup-python-uv, get-branch-name, build-docs, compare-benchmarks, post-coverage
  • Python 3.10–3.13 test matrix; benchmark comparison and coverage comment on PRs; OIDC trusted publishing to PyPI
  • Removed the old broken install.yml and docs.yml

Pydantic dataclass-based Parameters

  • Replaced flat class with TimePeriodsConfig, CategoriesConfig, and top-level Parameters pydantic dataclasses
  • Path objects throughout (no more strings for file paths); __post_init__ derives all path attributes from base_dir

Pandera schemas as single source of truth for column types

  • CubeLinksTable and CubeNodesTable (cube_wrangler/models/tables.py) replace the old bool_col/int_col/float_col/string_col lists in Parameters
  • coerce_df_to_model() utility (cube_wrangler/utils/models.py) applies schemas with coercion — mirrors validate_df_to_model pattern from network_wrangler
  • convert_int(), convert_bool(), fill_na() in roadway.py replaced by a single convert_types() call; old functions kept as deprecated aliases
  • fillna defaults now derived from actual column dtype (bool→False, object→"", numeric→0) — no hardcoded lists needed

Python 3.10+ type hints

  • from __future__ import annotations added to all source modules
  • Removed from typing import Optional, Dict, List, Union in project.py, roadway.py, transit.py

pathlib not os

  • Replaced all os.path.join(), os.path.splitext(), os.path.basename() calls in project.py, roadway.py, and transit.py with Path equivalents
  • import os removed from all three modules

Pandas 3+ safe

  • Replaced all inplace=True calls with assignment-style operations across roadway.py (8 instances) and project.py (4 instances)

Docs

  • docs/index.md: uv-first install instructions, quick start, ecosystem overview
  • docs/development.md: setup, testing, benchmarks, versioning/release guide
  • docs/api.md: auto-generated API reference via mkdocstrings for all modules
  • mkdocs.yml: updated with include-markdown plugin and Development nav entry
  • README.md: removed conda, updated to Python 3.10+, uv-first

Tests & benchmarks

  • tests/conftest.py: session-scoped fixtures mirroring network_wrangler pattern
  • tests/test_benchmark.py: benchmark suite with 3 classes covering log parsing, link changes, and property scoping
  • benchmarks/: standalone benchmark scripts and performance analysis

Test plan

  • uv run pytest -m "not benchmark" passes locally (3/3 non-benchmark tests pass)
  • CI matrix (Python 3.10–3.13) passes on push
  • Benchmark comparison runs on PR against main

🤖 Generated with Claude Code

…#9)

- CI/CD: full GitHub Actions pipeline (push, pullrequest, prepare-release,
  publish, clean-docs) with 5 reusable composite actions; Python 3.10–3.13
  test matrix; benchmark comparison and coverage comments on PRs
- Parameters: rewrite as pydantic dataclass hierarchy (TimePeriodsConfig,
  CategoriesConfig, Parameters); Path objects throughout; __post_init__ for
  derived path attributes
- Pandera schemas: CubeLinksTable and CubeNodesTable DataFrameModels as single
  source of truth for column types; coerce_df_to_model() utility replaces
  manual astype() loops and convert_int/convert_bool/fill_na functions
- Python 3.10+ type hints: from __future__ import annotations in all source
  modules; remove typing.Optional/Dict/List/Union in favour of X | Y syntax
- pathlib: replace all os.path.join/splitext/basename with Path operations
  in project.py, roadway.py, and transit.py
- pandas 3+: replace all inplace=True calls with assignment-style operations
  across roadway.py and project.py
- fillna logic now derives fill value from actual column dtype (bool→False,
  object→"", numeric→0) instead of hardcoded parameter lists
- Docs: uv-based install, development guide, API reference via mkdocstrings;
  remove conda references
- Tests: session-scoped fixtures in conftest.py; benchmark test suite

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
e-lo and others added 6 commits March 19, 2026 09:42
- RUF002: replace EN DASH with hyphen-minus in docstrings (tables.py, parameters.py)
- D100/D102/D103/D107: add missing module and method docstrings (project.py,
  roadway.py, util.py, transit.py)
- D205/D417/D419: fix docstring formatting (blank lines, missing Args entries,
  empty docstrings)
- RUF012: annotate mutable class attributes with ClassVar in Project
- PTH123: replace open() with Path.open() in project.py, roadway.py, transit.py
- PLR2004: extract magic value 20 as _SMALL_RECS constant in utils/models.py
- B018: remove useless expression in project.py
- PLC0206: use .items() when iterating dict in roadway.py
- B023: fix lambda loop-variable capture with default argument pattern in roadway.py
- SIM102: collapse nested if statements in project.py
- PLR0912/PLR0915: add to global ignore (pre-existing large legacy functions)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaced parameters.bool_col reference (removed when NetworkColumnsConfig
was eliminated) with pd.api.types.is_bool_dtype() inspection on the actual
DataFrame, consistent with the pattern used in project.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix NameError in Project.__init__: `self.card_data = Dict[str, ...]`
  was a type annotation used as a runtime value; corrected to `= {}`
- Fix bug in util.column_name_to_parts: `parameters.categories.keys()`
  failed at runtime (CategoriesConfig is a dataclass, not a dict);
  corrected to `parameters.categories.as_dict().keys()`
- Add tests/test_parameters.py: 17 tests covering Parameters defaults,
  time_period_to_time, categories, properties_to_split, path shortcuts,
  custom base_dir, and nested config classes
- Add tests/test_util.py: covers get_shared_streets_intersection_hash,
  hhmmss_to_datetime, secs_to_datetime, shorten_name, column_name_to_parts
- Add tests/test_models.py: covers coerce_df_to_model with CubeLinksTable
  and CubeNodesTable (int/bool/float coercion, extra cols, attrs, errors)
- Add tests/test_project.py: covers Project.read_logfile (single file,
  list of files, column bracket-suffix stripping) and Project instantiation
- Add tests/test_transit.py: covers StandardTransit instantiation and
  time_to_cube_time_period for all time periods using mock feed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Prefix unused unpacked variables in project.py with _ (RUF059)
- Inline unnecessary lambda wrappers: lambda x: list(x) → list,
  lambda x: _final_op(x) → _final_op, lambda x: str(x) → str (PLW0108)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove improved_process_link_changes from tests/utils/link_changes.py;
  prototype improvements belong in a separate issue, not the benchmark baseline
- Remove _parse_change_df from test_benchmark.py; it reimplemented application
  logic from project.py. Replaced with a simple fixture that calls
  Project.read_logfile and filters to link rows directly
- Remove TestBenchmarkPropForScope; it tests network_wrangler internals
  (prop_for_scope requires a fully-loaded RoadwayNetwork.links_df, not raw
  JSON test data) - noted in PERFORMANCE_ANALYSIS.md instead
- Fix RUF002 (ambiguous multiplication sign) in link_changes.py docstring
- Fix PTH123 (open -> Path.open) in tests/utils/logfile.py
- Add PERFORMANCE_ANALYSIS.md documenting the three bottlenecks and
  proposed fixes for a future improvement issue

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add !tests/data/*.log exception to .gitignore so pre-generated log files
  are tracked; CI was failing because it tried to regenerate them but the
  generator used columns (distance, lanes) absent in the stpaul network
- Fix LOG_LINK_COLS in tests/utils/logfile.py: replace distance (not present)
  and lanes (always null/scoped) with length, which is present on all links
- Fix _CHANGES: replace lanes increment with drive_access boolean toggle
- Simplify load_usable_links: require all LOG_LINK_COLS to be non-null;
  add explicit error if usable_links is empty to catch this early
- Regenerate all five log files (10/50/100/500/1000 rows) with corrected generator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Coverage

Coverage Report
FileStmtsMissCoverMissing
__init__.py50100% 
logger.py211719%23, 26, 28–31, 34, 36–39, 42, 44–48
methods.py220%4, 6
parameters.py56296%240, 245
project.py43935918%123–125, 128, 136, 147, 192–199, 201–203, 205–210, 212–214, 216–217, 220–223, 226–229, 232–235, 238–241, 244–247, 250–255, 257–269, 271–273, 275–281, 286, 289–290, 294–296, 298, 369, 374–375, 377–378, 380–381, 386, 389, 393, 397–398, 402, 406, 413, 415–418, 423, 427, 431–432, 440, 446–450, 457–458, 460, 462–463, 465, 468, 470, 480, 490–492, 495, 497, 499–510, 512, 514, 516, 520–521, 527–529, 534, 536–537, 539, 541, 543–547, 556, 560, 565–566, 568, 572, 574, 578, 580, 584, 586–587, 590–591, 597, 601, 603, 605–607, 609, 616–617, 619–626, 628–630, 632, 634–636, 639, 641, 643, 645, 647–649, 651–653, 660–662, 664, 667, 669, 671, 674, 679–682, 684–685, 689–690, 692, 696–698, 700, 704–705, 710–711, 715–717, 720–721, 725, 727, 729, 731, 733–734, 737–738, 741–742, 744–746, 750, 752, 760–767, 769–771, 773, 777–778, 781–784, 787–793, 797–798, 801–807, 809–811, 813–815, 817, 826, 828–831, 833, 835–837, 841–843, 848, 850, 858, 863, 875, 880–882, 885–886, 892, 894–895, 897–898, 900, 902–903, 905, 912–915, 917–919, 921–922, 927, 930, 935–937, 942, 946–947, 951–956, 958–959, 961, 963–964, 969–973, 975–976, 980, 990
roadway.py3252949%43, 45, 47–48, 50–52, 57–58, 61–66, 69, 77–79, 86–87, 89, 91, 106, 108–110, 114, 117, 119, 123–125, 127–128, 130, 136–137, 139, 160–162, 166, 169, 171, 175, 178–180, 182, 185, 187–189, 191–193, 199–200, 202, 204–205, 207, 227, 229–231, 234, 236, 239, 241, 245–246, 248, 268, 270–272, 276, 279, 281, 285, 287–288, 290, 310, 312–314, 318, 321, 323, 327, 329–330, 332, 357, 359, 363, 365, 367, 369–372, 374–377, 379, 386, 388–389, 393, 395, 399, 403, 405, 410, 412, 418–420, 422, 442–445, 460–461, 471–472, 482–483, 506, 508, 512, 516, 518, 522–524, 526, 528, 532–538, 540–544, 546, 548, 582–583, 587, 591, 594, 598, 604, 611, 615, 617, 619, 621, 623, 627–628, 633–634, 640, 642–644, 646–649, 651–652, 655–656, 696, 700, 703, 707, 709, 715, 721, 723, 725, 731, 737, 743, 748–749, 751–752, 755, 760, 766, 770–771, 773, 775, 778–780, 783, 786–787, 789–790, 793–794, 796, 799, 801–802, 804, 807–809, 812–816, 818–819, 821, 823, 825–830, 832–833, 835, 837, 839–842, 844–853, 855–856, 872, 875, 882–884, 886–891, 893, 897–900, 902, 922–927, 953, 956, 959, 970, 979, 985, 991, 995
transit.py1134361%58, 86–88, 106–108, 120–121, 126–127, 129–130, 184–185, 187–188, 191–192, 194, 213–214, 221–222, 228, 231, 238, 240, 243, 248, 255, 259, 261–263, 267–270, 274, 280–282
util.py55885%68, 70, 82, 93–94, 99–100, 119
models
   __init__.py20100% 
   tables.py760100% 
utils
   __init__.py20100% 
   models.py32778%26, 69–70, 73–74, 80–81
TOTAL112873235% 

Tests Skipped Failures Errors Time
68 0 💤 0 ❌ 0 🔥 4.405s ⏱️

@github-actions
Copy link
Copy Markdown
Contributor

Benchmark Comparison

Comparing PR branch to base branch main

No previous benchmark found in base branch. This will serve as the baseline.

@e-lo e-lo requested a review from i-am-sijia March 20, 2026 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update design patterns and requirements to be consistent with Network Wrangler

1 participant