feat: modernize design patterns to align with network_wrangler#11
Open
feat: modernize design patterns to align with network_wrangler#11
Conversation
…#9) - CI/CD: full GitHub Actions pipeline (push, pullrequest, prepare-release, publish, clean-docs) with 5 reusable composite actions; Python 3.10–3.13 test matrix; benchmark comparison and coverage comments on PRs - Parameters: rewrite as pydantic dataclass hierarchy (TimePeriodsConfig, CategoriesConfig, Parameters); Path objects throughout; __post_init__ for derived path attributes - Pandera schemas: CubeLinksTable and CubeNodesTable DataFrameModels as single source of truth for column types; coerce_df_to_model() utility replaces manual astype() loops and convert_int/convert_bool/fill_na functions - Python 3.10+ type hints: from __future__ import annotations in all source modules; remove typing.Optional/Dict/List/Union in favour of X | Y syntax - pathlib: replace all os.path.join/splitext/basename with Path operations in project.py, roadway.py, and transit.py - pandas 3+: replace all inplace=True calls with assignment-style operations across roadway.py and project.py - fillna logic now derives fill value from actual column dtype (bool→False, object→"", numeric→0) instead of hardcoded parameter lists - Docs: uv-based install, development guide, API reference via mkdocstrings; remove conda references - Tests: session-scoped fixtures in conftest.py; benchmark test suite Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 tasks
- RUF002: replace EN DASH with hyphen-minus in docstrings (tables.py, parameters.py) - D100/D102/D103/D107: add missing module and method docstrings (project.py, roadway.py, util.py, transit.py) - D205/D417/D419: fix docstring formatting (blank lines, missing Args entries, empty docstrings) - RUF012: annotate mutable class attributes with ClassVar in Project - PTH123: replace open() with Path.open() in project.py, roadway.py, transit.py - PLR2004: extract magic value 20 as _SMALL_RECS constant in utils/models.py - B018: remove useless expression in project.py - PLC0206: use .items() when iterating dict in roadway.py - B023: fix lambda loop-variable capture with default argument pattern in roadway.py - SIM102: collapse nested if statements in project.py - PLR0912/PLR0915: add to global ignore (pre-existing large legacy functions) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaced parameters.bool_col reference (removed when NetworkColumnsConfig was eliminated) with pd.api.types.is_bool_dtype() inspection on the actual DataFrame, consistent with the pattern used in project.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix NameError in Project.__init__: `self.card_data = Dict[str, ...]`
was a type annotation used as a runtime value; corrected to `= {}`
- Fix bug in util.column_name_to_parts: `parameters.categories.keys()`
failed at runtime (CategoriesConfig is a dataclass, not a dict);
corrected to `parameters.categories.as_dict().keys()`
- Add tests/test_parameters.py: 17 tests covering Parameters defaults,
time_period_to_time, categories, properties_to_split, path shortcuts,
custom base_dir, and nested config classes
- Add tests/test_util.py: covers get_shared_streets_intersection_hash,
hhmmss_to_datetime, secs_to_datetime, shorten_name, column_name_to_parts
- Add tests/test_models.py: covers coerce_df_to_model with CubeLinksTable
and CubeNodesTable (int/bool/float coercion, extra cols, attrs, errors)
- Add tests/test_project.py: covers Project.read_logfile (single file,
list of files, column bracket-suffix stripping) and Project instantiation
- Add tests/test_transit.py: covers StandardTransit instantiation and
time_to_cube_time_period for all time periods using mock feed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Prefix unused unpacked variables in project.py with _ (RUF059) - Inline unnecessary lambda wrappers: lambda x: list(x) → list, lambda x: _final_op(x) → _final_op, lambda x: str(x) → str (PLW0108) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove improved_process_link_changes from tests/utils/link_changes.py; prototype improvements belong in a separate issue, not the benchmark baseline - Remove _parse_change_df from test_benchmark.py; it reimplemented application logic from project.py. Replaced with a simple fixture that calls Project.read_logfile and filters to link rows directly - Remove TestBenchmarkPropForScope; it tests network_wrangler internals (prop_for_scope requires a fully-loaded RoadwayNetwork.links_df, not raw JSON test data) - noted in PERFORMANCE_ANALYSIS.md instead - Fix RUF002 (ambiguous multiplication sign) in link_changes.py docstring - Fix PTH123 (open -> Path.open) in tests/utils/logfile.py - Add PERFORMANCE_ANALYSIS.md documenting the three bottlenecks and proposed fixes for a future improvement issue Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add !tests/data/*.log exception to .gitignore so pre-generated log files are tracked; CI was failing because it tried to regenerate them but the generator used columns (distance, lanes) absent in the stpaul network - Fix LOG_LINK_COLS in tests/utils/logfile.py: replace distance (not present) and lanes (always null/scoped) with length, which is present on all links - Fix _CHANGES: replace lanes increment with drive_access boolean toggle - Simplify load_usable_links: require all LOG_LINK_COLS to be non-null; add explicit error if usable_links is empty to catch this early - Regenerate all five log files (10/50/100/500/1000 rows) with corrected generator Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
Contributor
Benchmark ComparisonComparing PR branch to base branch No previous benchmark found in base branch. This will serve as the baseline. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #9
Summary
Full modernization of cube_wrangler to match network_wrangler's design patterns and requirements.
** DOES NOT change any public APIs, performance or functionality ** but this was stuff that needed to be updated before we did that.
CI/CD pipeline
push.yml,pullrequest.yml,prepare-release.yml,publish.yml,clean-docs.ymlsetup-python-uv,get-branch-name,build-docs,compare-benchmarks,post-coverageinstall.ymlanddocs.ymlPydantic dataclass-based Parameters
TimePeriodsConfig,CategoriesConfig, and top-levelParameterspydantic dataclassesPathobjects throughout (no more strings for file paths);__post_init__derives all path attributes frombase_dirPandera schemas as single source of truth for column types
CubeLinksTableandCubeNodesTable(cube_wrangler/models/tables.py) replace the oldbool_col/int_col/float_col/string_collists in Parameterscoerce_df_to_model()utility (cube_wrangler/utils/models.py) applies schemas with coercion — mirrorsvalidate_df_to_modelpattern from network_wranglerconvert_int(),convert_bool(),fill_na()inroadway.pyreplaced by a singleconvert_types()call; old functions kept as deprecated aliasesfillnadefaults now derived from actual column dtype (bool→False,object→"",numeric→0) — no hardcoded lists neededPython 3.10+ type hints
from __future__ import annotationsadded to all source modulesfrom typing import Optional, Dict, List, Unioninproject.py,roadway.py,transit.pypathlibnotosos.path.join(),os.path.splitext(),os.path.basename()calls inproject.py,roadway.py, andtransit.pywithPathequivalentsimport osremoved from all three modulesPandas 3+ safe
inplace=Truecalls with assignment-style operations acrossroadway.py(8 instances) andproject.py(4 instances)Docs
docs/index.md: uv-first install instructions, quick start, ecosystem overviewdocs/development.md: setup, testing, benchmarks, versioning/release guidedocs/api.md: auto-generated API reference via mkdocstrings for all modulesmkdocs.yml: updated with include-markdown plugin and Development nav entryREADME.md: removed conda, updated to Python 3.10+, uv-firstTests & benchmarks
tests/conftest.py: session-scoped fixtures mirroring network_wrangler patterntests/test_benchmark.py: benchmark suite with 3 classes covering log parsing, link changes, and property scopingbenchmarks/: standalone benchmark scripts and performance analysisTest plan
uv run pytest -m "not benchmark"passes locally (3/3 non-benchmark tests pass)🤖 Generated with Claude Code