feature - add builder-based dataset semantics and split Substrait helpers (#23) #26

Merged: dannymeijer merged 1 commit into main from feature/23-replace-aggregate-helper-stubs-real-semantics on Apr 23, 2026

@dannymeijer
Collaborator

Summary

This PR turns the current InQL relational slice into a real builder-backed package surface instead of placeholder stubs. It adds explicit aggregate, filter, and projection builder modules; introduces structured DataFrame materialization metadata instead of reparsing plain-text payloads; and replaces the old monolithic substrait/plan.incn with focused Substrait modules for schema registry, expression lowering, relations, plans, inspection, and extensions. It also aligns the docs, examples, RFCs, and tests with that surface so the package reads as an intentional showcase rather than a partially stubbed scaffold.

Type of change

  • Bug fix
  • New feature
  • Refactor / maintenance
  • Documentation
  • CI / tooling
  • RFC (adds/updates docs/rfcs/*)

Area(s)

Select the primary areas touched (labels sync from checked lines when the triage workflow runs):

  • Package & tests
  • Specification (RFCs)
  • Documentation
  • Automation & repo config
  • Other

Key details

  • User-facing behavior: Authors now get explicit public builder surfaces for filters, projections, and aggregates; Session.collect() returns DataFrame carriers backed by structured materialization metadata (resolved_columns, row_count, preview text); grouped aggregate and computed-column example flows are documented and tested; and the language/docs now present dataset_carriers instead of the older dataset_types framing.
  • Internals: The Substrait layer is split by responsibility into schema_registry, expr_lowering, extensions, relations, plans, and inspect; aggregate/filter/project lowering now flows through real builder types; Prism output/rewrite/store helpers were cleaned up for readability; and the DataFusion backend now returns structured collection materialization instead of a text payload contract.
  • Risks: This changes a wide swath of package-facing surface area at once, especially around builder semantics and Substrait module ownership. The highest-risk areas are plan emission correctness, aggregate/projection lowering parity, and docs/examples drifting from the implemented API, though the branch adds focused coverage for those paths.
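The move from a text payload contract to structured materialization metadata can be sketched as follows. This is an illustrative Python stand-in, not the package's Incan code: the `Materialization` class and `collect_structured` helper are hypothetical names, and only the field ideas (`resolved_columns`, `row_count`, preview text) come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Materialization:
    """Hypothetical structured result of a collect call."""

    resolved_columns: tuple[str, ...]  # column names resolved by the backend
    row_count: int                     # number of materialized rows
    preview: str                       # human-readable preview, for display only


def collect_structured(columns: list[str], rows: list[tuple]) -> Materialization:
    """Build the metadata directly instead of reparsing rendered text."""
    # Render at most the first five rows as a simple pipe-separated preview.
    preview = "\n".join(" | ".join(str(value) for value in row) for row in rows[:5])
    return Materialization(tuple(columns), len(rows), preview)


result = collect_structured(["region", "total"], [("eu", 10), ("us", 7)])
print(result.resolved_columns, result.row_count)
```

Callers read `resolved_columns` and `row_count` as typed fields, so the preview text can change format without breaking anything that consumes the result.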

Testing / verification

  • make ci (or make fmt-check, make build, make test)
  • Manual verification described below

Manual verification notes:

  • Ran make -C /Users/danny/Development/encero/InQL ci
  • Re-ran focused checks while addressing review findings:
    • incan test /Users/danny/Development/encero/InQL/tests/test_session.incn
    • incan test /Users/danny/Development/encero/InQL/tests/test_substrait_plan.incn
    • make -C /Users/danny/Development/encero/InQL build
  • Added Substrait regression coverage to ensure plans that emit scalar/aggregate function declarations also register the shared function extension URN.
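The invariant behind that regression coverage can be sketched roughly as below. This is a simplified Python illustration, not the actual Incan test or the Substrait wire format: the dictionary keys (`extension_urns`, `function_declarations`, `anchor`, `urn_anchor`) and the URN value are made up for the example.

```python
from __future__ import annotations


def missing_extension_anchors(plan: dict) -> set[int]:
    """Return anchors referenced by function declarations but never registered."""
    registered = {ext["anchor"] for ext in plan.get("extension_urns", [])}
    referenced = {fn["urn_anchor"] for fn in plan.get("function_declarations", [])}
    return referenced - registered


# A plan that declares an aggregate function and registers its extension URN.
good_plan = {
    "extension_urns": [{"anchor": 1, "urn": "urn:example:functions"}],
    "function_declarations": [{"name": "sum", "urn_anchor": 1}],
}

# A plan that declares the same function but forgets the registration.
bad_plan = {
    "extension_urns": [],
    "function_declarations": [{"name": "sum", "urn_anchor": 1}],
}

assert missing_extension_anchors(good_plan) == set()
assert missing_extension_anchors(bad_plan) == {1}
```

A check of this shape fails as soon as a lowering path emits a function declaration without the matching registration, which is the drift the added coverage is meant to catch.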

Docs impact

  • No docs changes needed
  • Docs updated
  • Docs follow Divio intent (tutorial/how-to/reference/explanation) where applicable

If docs updated:

  • Link(s):
    • docs/language/explanation/dataset_carriers.md
    • docs/language/reference/dataset_carriers.md
    • docs/language/reference/builders/aggregates.md
    • docs/language/reference/builders/filters.md
    • docs/language/reference/builders/projections.md
    • docs/language/reference/dataset_methods.md
    • docs/language/reference/functions/index.md
    • docs/architecture.md
    • docs/release_notes/v0_1.md
    • docs/rfcs/001_inql_dataset.md
    • docs/rfcs/012_unified_scalar_expression_surface.md

Checklist

  • I kept public docs user-focused and moved internals to contributing docs when appropriate
  • I avoided duplicating canonical install/run instructions in multiple places
  • I added/updated tests where it materially reduces regressions

Closes #23

@dannymeijer self-assigned this on Apr 23, 2026
@incan-triage-bot (bot) added the automation, documentation, package, and specification labels on Apr 23, 2026
@dannymeijer
Collaborator Author

Ignoring the failing CI, as it's running against the wrong incan version.

@dannymeijer merged commit efb2b8c into main on Apr 23, 2026
1 of 3 checks passed
@dannymeijer deleted the feature/23-replace-aggregate-helper-stubs-real-semantics branch on April 23, 2026, 19:10
Labels

  • automation: CI, Makefile, .github/, repo config
  • documentation: Improvements or additions to documentation
  • package: Library source, tests, incan.toml
  • specification: docs/rfcs/ normative RFCs

Development

Successfully merging this pull request may close these issues.

feature - replace aggregate helper stubs (total, count_rows) with real aggregate semantics
