feature - add builder-based dataset semantics and split Substrait helpers (#23) #26
Merged
dannymeijer merged 1 commit on Apr 23, 2026
Ignoring the failing CI, as it's running against the wrong incan version.
Summary
This PR turns the current InQL relational slice into a real builder-backed package surface instead of placeholder stubs. It adds explicit aggregate, filter, and projection builder modules; introduces structured `DataFrame` materialization metadata instead of reparsing plain-text payloads; and replaces the old monolithic `substrait/plan.incn` with focused Substrait modules for schema registry, expression lowering, relations, plans, inspection, and extensions. It also aligns the docs, examples, RFCs, and tests with that surface so the package reads as an intentional showcase rather than a partially stubbed scaffold.
Type of change
docs/rfcs/*)
Area(s)
Select the primary areas touched (labels sync from checked lines when the triage workflow runs):
Key details
- `Session.collect()` returns `DataFrame` carriers backed by structured materialization metadata (`resolved_columns`, `row_count`, preview text); grouped aggregate and computed-column example flows are documented and tested; and the language/docs now present `dataset_carriers` instead of the older `dataset_types` framing.
- Substrait helpers are split into `schema_registry`, `expr_lowering`, `extensions`, `relations`, `plans`, and `inspect`; aggregate/filter/project lowering now flows through real builder types; Prism output/rewrite/store helpers were cleaned up for readability; and the DataFusion backend now returns structured collection materialization instead of a text payload contract.
Testing / verification
- `make ci` (or `make fmt-check`, `make build`, `make test`)
Manual verification notes:
- `make -C /Users/danny/Development/encero/InQL ci`
- `incan test /Users/danny/Development/encero/InQL/tests/test_session.incn`
- `incan test /Users/danny/Development/encero/InQL/tests/test_substrait_plan.incn`
- `make -C /Users/danny/Development/encero/InQL build`
Docs impact
If docs updated:
- `docs/language/explanation/dataset_carriers.md`
- `docs/language/reference/dataset_carriers.md`
- `docs/language/reference/builders/aggregates.md`
- `docs/language/reference/builders/filters.md`
- `docs/language/reference/builders/projections.md`
- `docs/language/reference/dataset_methods.md`
- `docs/language/reference/functions/index.md`
- `docs/architecture.md`
- `docs/release_notes/v0_1.md`
- `docs/rfcs/001_inql_dataset.md`
- `docs/rfcs/012_unified_scalar_expression_surface.md`
Checklist
Closes #23
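Illustrative note on the materialization change described under Key details: the sketch below contrasts a text-payload contract, where metadata has to be reparsed from a string, with a structured carrier that records `resolved_columns`, `row_count`, and preview text directly. It is written in Python purely for illustration and is not the InQL/Incan implementation; `Materialization`, `parse_text_payload`, `collect_structured`, and the comma-separated payload format are all hypothetical names and assumptions, and only the three field concepts come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Materialization:
    """Hypothetical structured metadata carried by a collected DataFrame."""

    resolved_columns: tuple[str, ...]
    row_count: int
    preview: str


def parse_text_payload(payload: str) -> Materialization:
    """Text-payload style (assumed format): recover metadata by reparsing.

    Assumes the first non-empty line is a comma-separated header and every
    later non-empty line is one row. Any change to the text layout breaks this.
    """
    lines = [line for line in payload.splitlines() if line.strip()]
    header = tuple(name.strip() for name in lines[0].split(",")) if lines else ()
    return Materialization(header, max(len(lines) - 1, 0), payload)


def collect_structured(columns: list[str], rows: list[tuple]) -> Materialization:
    """Structured style: the backend reports columns and row count directly."""
    # The preview is display-only text; nothing downstream needs to parse it.
    preview = "\n".join(", ".join(str(value) for value in row) for row in rows[:5])
    return Materialization(tuple(columns), len(rows), preview)


if __name__ == "__main__":
    structured = collect_structured(["region", "total"], [("eu", 10), ("us", 7)])
    print(structured.resolved_columns, structured.row_count)
```

In the structured variant, the column list and row count never depend on how the preview is rendered, which is the property the summary attributes to the new contract.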