Plan table function invocation with table arguments#5#7

Open
xin-zhang2 wants to merge 12 commits into tvf_13602 from tvf_14175

@xin-zhang2
Collaborator

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

xin-zhang2 and others added 6 commits March 19, 2025 09:23
Changes adapted from trino/PR#14175
Original commit: 5c125b5ef0e355b7f89d4927171dc7dd029d0b18
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#14175
Original commit: a6f537d5519e34a4a46a411e6967d585b382c56f
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#14175
Original commit: 4666472b0188aa26087840cdb587cc6e4495edef
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#14175
Original commit: 1aea489884346822c812b1a242acc286e3e1248e
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#14175
Original commit: 80c7fa0519eea07d8417d23908e8d1f8774dc3cd
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
@mohsaka force-pushed the tvf_14175 branch 4 times, most recently from 4e4d17f to f98676b, on March 20, 2025 02:49
@mohsaka
Owner

mohsaka commented Mar 20, 2025

Finished it up to the point where everything compiles. However, I don't think it's fully correct. Two things to look out for when running the tests:

  1. An issue with unaliasing. Currently I'm calling canonical on the output variables, as some of the other visit functions do. However, the Trino implementation does much more. We might be able to use the Trino implementation if we can figure out where we are supposed to use Symbol and where we are supposed to use VariableReferenceExpression.

  2. Partition table in relation planner. We have a coerce call that invokes the existing metadata-based coerce method. However, that method is not static. We can either find a way to call it with a QueryPlanner object or find a way to make it static. It can't currently be made static because of sqlPlannerContext.

Changes adapted from trino/PR#14175
Original commit: 8bd17171a8469b9351e2fd7d9f2f49f4af9ea209
Author: kasiafi

Modifications were made to adapt to Presto, including:
Rewriting UnaliasSymbolReferences based on the Unnest example
Adding a static coerce function, based on the current coerce function, that takes the required values as parameters

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
@mohsaka
Owner

mohsaka commented Mar 20, 2025

Late night fix. I think I got the coerce thing handled. Made a static method and passed in all of the required objects from RelationPlanner. So just number 1 now.
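
As a rough illustration of the approach described above (a static method that receives everything it needs as parameters), here is a minimal, self-contained sketch. `ToyPlanner` and its `typeResolver` field are hypothetical stand-ins, not Presto classes, and the real coerce method takes different arguments.

```java
import java.util.function.UnaryOperator;

// Hypothetical planner; typeResolver stands in for state such as metadata
// or a planner context that an instance method would read from its fields.
class ToyPlanner
{
    private final UnaryOperator<String> typeResolver;

    ToyPlanner(UnaryOperator<String> typeResolver)
    {
        this.typeResolver = typeResolver;
    }

    // Instance entry point: existing callers keep working by delegating.
    String coerce(String expression)
    {
        return coerce(expression, typeResolver);
    }

    // Static variant: every dependency is an explicit parameter, so a caller
    // without a ToyPlanner instance can pass in what it already holds.
    static String coerce(String expression, UnaryOperator<String> typeResolver)
    {
        return "CAST(" + expression + " AS " + typeResolver.apply(expression) + ")";
    }
}
```

Keeping the instance method as a thin delegate means both call styles share one implementation.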

@mohsaka
Owner

mohsaka commented Mar 20, 2025

Investigating current issue

2025-03-20T13:46:15.679-0600 INFO Test com.facebook.presto.sql.planner.TestTableFunctionInvocation::testTableFunctionInitialPlan() took 3.16m

java.lang.IllegalArgumentException: No mapping for expression: c1

	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:218)
	at com.facebook.presto.sql.planner.TranslationMap.get(TranslationMap.java:185)
	at com.facebook.presto.sql.planner.PlanBuilder.translate(PlanBuilder.java:84)
	at com.facebook.presto.sql.planner.QueryPlanner.coerce(QueryPlanner.java:537)
	at com.facebook.presto.sql.planner.RelationPlanner.visitTableFunctionInvocation(RelationPlanner.java:283)
	at com.facebook.presto.sql.planner.RelationPlanner.visitTableFunctionInvocation(RelationPlanner.java:145)
	at com.facebook.presto.sql.tree.TableFunctionInvocation.accept(TableFunctionInvocation.java:56)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at com.facebook.presto.sql.planner.RelationPlanner.process(RelationPlanner.java:181)
	at com.facebook.presto.sql.planner.RelationPlanner.visitAliasedRelation(RelationPlanner.java:339)
	at com.facebook.presto.sql.planner.RelationPlanner.visitAliasedRelation(RelationPlanner.java:145)
	at com.facebook.presto.sql.tree.AliasedRelation.accept(AliasedRelation.java:71)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at com.facebook.presto.sql.planner.RelationPlanner.process(RelationPlanner.java:181)
	at com.facebook.presto.sql.planner.QueryPlanner.planFrom(QueryPlanner.java:424)
	at com.facebook.presto.sql.planner.QueryPlanner.plan(QueryPlanner.java:207)
	at com.facebook.presto.sql.planner.RelationPlanner.visitQuerySpecification(RelationPlanner.java:884)
	at com.facebook.presto.sql.planner.RelationPlanner.visitQuerySpecification(RelationPlanner.java:145)
	at com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:138)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at com.facebook.presto.sql.planner.RelationPlanner.process(RelationPlanner.java:181)
	at com.facebook.presto.sql.planner.QueryPlanner.planQueryBody(QueryPlanner.java:413)
	at com.facebook.presto.sql.planner.QueryPlanner.plan(QueryPlanner.java:190)
	at com.facebook.presto.sql.planner.RelationPlanner.visitQuery(RelationPlanner.java:877)
	at com.facebook.presto.sql.planner.RelationPlanner.visitQuery(RelationPlanner.java:145)
	at com.facebook.presto.sql.tree.Query.accept(Query.java:105)
	at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at com.facebook.presto.sql.planner.RelationPlanner.process(RelationPlanner.java:181)
	at com.facebook.presto.sql.planner.LogicalPlanner.createRelationPlan(LogicalPlanner.java:564)
	at com.facebook.presto.sql.planner.LogicalPlanner.planStatementWithoutOutput(LogicalPlanner.java:185)
	at com.facebook.presto.sql.planner.LogicalPlanner.planStatement(LogicalPlanner.java:160)
	at com.facebook.presto.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:144)
	at com.facebook.presto.sql.analyzer.BuiltInQueryAnalyzer.plan(BuiltInQueryAnalyzer.java:107)
	at com.facebook.presto.testing.LocalQueryRunner.lambda$createPlan$6(LocalQueryRunner.java:1204)
	at com.facebook.presto.common.RuntimeStats.recordWallAndCpuTime(RuntimeStats.java:158)
	at com.facebook.presto.testing.LocalQueryRunner.createPlan(LocalQueryRunner.java:1202)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.lambda$assertPlan$3(BasePlanTest.java:199)
	at com.facebook.presto.transaction.TransactionBuilder.execute(TransactionBuilder.java:151)
	at com.facebook.presto.testing.LocalQueryRunner.inTransaction(LocalQueryRunner.java:858)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.assertPlan(BasePlanTest.java:198)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.assertPlan(BasePlanTest.java:179)

With a breakpoint at

    public VariableReferenceExpression translate(Expression expression)
    {
        return translations.get(expression);
    }

For the first value, a, canTranslate is true. For the next value, Identifier: c1, canTranslate is false.

When stepping into get(), TranslateNameIntoSymbols correctly outputs c1 as a field. However, neither expressionToVariables nor expressionToExpressions is filled in.

When running on a, expressionToVariables is already filled in:

expressionToVariables = {HashMap@6253}  size = 1
 {StringLiteral@6243} "'a'" -> {VariableReferenceExpression@7909} "expr_0"
  key = {StringLiteral@6243} "'a'"
  value = {VariableReferenceExpression@7909} "expr_0"
   name = "expr_0"
   type = {VarcharType@8018} "varchar(1)"
   sourceLocation = {Optional@8019} "Optional[1:86]"
expressionToExpressions = {HashMap@6254}  size = 0

@mohsaka
Owner

mohsaka commented Mar 21, 2025

After discussions with Aditi, we need to populate the Translator with the fields of sourceDescriptor (iterating up to sourceDescriptor.getAllFieldCount()).

Attempted to add them via
sourcePlanBuilder.getTranslations().put(new Identifier(name.get()), sourcePlan.getVariable(i));

However, this fails with "c1 Not analyzed". Need to figure out where the analyzed c1 is and place it in the Translator.

Update:
Fixed by using the analyzed table argument:

sourcePlanBuilder.getTranslations().put(tableArgument.getPartitionBy().get().get(i), sourcePlan.getVariable(i));
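
One plausible reading of why the analyzed node works where a fresh `new Identifier(...)` does not: if analysis results are keyed by node identity rather than by name, a newly built identifier never matches, even when it carries the same text. The sketch below illustrates that under this assumption. `ToyAnalysis`, its nested `Identifier`, and the error message are hypothetical and are not Presto's classes.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for an analysis that keys results by node identity.
class ToyAnalysis
{
    // Minimal AST node; two instances with the same name are distinct nodes.
    static final class Identifier
    {
        final String name;

        Identifier(String name)
        {
            this.name = name;
        }
    }

    // IdentityHashMap compares keys with ==, not equals().
    private final Map<Identifier, String> types = new IdentityHashMap<>();

    void recordType(Identifier node, String type)
    {
        types.put(node, type);
    }

    String getType(Identifier node)
    {
        String type = types.get(node);
        if (type == null) {
            throw new IllegalArgumentException("Not analyzed: " + node.name);
        }
        return type;
    }
}
```

With this model, looking up the exact node that was recorded succeeds, while `new Identifier("c1")` throws even though the name is identical. That matches the behaviour seen here: the partition-by expression taken from the analyzed table argument can be used as a key, and a reconstructed identifier cannot.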

@xin-zhang2
Collaborator Author

Fixed some issues while debugging the tests. The test now fails with a plan mismatch.

java.lang.AssertionError: Plan does not match,
	at com.facebook.presto.sql.planner.assertions.PlanAssert.assertPlan(PlanAssert.java:57)
	at com.facebook.presto.sql.planner.assertions.PlanAssert.assertPlan(PlanAssert.java:41)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.lambda$assertPlan$3(BasePlanTest.java:205)
	at com.facebook.presto.transaction.TransactionBuilder.execute(TransactionBuilder.java:151)
	at com.facebook.presto.testing.LocalQueryRunner.inTransaction(LocalQueryRunner.java:858)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.assertPlan(BasePlanTest.java:198)
	at com.facebook.presto.sql.planner.assertions.BasePlanTest.assertPlan(BasePlanTest.java:179)
	at com.facebook.presto.sql.planner.TestTableFunctionInvocation.testTableFunctionInitialPlan(TestTableFunctionInvocation.java:77)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
	at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
	at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
	at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
	at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
	at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)

The expected plan is:

- anyTree
    - node(TableFunctionNode)
        TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@1bbae752, INPUT_3=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@460b6d54, INPUT_2=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@5cf87cfd, ID=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$ScalarArgumentValue@76075d65, LAYOUT=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$DescriptorArgumentValue@3a4ba480}, properOutputs=[OUTPUT], copartitioningLists=[[INPUT_1, INPUT_3]]}
        - anyTree
            - node(ProjectNode)
                bind c1 -> 'a'
                - node(ValuesNode)
                    ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}
        - anyTree
            - node(ValuesNode)
                ValuesMatcher{outputSymbolAliases={c2=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional[[[1]]]}
        - anyTree
            - node(ProjectNode)
                bind c3 -> 'b'
                - node(ValuesNode)
                    ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}

The actual plan is:

- Output[PlanNodeId 23][column, c1, c2, c3] => [expr_24:boolean, expr_25:varchar(1), expr_26:integer, expr_27:varchar(1)]
        column := expr_24 (1:1)
        c1 := expr_25 (1:1)
        c2 := expr_26 (1:1)
        c3 := expr_27 (1:1)
    - Project[PlanNodeId 22][projectLocality = UNKNOWN] => [expr_24:boolean, expr_25:varchar(1), expr_26:integer, expr_27:varchar(1)]
            expr_24 := expr_20 (1:1)
            expr_25 := expr_21 (1:1)
            expr_26 := expr_22 (1:1)
            expr_27 := expr_23 (1:1)
        - Project[PlanNodeId 21][projectLocality = UNKNOWN] => [expr_20:boolean, expr_21:varchar(1), expr_22:integer, expr_23:varchar(1)]
                expr_20 := expr_16
                expr_21 := expr_17 (1:87)
                expr_22 := expr_18 (1:196)
                expr_23 := expr_19 (1:151)
            - Project[PlanNodeId 20][projectLocality = UNKNOWN] => [expr_16:boolean, expr_17:varchar(1), expr_18:integer, expr_19:varchar(1)]
                    expr_16 := expr_12
                    expr_17 := expr_13 (1:86)
                    expr_18 := expr_14 (1:195)
                    expr_19 := expr_15 (1:150)
                - Project[PlanNodeId 19][projectLocality = UNKNOWN] => [expr_12:boolean, expr_13:varchar(1), expr_14:integer, expr_15:varchar(1)]
                        expr_12 := column (1:22)
                        expr_13 := field (1:87)
                        expr_14 := field_6 (1:196)
                        expr_15 := field_11 (1:151)
                    - TableFunction[PlanNodeId 18]name => [column:boolean, field:varchar(1), field_6:integer, field_11:varchar(1)]
                            Arguments:INPUT_1 => TableArgument{partition by: [field], order by: [field ASC_NULLS_LAST]}INPUT_3 => TableArgument{partition by: [field_11], prune when empty}INPUT_2 => TableArgument{row semantics, prune when empty}ID => ScalarArgument{type=bigint, value=2001}LAYOUT => DescriptorArgument{(X boolean, Y bigint)}Co-partition: [(INPUT_1, INPUT_3)]
                        - [INPUT_1] Project[PlanNodeId 6][projectLocality = UNKNOWN] => [field:varchar(1)]
                                Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                            - Project[PlanNodeId 5][projectLocality = LOCAL] => [field:varchar(1)]
                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                    field := expr_2 (1:86)
                                - Project[PlanNodeId 4][projectLocality = UNKNOWN] => [expr_2:varchar(1)]
                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                        expr_2 := expr_1 (1:86)
                                    - Project[PlanNodeId 3][projectLocality = UNKNOWN] => [expr_1:varchar(1)]
                                            Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                            expr_1 := expr_0 (1:87)
                                        - Project[PlanNodeId 2][projectLocality = UNKNOWN] => [expr_0:varchar(1)]
                                                Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                expr_0 := expr (1:86)
                                            - Project[PlanNodeId 1][projectLocality = UNKNOWN] => [expr:varchar(1)]
                                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                    expr := VARCHAR'a'
                                                - Values[PlanNodeId 0] => []
                                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                        ()
                        - [INPUT_2] Project[PlanNodeId 10][projectLocality = LOCAL] => [field_6:integer]
                                Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                field_6 := expr_5 (1:195)
                            - Project[PlanNodeId 9][projectLocality = UNKNOWN] => [expr_5:integer]
                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                    expr_5 := expr_4 (1:195)
                                - Project[PlanNodeId 8][projectLocality = UNKNOWN] => [expr_4:integer]
                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                        expr_4 := field_3 (1:196)
                                    - Values[PlanNodeId 7] => [field_3:integer]
                                            Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                            (INTEGER'1')
                        - [INPUT_3] Project[PlanNodeId 17][projectLocality = UNKNOWN] => [field_11:varchar(1)]
                                Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                            - Project[PlanNodeId 16][projectLocality = LOCAL] => [field_11:varchar(1)]
                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                    field_11 := expr_10 (1:150)
                                - Project[PlanNodeId 15][projectLocality = UNKNOWN] => [expr_10:varchar(1)]
                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                        expr_10 := expr_9 (1:150)
                                    - Project[PlanNodeId 14][projectLocality = UNKNOWN] => [expr_9:varchar(1)]
                                            Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                            expr_9 := expr_8 (1:151)
                                        - Project[PlanNodeId 13][projectLocality = UNKNOWN] => [expr_8:varchar(1)]
                                                Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                expr_8 := expr_7 (1:150)
                                            - Project[PlanNodeId 12][projectLocality = UNKNOWN] => [expr_7:varchar(1)]
                                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                    expr_7 := VARCHAR'b'
                                                - Values[PlanNodeId 11] => []
                                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (117B), cpu: ?, memory: ?, network: ?}
                                                        ()


@mohsaka
Owner

mohsaka commented Mar 21, 2025

PlanMatchingVisitor investigation.

When we reach the plan node for the TableFunctionNode (ID 7), we do find some matches.

this = {PlanMatchingVisitor@7358} 
node = {TableFunctionNode@7395} 
pattern = {PlanMatchPattern@7343} "- anyTree\n    - node(TableFunctionNode)\n        TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4364863}, properOutputs=[OUTPUT], copartitioningLists=[]}\n"
states = {RegularImmutableList@7410}  size = 2
 0 = {PlanMatchingState@7414} 
  patterns = {SingletonImmutableList@7416}  size = 1
   0 = {PlanMatchPattern@7343} "- anyTree\n    - node(TableFunctionNode)\n        TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4364863}, properOutputs=[OUTPUT], copartitioningLists=[]}\n"
 1 = {PlanMatchingState@7415} 
  patterns = {SingletonImmutableList@7419}  size = 1
   0 = {PlanMatchPattern@7421} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4364863}, properOutputs=[OUTPUT], copartitioningLists=[]}\n"

PlanMatchingState number 2 (index 1) looks like the correct one.

When looping through state number 2:

state = {PlanMatchingState@8076} 
 patterns = {SingletonImmutableList@8079}  size = 1
  0 = {PlanMatchPattern@8024} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@3625a016}, properOutputs=[OUTPUT], copartitioningLists=[]}\n"
   matchers = {ArrayList@8082}  size = 2
   sourcePatterns = {RegularImmutableList@7921}  size = 0
   anyTree = false

It then does not find a match for the sources. As a result, it exits here.

Breakpoint used for debugging:
node.getId().equals(new PlanNodeId("7"))

In Trino, with the default test, the equivalent is:
node.getId().equals(new PlanNodeId("15"))

This is the successful path in Trino:

this = {PlanMatchingVisitor@11409} 
node = {TableFunctionNode@11403} 
pattern = {PlanMatchPattern@11500} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=TableArgumentValue[sourceIndex=0, rowSemantics=false, pruneWhenEmpty=false, passThroughColumns=true, specification=Optional[SpecificationProvider{partitionBy=[c1], orderBy=[c1], orderings={c1=ASC NULLS LAST}}], passThroughSymbols=[c1]], INPUT_3=TableArgumentValue[sourceIndex=2, rowSemantics=false, pruneWhenEmpty=true, passThroughColumns=false, specification=Optional[SpecificationProvider{partitionBy=[c3], orderBy=[], orderings={}}], passThroughSymbols=[c3]], INPUT_2=TableArgumentValue[sourceIndex=1, rowSemantics=true, pruneWhenEmpty=true, passThroughColumns=true, specification=Optional.empty, passThroughSymbols=[c2]], ID=ScalarArgumentValue[value=2001], LAYOUT=DescriptorArgumentValue[descriptor=Optional[io.trino.spi.function.table.Descriptor@19035ab1]]}, properOutputs=[OUTPUT], copartitioningLists=[[INPUT_1, INPUT_3]]}\n    - anyTree\n        - node(ProjectNode)\n            bind c1 ->"
states = {SingletonImmutableList@11501}  size = 1
result = {MatchResult@8939} "NO MATCH"
state = {PlanMatchingState@11502} 

In Presto, we hit part of the successful path:

this = {PlanMatchingVisitor@7896} 
node = {TableFunctionNode@7892} 
pattern = {PlanMatchPattern@7972} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4d634127}, properOutputs=[OUTPUT], copartitioningLists=[]}\n"

In Presto, states is empty at this point:

List<PlanMatchingState> states = pattern.shapeMatches(node);

States in Trino at this point:

states = {SingletonImmutableList@11501}  size = 1
 0 = {PlanMatchingState@11502} 
  patterns = {RegularImmutableList@11521}  size = 3
   0 = {PlanMatchPattern@11523} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> varchar(1) 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"
   1 = {PlanMatchPattern@11524} "- anyTree\n    - node(ValuesNode)\n        ValuesMatcher{outputSymbolAliases={c2=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional[[([1]::integer)]]}\n"
   2 = {PlanMatchPattern@11525} "- anyTree\n    - node(ProjectNode)\n        bind c3 -> varchar(1) 'b'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"

The error was due to the expected pattern not including:

                        anyTree(project(ImmutableMap.of("c1", expression("'a'")), values("1")))));
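
To make the failure mode concrete, here is a small sketch of a shape check. It assumes the matcher requires a pattern to declare exactly as many source patterns as the plan node has sources. That is an inference from the `sourcePatterns` size of 0 and the empty `states` list above, not something taken from the `PlanMatchPattern` source. `ToyNode` and `ToyPattern` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical plan node: a type name plus its child sources.
class ToyNode
{
    final String type;
    final List<ToyNode> sources;

    ToyNode(String type, ToyNode... sources)
    {
        this.type = type;
        this.sources = Arrays.asList(sources);
    }
}

// Hypothetical pattern: expected node type plus one pattern per expected source.
class ToyPattern
{
    final String type;
    final List<ToyPattern> sourcePatterns;

    ToyPattern(String type, ToyPattern... sourcePatterns)
    {
        this.type = type;
        this.sourcePatterns = Arrays.asList(sourcePatterns);
    }

    // Shape check only: node type and source count must agree before any
    // source is matched recursively.
    boolean shapeMatches(ToyNode node)
    {
        return type.equals(node.type) && sourcePatterns.size() == node.sources.size();
    }
}
```

Under this assumption, a `TableFunctionNode` pattern with no source patterns cannot match a node that has one source, no matter how well the matcher's own properties line up. Adding the missing source pattern is what lets matching proceed into the sources.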

@mohsaka
Owner

mohsaka commented Mar 21, 2025

Equivalent visitPlan

Presto:

this = {PlanMatchingVisitor@7884} 
node = {TableFunctionNode@7880} 
pattern = {PlanMatchPattern@7923} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4a92c6a9}, properOutputs=[OUTPUT], copartitioningLists=[]}\n    - anyTree\n        - node(ProjectNode)\n            bind c1 -> 'a'\n            - node(ValuesNode)\n                ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"

Trino:

this = {PlanMatchingVisitor@8933} 
node = {TableFunctionNode@11442} 
 name = "different_arguments_function"
 functionCatalog = {CatalogHandle@11479} "mock"
 arguments = {RegularImmutableMap@11480}  size = 5
 properOutputs = {SingletonImmutableList@11481}  size = 1
 sources = {RegularImmutableList@11482}  size = 3
 tableArgumentProperties = {RegularImmutableList@11483}  size = 3
 copartitioningLists = {SingletonImmutableList@11484}  size = 1
 handle = {TableFunctionHandle@11485} "TableFunctionHandle[catalogHandle=mock, functionHandle=TestingTableFunctionHandle[name=system.different_arguments_function], transactionHandle=INSTANCE]"
 id = {PlanNodeId@11477} "15"
pattern = {PlanMatchPattern@11464} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_4=TableArgumentValue[sourceIndex=0, rowSemantics=false, pruneWhenEmpty=false, passThroughColumns=true, specification=Optional[SpecificationProvider{partitionBy=[c1], orderBy=[c1], orderings={c1=ASC NULLS LAST}}], passThroughSymbols=[c1]], INPUT_3=TableArgumentValue[sourceIndex=2, rowSemantics=false, pruneWhenEmpty=true, passThroughColumns=false, specification=Optional[SpecificationProvider{partitionBy=[c3], orderBy=[], orderings={}}], passThroughSymbols=[c3]], INPUT_2=TableArgumentValue[sourceIndex=1, rowSemantics=true, pruneWhenEmpty=true, passThroughColumns=true, specification=Optional.empty, passThroughSymbols=[c2]], ID=ScalarArgumentValue[value=2001], LAYOUT=DescriptorArgumentValue[descriptor=Optional[io.trino.spi.function.table.Descriptor@19035ab1]]}, properOutputs=[OUTPUT], copartitioningLists=[[INPUT_1, INPUT_3]]}\n    - anyTree\n        - node(ProjectNode)\n            bind c1 ->"

Trino states:

states = {SingletonImmutableList@11493}  size = 1
 0 = {PlanMatchingState@11497} 
  patterns = {RegularImmutableList@11498}  size = 3
   0 = {PlanMatchPattern@11500} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> varchar(1) 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"
   1 = {PlanMatchPattern@11501} "- anyTree\n    - node(ValuesNode)\n        ValuesMatcher{outputSymbolAliases={c2=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional[[([1]::integer)]]}\n"
   2 = {PlanMatchPattern@11502} "- anyTree\n    - node(ProjectNode)\n        bind c3 -> varchar(1) 'b'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"

Presto states:

states = {SingletonImmutableList@7936}  size = 1
0 = {PlanMatchingState@7940} 
 patterns = {SingletonImmutableList@7941}  size = 1
  0 = {PlanMatchPattern@7943} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"

Difference in matchSources

Expected matching PlanNode sources from Trino:

result = {RegularImmutableList@11492}  size = 3
0 = {ProjectNode@11446} 
 source = {ProjectNode@11449} 
  source = {ProjectNode@11452} 
   source = {ProjectNode@11580} 
    source = {ProjectNode@11584} 
     source = {ValuesNode@11588} 
      outputSymbols = {RegularImmutableList@11592}  size = 0
      rowCount = 1
      rows = {Optional@11551} "Optional.empty"
      id = {PlanNodeId@11593} "0"
     assignments = {Assignments@11589} 
      assignments = {SingletonImmutableBiMap@11596}  size = 1
       {Symbol@11576} "expr::[varchar(1)]" -> {Constant@11686} "[a]::varchar(1)"
     id = {PlanNodeId@11590} "1"
    assignments = {Assignments@11585} 
    id = {PlanNodeId@11586} "2"
   assignments = {Assignments@11581} 
   id = {PlanNodeId@11582} "3"
  assignments = {Assignments@11536} 
  id = {PlanNodeId@11450} "4"
 assignments = {Assignments@11509} 
 id = {PlanNodeId@11447} "5"

Presto's:

result = {SingletonImmutableList@7948}  size = 1
0 = {ProjectNode@7901} 
source = {ProjectNode@7904} 
 source = {ProjectNode@7907} 
  source = {ProjectNode@7971} 
   source = {ProjectNode@7976} 
    source = {ProjectNode@7981} 
     source = {ValuesNode@7986} 
      outputVariables = {Collections$UnmodifiableRandomAccessList@7991}  size = 0
      rows = {Collections$UnmodifiableRandomAccessList@7992}  size = 1
      valuesNodeLabel = {Optional@7954} "Optional.empty"
      sourceLocation = {Optional@7954} "Optional.empty"
      id = {PlanNodeId@7993} "0"
      statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
     assignments = {Assignments@7987} 
      assignments = {Collections$UnmodifiableMap@7996}  size = 1
       {VariableReferenceExpression@8001} "expr" -> {ConstantExpression@8002} "Slice{base=[B@64c781a9, address=16, length=1}"
      outputs = {Collections$UnmodifiableRandomAccessList@7997}  size = 1
     locality = {ProjectNode$Locality@7952} "UNKNOWN"
     sourceLocation = {Optional@7954} "Optional.empty"
     id = {PlanNodeId@7988} "1"
     statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
    assignments = {Assignments@7982} 
    locality = {ProjectNode$Locality@7952} "UNKNOWN"
    sourceLocation = {Optional@7954} "Optional.empty"
    id = {PlanNodeId@7983} "2"
    statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
   assignments = {Assignments@7977} 
   locality = {ProjectNode$Locality@7952} "UNKNOWN"
   sourceLocation = {Optional@7954} "Optional.empty"
   id = {PlanNodeId@7978} "3"
   statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
  assignments = {Assignments@7972} 
  locality = {ProjectNode$Locality@7952} "UNKNOWN"
  sourceLocation = {Optional@7954} "Optional.empty"
  id = {PlanNodeId@7973} "4"
  statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
 assignments = {Assignments@7967} 
 locality = {ProjectNode$Locality@7968} "LOCAL"
 sourceLocation = {Optional@7953} "Optional[1:73]"
 id = {PlanNodeId@7905} "5"
 statsEquivalentPlanNode = {Optional@7954} "Optional.empty"
assignments = {Assignments@7951} 
locality = {ProjectNode$Locality@7952} "UNKNOWN"
sourceLocation = {Optional@7953} "Optional[1:73]"
id = {PlanNodeId@7902} "6"
statsEquivalentPlanNode = {Optional@7954} "Optional.empty"

When we hit the TableFunctionNode, we are still okay. The patterns continue without the TableFunctionNode:

this = {PlanMatchingVisitor@7894} 
node = {TableFunctionNode@7890} 
pattern = {PlanMatchPattern@7950} "- node(TableFunctionNode)\n    TableFunctionMatcher{name=different_arguments_function, arguments={INPUT_1=com.facebook.presto.sql.planner.assertions.TableFunctionMatcher$TableArgumentValue@4a92c6a9}, properOutputs=[OUTPUT], copartitioningLists=[]}\n    - anyTree\n        - node(ProjectNode)\n            bind c1 -> 'a'\n            - node(ValuesNode)\n                ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
states = {SingletonImmutableList@7976}  size = 1
result = {MatchResult@7983} "NO MATCH"
state = {PlanMatchingState@7986} 
patterns = {SingletonImmutableList@7991}  size = 1
0 = {PlanMatchPattern@7999} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"

In Trino, at this point the remaining pattern is already down to the ValuesNode:

this = {PlanMatchingVisitor@11393} 
node = {ProjectNode@12040} 
source = {ProjectNode@11452} 
assignments = {Assignments@12079} 
id = {PlanNodeId@12076} "4"
pattern = {PlanMatchPattern@11856} "- node(ProjectNode)\n    bind c1 -> varchar(1) 'a'\n    - node(ValuesNode)\n        ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"
states = {SingletonImmutableList@12078}  size = 1
result = {MatchResult@11786} "NO MATCH"
state = {PlanMatchingState@12082} 
patterns = {SingletonImmutableList@11990}  size = 1
0 = {PlanMatchPattern@11994} "- node(ValuesNode)\n    ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"

@mohsaka
Owner

mohsaka commented Mar 21, 2025

Full plan output

- Output[PlanNodeId 12][column, c1] => [expr_9:boolean, expr_10:varchar(1)]
        column := expr_9 (1:1)
        c1 := expr_10 (1:1)
    - Project[PlanNodeId 11][projectLocality = UNKNOWN] => [expr_9:boolean, expr_10:varchar(1)]
            expr_9 := expr_7 (1:1)
            expr_10 := expr_8 (1:1)
        - Project[PlanNodeId 10][projectLocality = UNKNOWN] => [expr_7:boolean, expr_8:varchar(1)]
                expr_7 := expr_5
                expr_8 := expr_6 (1:87)
            - Project[PlanNodeId 9][projectLocality = UNKNOWN] => [expr_5:boolean, expr_6:varchar(1)]
                    expr_5 := expr_3
                    expr_6 := expr_4 (1:86)
                - Project[PlanNodeId 8][projectLocality = UNKNOWN] => [expr_3:boolean, expr_4:varchar(1)]
                        expr_3 := column (1:22)
                        expr_4 := field (1:87)
                    - TableFunction[PlanNodeId 7]different_arguments_function => [column:boolean, field:varchar(1)]
                            Arguments:INPUT_1 => TableArgument{partition by: [field], order by: [field ASC_NULLS_LAST]}
                        - [INPUT_1] Project[PlanNodeId 6][projectLocality = UNKNOWN] => [field:varchar(1)]
                                Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                            - Project[PlanNodeId 5][projectLocality = LOCAL] => [field:varchar(1)]
                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                    field := expr_2 (1:86)
                                - Project[PlanNodeId 4][projectLocality = UNKNOWN] => [expr_2:varchar(1)]
                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                        expr_2 := expr_1 (1:86)
                                    - Project[PlanNodeId 3][projectLocality = UNKNOWN] => [expr_1:varchar(1)]
                                            Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                            expr_1 := expr_0 (1:87)
                                        - Project[PlanNodeId 2][projectLocality = UNKNOWN] => [expr_0:varchar(1)]
                                                Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                                expr_0 := expr (1:86)
                                            - Project[PlanNodeId 1][projectLocality = UNKNOWN] => [expr:varchar(1)]
                                                    Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                                    expr := VARCHAR'a'
                                                - Values[PlanNodeId 0] => []
                                                        Estimates: {source: CostBasedSourceInfo, rows: 1 (57B), cpu: ?, memory: ?, network: ?}
                                                        ()

On Node 6 we get:

this = {PlanMatchingVisitor@7867} 
node = {ProjectNode@7955} 
 source = {ProjectNode@7964} 
 assignments = {Assignments@7965} 
 locality = {ProjectNode$Locality@7966} "UNKNOWN"
 sourceLocation = {Optional@7967} "Optional[1:73]"
 id = {PlanNodeId@7956} "6"
 statsEquivalentPlanNode = {Optional@7949} "Optional.empty"
pattern = {PlanMatchPattern@8134} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
states = {RegularImmutableList@8141}  size = 2
 0 = {PlanMatchingState@8147} 
  patterns = {SingletonImmutableList@8149}  size = 1
   0 = {PlanMatchPattern@8134} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
 1 = {PlanMatchingState@8148} 
  patterns = {SingletonImmutableList@8152}  size = 1
   0 = {PlanMatchPattern@8154} "- node(ProjectNode)\n    bind c1 -> 'a'\n    - node(ValuesNode)\n        ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"

We expect Node 2 to take away the - anyTree pattern.
We expect Node 1 to take away the - node(ProjectNode)\n bind c1 -> 'a' pattern.

On Node 2 we see the two possible states:

this = {PlanMatchingVisitor@7867} 
node = {ProjectNode@7990} 
 source = {ProjectNode@8001} 
 assignments = {Assignments@8002} 
 locality = {ProjectNode$Locality@7977} "UNKNOWN"
 sourceLocation = {Optional@7965} "Optional.empty"
 id = {PlanNodeId@7998} "2"
 statsEquivalentPlanNode = {Optional@7965} "Optional.empty"
pattern = {PlanMatchPattern@8122} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
states = {RegularImmutableList@8160}  size = 2
 0 = {PlanMatchingState@8166} 
  patterns = {SingletonImmutableList@8168}  size = 1
   0 = {PlanMatchPattern@8122} "- anyTree\n    - node(ProjectNode)\n        bind c1 -> 'a'\n        - node(ValuesNode)\n            ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
 1 = {PlanMatchingState@8167} 
  patterns = {SingletonImmutableList@8169}  size = 1
   0 = {PlanMatchPattern@8171} "- node(ProjectNode)\n    bind c1 -> 'a'\n    - node(ValuesNode)\n        ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
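
The two states in the dump above can be modelled in a few lines. This is a minimal sketch of what the dump suggests, not the actual PlanMatchPattern code: the Pattern class, its fields, and this shapeMatches helper are hypothetical stand-ins, and only single-source nodes are handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AnyTreeSketch
{
    // Hypothetical, simplified pattern: handles only plan nodes with a single source.
    static final class Pattern
    {
        final String label;
        final boolean anyTree;
        final String requiredKind; // null accepts any node kind
        final Pattern source;      // pattern for the node's source, or null at a leaf

        Pattern(String label, boolean anyTree, String requiredKind, Pattern source)
        {
            this.label = label;
            this.anyTree = anyTree;
            this.requiredKind = requiredKind;
            this.source = source;
        }
    }

    // Returns the candidate states for one node. Each state lists the
    // patterns that must then match the node's source.
    static List<List<Pattern>> shapeMatches(Pattern pattern, String nodeKind)
    {
        List<List<Pattern>> states = new ArrayList<>();
        if (pattern.anyTree) {
            // Possibility 1: anyTree absorbs this node and stays in place.
            states.add(Collections.singletonList(pattern));
        }
        boolean kindMatches = pattern.requiredKind == null || pattern.requiredKind.equals(nodeKind);
        if (kindMatches && pattern.source != null) {
            // Possibility 2: this node satisfies the pattern, so continue with its source pattern.
            states.add(Collections.singletonList(pattern.source));
        }
        return states;
    }

    public static void main(String[] args)
    {
        Pattern values = new Pattern("node(ValuesNode)", false, "ValuesNode", null);
        Pattern project = new Pattern("node(ProjectNode)", false, "ProjectNode", values);
        Pattern anyTree = new Pattern("anyTree", true, null, project);

        // One line per candidate state for a ProjectNode visited under anyTree.
        for (List<Pattern> state : shapeMatches(anyTree, "ProjectNode")) {
            System.out.println(state.get(0).label);
        }
    }
}
```

In this model every ProjectNode visited under anyTree yields both states, which is consistent with states having size = 2 on Node 6 and Node 2.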

Values Node fails to match:

this = {PlanMatchingVisitor@7867} 
metadata = {MetadataManager@7870} 
session = {Session@7869} "Session{queryId=20250321_231405_00000_zw3zs, transactionId=Optional[c6f933ba-03b7-4b8b-8a83-2becd7f5a039], user=user, source=test, catalog=local, schema=tiny, timeZoneKey=Pacific/Apia, locale=en, remoteUserAddress=address, userAgent=agent, clientTags=[], resourceEstimates=ResourceEstimates{executionTime=Optional.empty, cpuTime=Optional.empty, peakMemory=Optional.empty, peakTaskMemory=Optional.empty}, startTime=1742598845738}"
statsProvider = {CachingStatsProvider@7868} 
lookup = {Lookup$lambda@7873} 
node = {ValuesNode@8028} 
outputVariables = {Collections$UnmodifiableRandomAccessList@8040}  size = 0
rows = {Collections$UnmodifiableRandomAccessList@8041}  size = 1
valuesNodeLabel = {Optional@7965} "Optional.empty"
sourceLocation = {Optional@7965} "Optional.empty"
id = {PlanNodeId@8037} "0"
statsEquivalentPlanNode = {Optional@7965} "Optional.empty"
pattern = {PlanMatchPattern@8280} "- node(ValuesNode)\n    ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
matchers = {ArrayList@8327}  size = 2
sourcePatterns = {RegularImmutableList@8322}  size = 0
anyTree = false
states = {RegularImmutableList@8322}  size = 0

No states are produced.

@mohsaka
Owner

mohsaka commented Mar 21, 2025

Trino ValuesMatcher

this = {PlanMatchingVisitor@9024} 
node = {ValuesNode@9001} 
pattern = {PlanMatchPattern@9021} "- node(ValuesNode)\n    ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}\n"
 matchers = {ArrayList@11474}  size = 2
  0 = {PlanNodeMatcher@11477} "PlanNodeMatcher{nodeClass=class io.trino.sql.planner.plan.ValuesNode}"
  1 = {ValuesMatcher@11478} "ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}"
 sourcePatterns = {RegularImmutableList@11475}  size = 0
 anyTree = false

Presto ValuesMatcher

pattern = {PlanMatchPattern@7942} "- node(ValuesNode)\n    ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"
this = {PlanMatchingVisitor@7884} 
node = {ValuesNode@7919} 
outputVariables = {Collections$UnmodifiableRandomAccessList@7947}  size = 0
rows = {Collections$UnmodifiableRandomAccessList@7948}  size = 1
valuesNodeLabel = {Optional@7949} "Optional.empty"
sourceLocation = {Optional@7949} "Optional.empty"
id = {PlanNodeId@7950} "0"
statsEquivalentPlanNode = {Optional@7949} "Optional.empty"
pattern = {PlanMatchPattern@7942} "- node(ValuesNode)\n    ValuesMatcher{outputSymbolAliases={1=0}, expectedOutputSymbolCount=Optional[1], expectedRows=Optional.empty}\n"

Breakpoint at visitPlan:
pattern.toString().startsWith("- node(ValuesNode)")

The match fails in visitPlan, in this check:

    private boolean shapeMatchesMatchers(PlanNode node)
    {
        return matchers.stream().allMatch(it -> it.shapeMatches(node));
    }

Matcher 1 returns true, but matcher 2 returns false.

The number of output variables on the ValuesNode does not match the expected number: the pattern expects 1, but the node has 0.

node = {ValuesNode@7928} 
 outputVariables = {Collections$UnmodifiableRandomAccessList@8040}  size = 0
 rows = {Collections$UnmodifiableRandomAccessList@8041}  size = 1
 valuesNodeLabel = {Optional@8031} "Optional.empty"
 sourceLocation = {Optional@8031} "Optional.empty"
 id = {PlanNodeId@8042} "0"
 statsEquivalentPlanNode = {Optional@8031} "Optional.empty"
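
The failing check can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical Node class and two predicates standing in for the two matchers (a node class check and an output count check); only the allMatch line mirrors shapeMatchesMatchers above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ShapeMatchSketch
{
    // Hypothetical stand-in for a plan node: only the kind and output count matter here.
    static final class Node
    {
        final String kind;
        final int outputCount;

        Node(String kind, int outputCount)
        {
            this.kind = kind;
            this.outputCount = outputCount;
        }
    }

    // Mirrors the allMatch in shapeMatchesMatchers: every matcher must accept the node.
    static boolean shapeMatchesMatchers(List<Predicate<Node>> matchers, Node node)
    {
        return matchers.stream().allMatch(it -> it.test(node));
    }

    // Matcher 1 checks the node class; matcher 2 checks the expected output count.
    static List<Predicate<Node>> valuesMatchers(int expectedOutputCount)
    {
        Predicate<Node> isValues = node -> node.kind.equals("ValuesNode");
        Predicate<Node> hasCount = node -> node.outputCount == expectedOutputCount;
        return Arrays.asList(isValues, hasCount);
    }

    public static void main(String[] args)
    {
        Node values = new Node("ValuesNode", 0); // outputVariables size = 0
        // Expected count 1, as in the Presto pattern: the second matcher rejects the node.
        System.out.println(shapeMatchesMatchers(valuesMatchers(1), values));
        // Expected count 0, as in the Trino pattern: both matchers accept the node.
        System.out.println(shapeMatchesMatchers(valuesMatchers(0), values));
    }
}
```

Because allMatch requires every matcher to accept the node, a single wrong expected count is enough to produce an empty list of states.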

For some reason, Presto expects 1 output variable here, while Trino expects none:

- node(ValuesNode)
   ValuesMatcher{outputSymbolAliases={}, expectedOutputSymbolCount=Optional[0], expectedRows=Optional[[()]]}

@xin-zhang2
Collaborator Author

xin-zhang2 commented Mar 24, 2025

@mohsaka Thanks for the analysis.

The mismatch in ValuesMatcher is due to an incorrect expected value in the test.
Presto does not have a value() function that accepts an integer parameter, and a numeric string passed in Presto is handled differently from the integer parameter in Trino.
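
One way to picture the difference is the sketch below. It rests on assumptions: the Expectation class and both factory methods are hypothetical and are not the real test helpers. It only shows how a string argument read as an output alias would produce the outputSymbolAliases={1=0} and expectedOutputSymbolCount=Optional[1] seen in the Presto dump, while an integer read as a row count would produce no expected output symbols.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ValuesExpectationSketch
{
    // Hypothetical holder for what a values pattern expects.
    static final class Expectation
    {
        final Map<String, Integer> outputSymbolAliases;
        final int expectedOutputSymbolCount;
        final Integer expectedRowCount; // null when the row count is not checked

        Expectation(Map<String, Integer> aliases, int symbolCount, Integer rowCount)
        {
            this.outputSymbolAliases = aliases;
            this.expectedOutputSymbolCount = symbolCount;
            this.expectedRowCount = rowCount;
        }
    }

    // Assumed alias-based factory: each argument is an output alias mapped to its position.
    static Expectation fromAliases(String... aliases)
    {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < aliases.length; i++) {
            map.put(aliases[i], i);
        }
        return new Expectation(map, aliases.length, null);
    }

    // Assumed count-based factory: the argument is a row count, so no output symbols are expected.
    static Expectation fromRowCount(int rowCount)
    {
        return new Expectation(new LinkedHashMap<>(), 0, rowCount);
    }

    public static void main(String[] args)
    {
        Expectation fromString = fromAliases("1");
        Expectation fromInt = fromRowCount(1);
        System.out.println(fromString.outputSymbolAliases + " count=" + fromString.expectedOutputSymbolCount);
        System.out.println(fromInt.outputSymbolAliases + " count=" + fromInt.expectedOutputSymbolCount);
    }
}
```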

There is another mismatch in TableFunctionMatcher. In Trino, partitionBy and orderings are Symbols and only the name is compared. In Presto they are VariableReferenceExpressions and both name and type are compared, and getExpectedValue sets the type to Unknown, which differs from the actual type.

I've fixed these issues in the latest commit by adding the value() function and ignoring the type comparison in VariableReferenceExpression.
The test now passes.
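
The TableFunctionMatcher mismatch can be illustrated in isolation. This is a minimal sketch, assuming a simplified Variable class in place of VariableReferenceExpression; the class, its string-typed type field, and the sameName helper are hypothetical and are not the code from the commit.

```java
import java.util.Objects;

final class VariableCompareSketch
{
    // Hypothetical stand-in for a variable reference: a name plus a type.
    static final class Variable
    {
        final String name;
        final String type;

        Variable(String name, String type)
        {
            this.name = name;
            this.type = type;
        }

        // Full equality compares both name and type.
        @Override
        public boolean equals(Object other)
        {
            if (!(other instanceof Variable)) {
                return false;
            }
            Variable that = (Variable) other;
            return name.equals(that.name) && type.equals(that.type);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(name, type);
        }
    }

    // Name-only comparison, which ignores the type.
    static boolean sameName(Variable expected, Variable actual)
    {
        return expected.name.equals(actual.name);
    }

    public static void main(String[] args)
    {
        Variable expected = new Variable("field", "unknown");  // type left unknown by the test
        Variable actual = new Variable("field", "varchar(1)"); // type from the plan
        System.out.println(expected.equals(actual));    // name matches, type differs
        System.out.println(sameName(expected, actual)); // only the name is compared
    }
}
```

Comparing by name only trades some strictness for compatibility with expected values that carry no type information.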
