
Tvf 15575 new#11

Open
mohsaka wants to merge 98 commits into master from tvf_15575_new

Conversation

@mohsaka
Owner

@mohsaka mohsaka commented Apr 2, 2025

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

mohsaka and others added 17 commits April 2, 2025 10:52
Changes adapted from trino/PR#11336
Original commit: 2e00c8e64c32d6fdd813999b2e04b3b3415235c8
Author: kasiafi

Modifications were made to adapt to Presto including:
Addition of KEEP in the parser.
Adjustment of the TestSqlParser.java to apply to Presto concepts.
Switch from Trino's DataType based datatypes to Presto's String based datatypes.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: 395fd91c6480a993241eeabd9599873d0d05b24b
Author: kasiafi

Modifications were made to adapt to Presto including:
Removal of ConnectorExpression. Will be required in a future commit.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: e69a052e570718ca568114e61644946feab4383e
Author: kasiafi

Modifications were made to adapt to Presto including:
Removal of the StatementAnalyzerFactory calls.
Add TransactionManager to all StatementAnalyzer constructor calls.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: 493f639a9daa5e6aaadbb4364587196cb5240fc2
Author: kasiafi

Modifications were made to adapt to Presto including:
Addition of TableFunctionRegistry into FunctionAndTypeManager
Removal of FunctionResolver
Addition of toPath to TableFunctionRegistry

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: 9e8d51ad45f57267f5f7fa6bf8e8c4ec56103dda
Author: kasiafi

Modifications were made to adapt to Presto including:
Removal of Node Location from TrinoException
Added new SemanticErrorCodes
Changed Void context to SqlPlannerContext in RelationPlanner.java
Add newUnqualified to Field class with Presto specification
Add getCanonicalValue to Identifier.java

Co-authored-by: Pratik Joseph Dabre <pdabre12@gmail.com>
Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: d4c73389bbdb6b48c24a0969b259286b05a99ade
Author: kasiafi

Modifications were made to adapt to Presto including:
Change CatalogName to ConnectorId
Modified and removed outputSymbols -> VariableReferenceExpression
Following visitTable example, created and used VariableReferenceExpression List Builder.
TableFunctionNode extends InternalPlanNode instead of PlanNode. Circular dependency and following other Node classes.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: 565700985baff0c4b29fdb1e3e26139a29318b9e
Author: kasiafi

Modifications were made to adapt to Presto including:
Add applyTableFunction to all implementations of Metadata
Change CatalogName to ConnectorId
Add empty ConnectorTableLayoutHandle to TableHandle in MetadataManager::applyTableFunction

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#11336
Original commit: ec8b9fd2b2cc9c8bc78c0ca1317dc34fcf2c48c7
Author: kasiafi

Modifications were made to adapt to Presto including:
Change Symbol to VariableReferenceExpression
Add additional arguments to TableScanNode calls
Removal of PlannerContext and replaced with Metadata

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Preparatory commit for supporting table functions in JDBC connectors

Changes adapted from trino/PR#11336
Original commit: b4d4b5102e878b7e38e13c0440432543a18f913e
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12350
Original commit: 998315075343beecef962657b8cbf440d53cc13b
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12407
Original commit: 712b8e98ff8a726f95295ac539159fc532628273
Original commit: 131dc44af97b31a2fa8115028d98d06671641bfa
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12476
Original commit: 0da095e14b0855f89af3c4f254a5a60280fc7170
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12531
Original commit: 18bb60262cb0850cf839c2b20b434344921f5122
Original commit: 4a7d72afb64f93a9748a4c6b4defc2d42bbae000
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12813
Original commit: 5310671f80291394b12ba2ea746e4e60051aaff4
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#12910
Original commit: f0508a7ab420449c6e2960ecf1d0a8d7058242da
Author: kasiafi

Modifications were made to adapt to Presto including:
Addition of test case under TestJdbcDistributedQueries.java.
Removal of Trino JDBC test cases.
Changes adapted from trino/PR#12951
Original commit: b602e4e0065d00a2c9b1f645cf7ed2905bdd6078
Original commit: 98fc1ee8b29fca86f2a1b3abe4989524940333a6
Original commit: 0cec709ae1b6c707ca4810cdef8a852253167ab1
Author: kasiafi

Modifications were made to adapt to Presto including:
Addition of other Trino Mock testing classes.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
mohsaka force-pushed the tvf_15575_new branch 3 times, most recently from 865472b to b1cf085 on April 2, 2025 20:50
Co-authored-by: Pratik Joseph Dabre <pdabre12@gmail.com>
Co-authored-by: Xin Zhang <desertsxin@gmail.com>
mohsaka and others added 8 commits April 2, 2025 14:10
Changes adapted from trino/PR#13106
Original commit: d0e2470c59f0de4be1e6320f1350b8806a72c548

Modifications were made to adapt to Presto including:
Currently not formatting the arguments as we do not have a printer for Scalar Arguments.
Changes adapted from trino/PR#13602
Original commit: 21695a1dfb33d5cb8bb195397d769222da5a6345
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#13602
Original commit: 8be4a486a525056ef6b361618280ae0185d7211b
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
…uments of table functions

Changes adapted from trino/PR#13649
Changes adapted from trino/PR#13602
Original commit: a63cc9c429cea7849cf723598719812cc8e5636c
Original commit: 31b2b79751944eed42eaa4538a52dbf65c0399a6
Author: kasiafi

Modifications were made to adapt to Presto including:
Updated tests to match Presto format.

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#13602
Original commit: a93597ff72d98d577b32f5da6ef562ec85a7aad6
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Changes adapted from trino/PR#13602
Original commit: 21acef5d028565b919e3eae2df9784fbc2317ae9
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
Prune when empty is enforced for a table argument with row semantics.
For a table argument with set semantics, keep when empty is the default,
and it can be changed to prune when empty.

Changes adapted from trino/PR#13602
Original commit: 9771edb626ea47f2333988178107c7a2dc46ec6b
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
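
The empty-input rule described in the commit message above can be sketched as follows. This is a minimal illustration only, not the code from the commit: the class, enum, and method names are hypothetical, and silently overriding a declared keep-when-empty on a row-semantics argument (rather than rejecting it) is an assumption.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Presto or Trino code base.
class EmptyBehaviorSketch {
    enum Semantics { ROW, SET }

    enum EmptyBehavior { PRUNE_WHEN_EMPTY, KEEP_WHEN_EMPTY }

    // Resolve the effective behavior for a table argument.
    // Row semantics: prune when empty is enforced (assumption: any declared value is ignored).
    // Set semantics: keep when empty is the default, and a declared value overrides it.
    static EmptyBehavior resolve(Semantics semantics, Optional<EmptyBehavior> declared) {
        if (semantics == Semantics.ROW) {
            return EmptyBehavior.PRUNE_WHEN_EMPTY;
        }
        return declared.orElse(EmptyBehavior.KEEP_WHEN_EMPTY);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Semantics.ROW, Optional.empty()));
        System.out.println(resolve(Semantics.SET, Optional.empty()));
        System.out.println(resolve(Semantics.SET, Optional.of(EmptyBehavior.PRUNE_WHEN_EMPTY)));
    }
}
```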
Changes adapted from trino/PR#13602
Original commit: 15664c845f4958ecffbb0dca2596226ee899c12e
Author: kasiafi

Co-authored-by: kasiafi <30203062+kasiafi@users.noreply.github.com>
@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

The passThroughSpecification lists orderkey as the partitioning column, but the partitioning column should be orderstatus.

passThroughSpecification = {TableFunctionNode$PassThroughSpecification@18063} 
 declaredAsPassThrough = false
 columns = {SingletonImmutableList@18067}  size = 1
  0 = {TableFunctionNode$PassThroughColumn@18069} 
   outputVariables = {VariableReferenceExpression@17525} "orderkey"
   partitioningColumn = true

@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

This is definitely not right. I put a breakpoint here but the translator is already incorrect.

            else if (tableArgument.getPartitionBy().isPresent()) {
                tableArgument.getPartitionBy().get().stream()
                        // the original symbols for partitioning columns, not coerced
                        .map(sourcePlanBuilder::translate)
                        .forEach(variable -> {
                            outputVariables.add(variable);
                            passThroughColumns.add(new PassThroughColumn(variable, true));
                        });
            }
expressionToVariables = {HashMap@17364}  size = 1
 {SymbolReference@17370} ""orderstatus"" -> {VariableReferenceExpression@17371} "orderkey"

@xin-zhang2 Please continue from here. Thanks!

@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

I think I got it. The issue was that when adding mappings to the translator, we always used the first column as the translation target, so the result was always orderkey.
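
The actual change is not shown in this thread, so the following is only a hypothetical sketch of the kind of bug described: every expression gets mapped to the first column instead of to the column at its own position. The class and method names are made up for illustration, and plain strings stand in for SymbolReference and VariableReferenceExpression.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the translator code from the PR.
class PartitionTranslationSketch {
    // Buggy variant: every expression is translated to the first column.
    static Map<String, String> buggyMappings(List<String> expressions, List<String> columns) {
        Map<String, String> translations = new LinkedHashMap<>();
        for (String expression : expressions) {
            translations.put(expression, columns.get(0));
        }
        return translations;
    }

    // Fixed variant: each expression is translated to the column at the same position.
    static Map<String, String> fixedMappings(List<String> expressions, List<String> columns) {
        if (expressions.size() != columns.size()) {
            throw new IllegalArgumentException("expressions and columns must have the same size");
        }
        Map<String, String> translations = new LinkedHashMap<>();
        for (int i = 0; i < expressions.size(); i++) {
            translations.put(expressions.get(i), columns.get(i));
        }
        return translations;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("orderkey", "orderstatus");
        System.out.println("buggy: " + buggyMappings(names, names));
        System.out.println("fixed: " + fixedMappings(names, names));
    }
}
```

With the buggy variant, looking up orderstatus yields orderkey, which matches the expressionToVariables entry shown in the earlier comment.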

@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

Fixed.

@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

Caused by: java.lang.IllegalArgumentException: Invalid node. Source symbols ([expr_0, field_4, x2, field_8, x3, expr_11]) not in source plan output ([field_9, field_5, input_1_partition_size, combined_row_number_24, marker, input_4_partition_size, combined_row_number_26, marker_30, field, combined_partition_size_27, input_3_partition_size, combined_partition_size, input_1_row_number, marker_28, combined_partition_size_25, input_2_partition_size, input_2_row_number, combined_row_number, field_1, input_3_row_number, marker_29, input_4_row_number])
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:165)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.checkDependencies(ValidateDependenciesChecker.java:898)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.lambda$visitTableFunctionProcessor$2(ValidateDependenciesChecker.java:215)
	at java.util.Optional.ifPresent(Optional.java:159)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitTableFunctionProcessor(ValidateDependenciesChecker.java:214)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitTableFunctionProcessor(ValidateDependenciesChecker.java:110)
	at com.facebook.presto.sql.planner.plan.TableFunctionProcessorNode.accept(TableFunctionProcessorNode.java:245)
	at com.facebook.presto.sql.planner.plan.InternalPlanNode.accept(InternalPlanNode.java:34)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitExchange(ValidateDependenciesChecker.java:712)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitExchange(ValidateDependenciesChecker.java:110)
	at com.facebook.presto.sql.planner.plan.ExchangeNode.accept(ExchangeNode.java:331)
	at com.facebook.presto.sql.planner.plan.InternalPlanNode.accept(InternalPlanNode.java:34)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitOutput(ValidateDependenciesChecker.java:457)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker$Visitor.visitOutput(ValidateDependenciesChecker.java:110)
	at com.facebook.presto.spi.plan.OutputNode.accept(OutputNode.java:98)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker.validate(ValidateDependenciesChecker.java:107)
	at com.facebook.presto.sql.planner.sanity.ValidateDependenciesChecker.validate(ValidateDependenciesChecker.java:102)
	at com.facebook.presto.sql.planner.sanity.PlanChecker.lambda$validateFinalPlan$0(PlanChecker.java:87)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
	at com.facebook.presto.sql.planner.sanity.PlanChecker.validateFinalPlan(PlanChecker.java:87)
	at com.facebook.presto.sql.Optimizer.validateAndOptimizePlan(Optimizer.java:126)
	at com.facebook.presto.execution.SqlQueryExecution.lambda$doCreateLogicalPlanAndOptimize$5(SqlQueryExecution.java:582)
	at com.facebook.presto.common.RuntimeStats.recordWallAndCpuTime(RuntimeStats.java:158)
	at com.facebook.presto.execution.SqlQueryExecution.doCreateLogicalPlanAndOptimize(SqlQueryExecution.java:580)
	at com.facebook.presto.common.RuntimeStats.recordWallAndCpuTime(RuntimeStats.java:158)
	at com.facebook.presto.execution.SqlQueryExecution.createLogicalPlanAndOptimize(SqlQueryExecution.java:551)
	at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:479)
	at com.facebook.presto.$gen.Presto_null__testversion____20250411_202601_4.run(Unknown Source)
	at com.facebook.presto.execution.SqlQueryManager.createQuery(SqlQueryManager.java:320)
	at com.facebook.presto.dispatcher.LocalDispatchQuery.lambda$startExecution$8(LocalDispatchQuery.java:210)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
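
The condition reported by the exception above amounts to a set difference: every symbol the node depends on must appear in its source's output. The sketch below is a simplified stand-in using plain strings, not the ValidateDependenciesChecker implementation; the class and method names are hypothetical, while the symbol names are copied from the exception message.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a "required symbols must be in source output" check.
class DependencyCheckSketch {
    // Returns the required symbols that the source plan does not produce.
    static Set<String> missing(List<String> required, List<String> sourceOutputs) {
        Set<String> result = new LinkedHashSet<>(required);
        result.removeAll(sourceOutputs);
        return result;
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("expr_0", "field_4", "x2", "field_8", "x3", "expr_11");
        List<String> sourceOutputs = Arrays.asList(
                "field_9", "field_5", "input_1_partition_size", "combined_row_number_24", "marker",
                "input_4_partition_size", "combined_row_number_26", "marker_30", "field",
                "combined_partition_size_27", "input_3_partition_size", "combined_partition_size",
                "input_1_row_number", "marker_28", "combined_partition_size_25", "input_2_partition_size",
                "input_2_row_number", "combined_row_number", "field_1", "input_3_row_number",
                "marker_29", "input_4_row_number");
        Set<String> missing = missing(required, sourceOutputs);
        if (!missing.isEmpty()) {
            System.out.println("Source symbols not in source plan output: " + missing);
        }
    }
}
```

None of the six required names appear in the source output, so all six are reported as missing. These are the keys of the markerVariables map shown further down.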

Before optimization

 source = {TableFunctionNode@17169} 
  tableArgumentProperties = {RegularImmutableList@17178}  size = 4
  name = "test_inputs_function"
  arguments = {RegularImmutableMap@17175}  size = 4
  outputVariables = {SingletonImmutableList@17176}  size = 1
  sources = {RegularImmutableList@17177}  size = 4
   0 = {ProjectNode@17185} 
    sourceLocation = {Optional@17191} "Optional[3:33]"
    source = {ProjectNode@17189} 
     sourceLocation = {Optional@17191} "Optional[3:33]"
     source = {ValuesNode@17238} 
     assignments = {Assignments@17239} 
     locality = {ProjectNode$Locality@17155} "UNKNOWN"
     id = {PlanNodeId@17240} "1"
     statsEquivalentPlanNode = {Optional@17143} "Optional.empty"
    assignments = {Assignments@17190} 
     assignments = {Collections$UnmodifiableMap@17195}  size = 1
      {VariableReferenceExpression@17200} "expr_0" -> {VariableReferenceExpression@17201} "expr"
     outputs = {Collections$UnmodifiableRandomAccessList@17196}  size = 1
    locality = {ProjectNode$Locality@17155} "UNKNOWN"
    id = {PlanNodeId@17192} "2"
    statsEquivalentPlanNode = {Optional@17143} "Optional.empty"
   1 = {ProjectNode@17186} 
   2 = {ProjectNode@17187} 
   3 = {ProjectNode@17188} 
  copartitioningLists = {RegularImmutableList@17179}  size = 0
  handle = {TableFunctionHandle@17180} 
  sourceLocation = {Optional@17143} "Optional.empty"
  id = {PlanNodeId@17181} "18"
  statsEquivalentPlanNode = {Optional@17143} "Optional.empty"
 assignments = {Assignments@17170} 
 locality = {ProjectNode$Locality@17155} "UNKNOWN"
 id = {PlanNodeId@17171} "19"
 statsEquivalentPlanNode = {Optional@17143} "Optional.empty"
assignments = {Assignments@17165} 
locality = {ProjectNode$Locality@17155} "UNKNOWN"
id = {PlanNodeId@17166} "20"
statsEquivalentPlanNode = {Optional@17143} "Optional.empty"
markerVariables = {Optional@17905} "Optional[{expr_0=marker, field_4=marker_28, x2=marker_28, field_8=marker_29, x3=marker_29, expr_11=marker_30}]"
 value = {RegularImmutableMap@17812}  size = 6
  {VariableReferenceExpression@17828} "expr_0" -> {VariableReferenceExpression@17829} "marker"
  {VariableReferenceExpression@17830} "field_4" -> {VariableReferenceExpression@17822} "marker_28"
  {VariableReferenceExpression@17821} "x2" -> {VariableReferenceExpression@17822} "marker_28"
  {VariableReferenceExpression@17823} "field_8" -> {VariableReferenceExpression@17824} "marker_29"
  {VariableReferenceExpression@17825} "x3" -> {VariableReferenceExpression@17824} "marker_29"
  {VariableReferenceExpression@17826} "expr_11" -> {VariableReferenceExpression@17827} "marker_30"

The inputs provided by the source ProjectNode:

inputs = {RegularImmutableSet@17800}  size = 22
 0 = {VariableReferenceExpression@17834} "input_4_row_number"
 1 = {VariableReferenceExpression@17835} "marker_29"
 2 = {VariableReferenceExpression@17836} "combined_partition_size_27"
 3 = {VariableReferenceExpression@17837} "input_3_row_number"
 4 = {VariableReferenceExpression@17838} "input_3_partition_size"
 5 = {VariableReferenceExpression@17839} "input_4_partition_size"
 6 = {VariableReferenceExpression@17840} "input_1_row_number"
 7 = {VariableReferenceExpression@17841} "combined_partition_size_25"
 8 = {VariableReferenceExpression@17842} "input_1_partition_size"
 9 = {VariableReferenceExpression@17843} "combined_row_number_24"
 10 = {VariableReferenceExpression@17844} "input_2_row_number"
 11 = {VariableReferenceExpression@17845} "combined_row_number_26"
 12 = {VariableReferenceExpression@17846} "field_1"
 13 = {VariableReferenceExpression@17847} "marker"
 14 = {VariableReferenceExpression@17848} "marker_30"
 15 = {VariableReferenceExpression@17849} "field_5"
 16 = {VariableReferenceExpression@17850} "combined_partition_size"
 17 = {VariableReferenceExpression@17851} "field_9"
 18 = {VariableReferenceExpression@17852} "marker_28"
 19 = {VariableReferenceExpression@17853} "input_2_partition_size"
 20 = {VariableReferenceExpression@17854} "combined_row_number"
 21 = {VariableReferenceExpression@17855} "field"
source = {ProjectNode@17799} 
 sourceLocation = {Optional@17861} "Optional.empty"
 source = {ExchangeNode@17858} 
  type = {ExchangeNode$Type@17867} "REPARTITION"
  scope = {ExchangeNode$Scope@17868} "LOCAL"
  sources = {SingletonImmutableList@17869}  size = 1
   0 = {ProjectNode@17876} 
    sourceLocation = {Optional@17861} "Optional.empty"
    source = {ProjectNode@17877} 
     sourceLocation = {Optional@17861} "Optional.empty"
     source = {JoinNode@17882} 
      type = {JoinType@17887} "LEFT"
      left = {ProjectNode@17888} 
       sourceLocation = {Optional@17861} "Optional.empty"
       source = {JoinNode@18008} 
        type = {JoinType@17887} "LEFT"
        left = {ProjectNode@18013} 
        right = {ExchangeNode@18014} 
        criteria = {Collections$UnmodifiableRandomAccessList@18015}  size = 0
        outputVariables = {Collections$UnmodifiableRandomAccessList@18016}  size = 11
        filter = {Optional@17861} "Optional.empty"
        leftHashVariable = {Optional@17861} "Optional.empty"
        rightHashVariable = {Optional@17861} "Optional.empty"
        distributionType = {Optional@18017} "Optional[REPLICATED]"
        dynamicFilters = {Collections$UnmodifiableMap@18018}  size = 0
        sourceLocation = {Optional@17861} "Optional.empty"
        id = {PlanNodeId@18019} "145"
        statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
       assignments = {Assignments@18009} 
       locality = {ProjectNode$Locality@17860} "LOCAL"
       id = {PlanNodeId@18010} "146"
       statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
      right = {WindowNode@17889} 
       source = {ValuesNode@18022} 
        outputVariables = {Collections$UnmodifiableRandomAccessList@18029}  size = 1
        rows = {Collections$UnmodifiableRandomAccessList@18030}  size = 2
        valuesNodeLabel = {Optional@17861} "Optional.empty"
        sourceLocation = {Optional@18031} "Optional[6:33]"
        id = {PlanNodeId@18032} "15"
        statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
       prePartitionedInputs = {Collections$UnmodifiableSet@18023}  size = 0
       specification = {DataOrganizationSpecification@18024} 
       preSortedOrderPrefix = 0
       windowFunctions = {Collections$UnmodifiableMap@18025}  size = 2
       hashVariable = {Optional@17861} "Optional.empty"
       sourceLocation = {Optional@17861} "Optional.empty"
       id = {PlanNodeId@18026} "142"
       statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
      criteria = {Collections$UnmodifiableRandomAccessList@17890}  size = 0
      outputVariables = {Collections$UnmodifiableRandomAccessList@17891}  size = 16
      filter = {Optional@17861} "Optional.empty"
      leftHashVariable = {Optional@17861} "Optional.empty"
      rightHashVariable = {Optional@17861} "Optional.empty"
      distributionType = {Optional@17892} "Optional[REPLICATED]"
      dynamicFilters = {Collections$UnmodifiableMap@17893}  size = 0
      sourceLocation = {Optional@17861} "Optional.empty"
      id = {PlanNodeId@17894} "147"
      statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
     assignments = {Assignments@17883} 
     locality = {ProjectNode$Locality@17860} "LOCAL"
     id = {PlanNodeId@17884} "148"
     statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
    assignments = {Assignments@17878} 
    locality = {ProjectNode$Locality@17860} "LOCAL"
    id = {PlanNodeId@17879} "149"
    statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
  partitioningScheme = {PartitioningScheme@17870} "ConnectorTableMetadata{partitioning=Partitioning{handle=HASH, arguments=[field_1, field_5]}, outputLayout=[input_4_row_number, marker_29, combined_partition_size_27, input_3_row_number, input_3_partition_size, input_4_partition_size, input_1_row_number, combined_partition_size_25, input_1_partition_size, combined_row_number_24, input_2_row_number, combined_row_number_26, field_1, marker, marker_30, field_5, combined_partition_size, field_9, marker_28, input_2_partition_size, combined_row_number, field, $hashvalue], hashChannel=Optional[$hashvalue], replicateNullsAndAny=false, scaleWriters=false, encoding=COLUMNAR, bucketToPartition=Optional.empty}"
  inputs = {SingletonImmutableList@17871}  size = 1
  ensureSourceOrdering = false
  orderingScheme = {Optional@17861} "Optional.empty"
  sourceLocation = {Optional@17861} "Optional.empty"
  id = {PlanNodeId@17872} "1260"
  statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
 assignments = {Assignments@17859} 
  assignments = {Collections$UnmodifiableMap@17865}  size = 22
   {VariableReferenceExpression@17834} "input_4_row_number" -> {VariableReferenceExpression@17834} "input_4_row_number"
   {VariableReferenceExpression@17835} "marker_29" -> {VariableReferenceExpression@17835} "marker_29"
   {VariableReferenceExpression@17836} "combined_partition_size_27" -> {VariableReferenceExpression@17836} "combined_partition_size_27"
   {VariableReferenceExpression@17837} "input_3_row_number" -> {VariableReferenceExpression@17837} "input_3_row_number"
   {VariableReferenceExpression@17838} "input_3_partition_size" -> {VariableReferenceExpression@17838} "input_3_partition_size"
   {VariableReferenceExpression@17839} "input_4_partition_size" -> {VariableReferenceExpression@17839} "input_4_partition_size"
   {VariableReferenceExpression@17840} "input_1_row_number" -> {VariableReferenceExpression@17840} "input_1_row_number"
   {VariableReferenceExpression@17841} "combined_partition_size_25" -> {VariableReferenceExpression@17841} "combined_partition_size_25"
   {VariableReferenceExpression@17842} "input_1_partition_size" -> {VariableReferenceExpression@17842} "input_1_partition_size"
   {VariableReferenceExpression@17843} "combined_row_number_24" -> {VariableReferenceExpression@17843} "combined_row_number_24"
   {VariableReferenceExpression@17844} "input_2_row_number" -> {VariableReferenceExpression@17844} "input_2_row_number"
   {VariableReferenceExpression@17845} "combined_row_number_26" -> {VariableReferenceExpression@17845} "combined_row_number_26"
   {VariableReferenceExpression@17846} "field_1" -> {VariableReferenceExpression@17846} "field_1"
   {VariableReferenceExpression@17847} "marker" -> {VariableReferenceExpression@17847} "marker"
   {VariableReferenceExpression@17848} "marker_30" -> {VariableReferenceExpression@17848} "marker_30"
   {VariableReferenceExpression@17849} "field_5" -> {VariableReferenceExpression@17849} "field_5"
   {VariableReferenceExpression@17850} "combined_partition_size" -> {VariableReferenceExpression@17850} "combined_partition_size"
   {VariableReferenceExpression@17851} "field_9" -> {VariableReferenceExpression@17851} "field_9"
   {VariableReferenceExpression@17852} "marker_28" -> {VariableReferenceExpression@17852} "marker_28"
   {VariableReferenceExpression@17853} "input_2_partition_size" -> {VariableReferenceExpression@17853} "input_2_partition_size"
   {VariableReferenceExpression@17854} "combined_row_number" -> {VariableReferenceExpression@17854} "combined_row_number"
   {VariableReferenceExpression@17855} "field" -> {VariableReferenceExpression@17855} "field"
  outputs = {Collections$UnmodifiableRandomAccessList@17866}  size = 22
 locality = {ProjectNode$Locality@17860} "LOCAL"
 id = {PlanNodeId@17862} "1380"
 statsEquivalentPlanNode = {Optional@17861} "Optional.empty"
inputs = {RegularImmutableSet@17800}  size = 22
 0 = {VariableReferenceExpression@17834} "input_4_row_number"
 1 = {VariableReferenceExpression@17835} "marker_29"
 2 = {VariableReferenceExpression@17836} "combined_partition_size_27"
 3 = {VariableReferenceExpression@17837} "input_3_row_number"
 4 = {VariableReferenceExpression@17838} "input_3_partition_size"
 5 = {VariableReferenceExpression@17839} "input_4_partition_size"
 6 = {VariableReferenceExpression@17840} "input_1_row_number"
 7 = {VariableReferenceExpression@17841} "combined_partition_size_25"
 8 = {VariableReferenceExpression@17842} "input_1_partition_size"
 9 = {VariableReferenceExpression@17843} "combined_row_number_24"
 10 = {VariableReferenceExpression@17844} "input_2_row_number"
 11 = {VariableReferenceExpression@17845} "combined_row_number_26"
 12 = {VariableReferenceExpression@17846} "field_1"
 13 = {VariableReferenceExpression@17847} "marker"
 14 = {VariableReferenceExpression@17848} "marker_30"
 15 = {VariableReferenceExpression@17849} "field_5"
 16 = {VariableReferenceExpression@17850} "combined_partition_size"
 17 = {VariableReferenceExpression@17851} "field_9"
 18 = {VariableReferenceExpression@17852} "marker_28"
 19 = {VariableReferenceExpression@17853} "input_2_partition_size"
 20 = {VariableReferenceExpression@17854} "combined_row_number"
 21 = {VariableReferenceExpression@17855} "field"
passThroughSymbols = {RegularImmutableSet@17801}  size = 2
requiredSymbols = {RegularImmutableSet@17802}  size = 4
 0 = {VariableReferenceExpression@17805} "field"
 1 = {VariableReferenceExpression@17806} "field_1"
 2 = {VariableReferenceExpression@17807} "field_5"
 3 = {VariableReferenceExpression@17808} "field_9"

@mohsaka
Owner Author

mohsaka commented Apr 11, 2025

Marker variables are already set to these values by the time map is called from UnaliasSymbolReferences (visitTableFunctionProcessor).

markerVariables = {Optional@17432} "Optional[{expr_0=marker, field_4=marker_28, x2=marker_28, field_8=marker_29, x3=marker_29, expr_11=marker_30}]"
 value = {RegularImmutableMap@17441}  size = 6
  {VariableReferenceExpression@17450} "expr_0" -> {VariableReferenceExpression@17451} "marker"
  {VariableReferenceExpression@17452} "field_4" -> {VariableReferenceExpression@17453} "marker_28"
  {VariableReferenceExpression@17454} "x2" -> {VariableReferenceExpression@17453} "marker_28"
  {VariableReferenceExpression@17455} "field_8" -> {VariableReferenceExpression@17456} "marker_29"
  {VariableReferenceExpression@17457} "x3" -> {VariableReferenceExpression@17456} "marker_29"
  {VariableReferenceExpression@17458} "expr_11" -> {VariableReferenceExpression@17459} "marker_30"

expr_0 is assigned to expr at Project Node.
expr is assigned to field at inner Project Node.
Values node returns field.

  source = {ProjectNode@17202} 
  sourceLocation = {Optional@17204} "Optional[3:33]"
  source = {ValuesNode@17208} 
   outputVariables = {Collections$UnmodifiableRandomAccessList@17213}  size = 1
    0 = {VariableReferenceExpression@17219} "field"
   rows = {Collections$UnmodifiableRandomAccessList@17214}  size = 3
   valuesNodeLabel = {Optional@17156} "Optional.empty"
   sourceLocation = {Optional@17204} "Optional[3:33]"
   id = {PlanNodeId@17215} "0"
   statsEquivalentPlanNode = {Optional@17156} "Optional.empty"
  assignments = {Assignments@17209} 
   assignments = {Collections$UnmodifiableMap@17230}  size = 1
    {VariableReferenceExpression@17235} "expr" -> {VariableReferenceExpression@17236} "field"
   outputs = {Collections$UnmodifiableRandomAccessList@17231}  size = 1
    0 = {VariableReferenceExpression@17235} "expr"
  locality = {ProjectNode$Locality@17168} "UNKNOWN"
  id = {PlanNodeId@17210} "1"
  statsEquivalentPlanNode = {Optional@17156} "Optional.empty"
 assignments = {Assignments@17203} 
  assignments = {Collections$UnmodifiableMap@17221}  size = 1
   {VariableReferenceExpression@17226} "expr_0" -> {VariableReferenceExpression@17227} "expr"
  outputs = {Collections$UnmodifiableRandomAccessList@17222}  size = 1
 locality = {ProjectNode$Locality@17168} "UNKNOWN"
 id = {PlanNodeId@17205} "2"
 statsEquivalentPlanNode = {Optional@17156} "Optional.empty"
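
The assignment chain in the dump above (expr_0 is assigned from expr, which is assigned from field) can be followed mechanically. The sketch below is a hypothetical helper with plain strings in place of VariableReferenceExpression; it is not what UnaliasSymbolReferences does, just a way to see which source column a name ultimately refers to.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; follows output -> input assignments until a name with no assignment is reached.
class AliasChainSketch {
    static String resolve(Map<String, String> assignments, String name) {
        String current = name;
        Set<String> seen = new HashSet<>();
        // The seen set guards against cycles in malformed input.
        while (assignments.containsKey(current) && seen.add(current)) {
            String next = assignments.get(current);
            if (next.equals(current)) {
                break; // identity assignment, nothing further to follow
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> assignments = new HashMap<>();
        assignments.put("expr_0", "expr"); // outer ProjectNode (id 2)
        assignments.put("expr", "field");  // inner ProjectNode (id 1)
        System.out.println(resolve(assignments, "expr_0"));
    }
}
```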

We need to investigate where this double project node was turned into the window node with joins. I think that is where we lose track of expr_0.

Post join, at least from the Project Node, expr_0 is still there and being assigned.

source = {ProjectNode@17589} 
 sourceLocation = {Optional@17598} "Optional[3:33]"
 source = {ProjectNode@17596} 
  sourceLocation = {Optional@17598} "Optional[3:33]"
  source = {ValuesNode@17602} 
  assignments = {Assignments@17603} 
   assignments = {Collections$UnmodifiableMap@17615}  size = 1
    {VariableReferenceExpression@17620} "expr" -> {VariableReferenceExpression@17621} "field"
   outputs = {Collections$UnmodifiableRandomAccessList@17616}  size = 1
  locality = {ProjectNode$Locality@17543} "UNKNOWN"
  id = {PlanNodeId@17604} "1"
  statsEquivalentPlanNode = {Optional@17534} "Optional.empty"
 assignments = {Assignments@17597} 
  assignments = {Collections$UnmodifiableMap@17624}  size = 1
   {VariableReferenceExpression@17629} "expr_0" -> {VariableReferenceExpression@17630} "expr"
  outputs = {Collections$UnmodifiableRandomAccessList@17625}  size = 1
 locality = {ProjectNode$Locality@17543} "UNKNOWN"
 id = {PlanNodeId@17599} "2"
 statsEquivalentPlanNode = {Optional@17534} "Optional.empty"
prePartitionedInputs = {Collections$UnmodifiableSet@17590}  size = 0
specification = {DataOrganizationSpecification@17591} 
preSortedOrderPrefix = 0
windowFunctions = {Collections$UnmodifiableMap@17592}  size = 2
hashVariable = {Optional@17534} "Optional.empty"
sourceLocation = {Optional@17534} "Optional.empty"
id = {PlanNodeId@17593} "139"
statsEquivalentPlanNode = {Optional@17534} "Optional.empty"

Output variables from the join are

0 = {VariableReferenceExpression@17629} "expr_0"
1 = {VariableReferenceExpression@17634} "input_1_row_number"
2 = {VariableReferenceExpression@17635} "input_1_partition_size"
3 = {VariableReferenceExpression@17636} "field_4"
4 = {VariableReferenceExpression@17637} "x2"
5 = {VariableReferenceExpression@17638} "input_2_row_number"
6 = {VariableReferenceExpression@17639} "input_2_partition_size"

@mohsaka
Owner Author

mohsaka commented Apr 12, 2025

So the issue is here, when we convert to an exchange.

   id = {PlanNodeId@17935} "143"
 filter = {Optional@17811} "Optional.empty"
 outputVariables = {Collections$UnmodifiableRandomAccessList@17918}  size = 11
  sourceLocation = {Optional@17811} "Optional.empty"
   rightHashVariable = {Optional@17811} "Optional.empty"
       rows = {Collections$UnmodifiableRandomAccessList@17954}  size = 3
     0 = {WindowNode@17945} 
id = {PlanNodeId@17912} "146"
 assignments = {Collections$UnmodifiableMap@18016}  size = 13
      windowFunctions = {Collections$UnmodifiableMap@17949}  size = 2
 distributionType = {Optional@17919} "Optional[REPLICATED]"
locality = {ProjectNode$Locality@17823} "LOCAL"
   sourceLocation = {Optional@17811} "Optional.empty"
   leftHashVariable = {Optional@17811} "Optional.empty"
 right = {ExchangeNode@17916} 
    5 = {VariableReferenceExpression@17968} "input_2_partition_size"
    scope = {ExchangeNode$Scope@17881} "LOCAL"
   statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
      preSortedOrderPrefix = 0
         name = "field"
    1 = {VariableReferenceExpression@17964} "input_1_row_number"
   filter = {Optional@17811} "Optional.empty"
    type = {ExchangeNode$Type@17880} "REPARTITION"
  statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
       sourceLocation = {Optional@17955} "Optional[3:33]"
 left = {ProjectNode@17915} 
      specification = {DataOrganizationSpecification@17948} 
assignments = {Assignments@17911} 
 rightHashVariable = {Optional@17811} "Optional.empty"
   right = {ExchangeNode@17930} 
 sourceLocation = {Optional@17811} "Optional.empty"
    0 = {VariableReferenceExpression@17963} "field"
      hashVariable = {Optional@17811} "Optional.empty"
    inputs = {SingletonImmutableList@17940}  size = 1
    sourceLocation = {Optional@17811} "Optional.empty"
    2 = {VariableReferenceExpression@17965} "input_1_partition_size"
   outputVariables = {Collections$UnmodifiableRandomAccessList@17932}  size = 6
   dynamicFilters = {Collections$UnmodifiableMap@17934}  size = 0
 criteria = {Collections$UnmodifiableRandomAccessList@17917}  size = 0
    partitioningScheme = {PartitioningScheme@17939} "ConnectorTableMetadata{partitioning=Partitioning{handle=ROUND_ROBIN, arguments=[]}, outputLayout=[field, input_1_row_number, input_1_partition_size], hashChannel=Optional.empty, replicateNullsAndAny=false, scaleWriters=false, encoding=COLUMNAR, bucketToPartition=Optional.empty}"
      id = {PlanNodeId@17950} "139"
      source = {ValuesNode@17946} 
         type = {IntegerType@17986} "integer"
  source = {JoinNode@17924} 
   criteria = {Collections$UnmodifiableRandomAccessList@17931}  size = 0
         sourceLocation = {Optional@17987} "Optional[3:33]"
  id = {PlanNodeId@17926} "144"
       statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
 leftHashVariable = {Optional@17811} "Optional.empty"
      sourceLocation = {Optional@17811} "Optional.empty"
 dynamicFilters = {Collections$UnmodifiableMap@17920}  size = 0
   distributionType = {Optional@17933} "Optional[REPLICATED]"
    ensureSourceOrdering = false
   type = {JoinType@17900} "LEFT"
    id = {PlanNodeId@17941} "1255"
    3 = {VariableReferenceExpression@17966} "field_1"
    sources = {SingletonImmutableList@17938}  size = 1
    orderingScheme = {Optional@17811} "Optional.empty"
        0 = {VariableReferenceExpression@17960} "field"
    statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
   left = {ExchangeNode@17929} 
 outputs = {Collections$UnmodifiableRandomAccessList@18017}  size = 13
      prePartitionedInputs = {Collections$UnmodifiableSet@17947}  size = 0
    4 = {VariableReferenceExpression@17967} "input_2_row_number"
      statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
       id = {PlanNodeId@17956} "0"
  locality = {ProjectNode$Locality@17823} "LOCAL"
statsEquivalentPlanNode = {Optional@17811} "Optional.empty"
       valuesNodeLabel = {Optional@17811} "Optional.empty"
       outputVariables = {Collections$UnmodifiableRandomAccessList@17953}  size = 1
  assignments = {Assignments@17925} 
  {VariableReferenceExpression@18057} "combined_partition_size_25" -> {SpecialFormExpression@18058} "IF(GREATER_THAN(COALESCE(combined_partition_size, -1), COALESCE(input_3_partition_size, -1)), combined_partition_size, input_3_partition_size)"
  {VariableReferenceExpression@18055} "combined_row_number_24" -> {SpecialFormExpression@18056} "IF(GREATER_THAN(COALESCE(combined_row_number, -1), COALESCE(input_3_row_number, -1)), combined_row_number, input_3_row_number)"
  {VariableReferenceExpression@18053} "input_3_partition_size" -> {VariableReferenceExpression@18054} "input_3_partition_size"
  {VariableReferenceExpression@18033} "field" -> {VariableReferenceExpression@18034} "field"
  {VariableReferenceExpression@18035} "input_1_row_number" -> {VariableReferenceExpression@18036} "input_1_row_number"
  {VariableReferenceExpression@18037} "input_1_partition_size" -> {VariableReferenceExpression@18038} "input_1_partition_size"
  {VariableReferenceExpression@18039} "field_1" -> {VariableReferenceExpression@18040} "field_1"
  {VariableReferenceExpression@18041} "input_2_row_number" -> {VariableReferenceExpression@18042} "input_2_row_number"
  {VariableReferenceExpression@18043} "input_2_partition_size" -> {VariableReferenceExpression@18044} "input_2_partition_size"
  {VariableReferenceExpression@18045} "combined_row_number" -> {VariableReferenceExpression@18046} "combined_row_number"
  {VariableReferenceExpression@18047} "combined_partition_size" -> {VariableReferenceExpression@18048} "combined_partition_size"
  {VariableReferenceExpression@18049} "field_5" -> {VariableReferenceExpression@18050} "field_5"
  {VariableReferenceExpression@18051} "input_3_row_number" -> {VariableReferenceExpression@18052} "input_3_row_number"
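
For reading the dump above: the two non-identity assignments (`combined_row_number_24` and `combined_partition_size_25`) both have the shape `IF(GREATER_THAN(COALESCE(a, -1), COALESCE(b, -1)), a, b)`. A minimal Java sketch of that expression, assuming a `null` stands for a side that contributed no row and that real values are never negative; the class and method names are made up for illustration and are not part of Presto:

```java
public final class CombinedRowNumberSketch {
    private CombinedRowNumberSketch() {}

    // Mirrors IF(GREATER_THAN(COALESCE(left, -1), COALESCE(right, -1)), left, right).
    // Assumption: null marks a missing side, and real values are >= 0.
    public static Long combine(Long left, Long right) {
        long l = (left == null) ? -1L : left;
        long r = (right == null) ? -1L : right;
        // Return the original (possibly null) operand, not the coalesced value.
        return (l > r) ? left : right;
    }

    public static void main(String[] args) {
        System.out.println(combine(3L, 5L));     // both present: larger one wins
        System.out.println(combine(4L, null));   // right missing: left is kept
        System.out.println(combine(null, 2L));   // left missing: right is kept
        System.out.println(combine(null, null)); // both missing: stays null
    }
}
```

Under those assumptions the expression behaves like a null-tolerant maximum, which is why it is chained once per additional input.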

@mohsaka
Owner Author

mohsaka commented Apr 12, 2025

We now have field -> field, and expr_0 is gone. I think we need to update the marker variables of TableFunctionProcessorNode when we make this optimization. By the time we get to Unalias, the rewrite has already happened, so we can't do it there: the expr_0 context has already disappeared.
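
A rough sketch of the idea, using plain strings instead of Presto's variable classes. Everything here is hypothetical (the class, the `remap` helper, and the `marker_1` name); it only illustrates that whichever step renames a variable also has to push that rename through the marker mapping, and it is not the actual fix:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class MarkerRemapSketch {
    private MarkerRemapSketch() {}

    // Rewrites both keys and values of a (source variable -> marker variable) map
    // through a rename mapping; names without an entry are kept unchanged.
    public static Map<String, String> remap(Map<String, String> markers, Map<String, String> renames) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : markers.entrySet()) {
            String source = renames.getOrDefault(entry.getKey(), entry.getKey());
            String marker = renames.getOrDefault(entry.getValue(), entry.getValue());
            result.put(source, marker);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> markers = new LinkedHashMap<>();
        markers.put("expr_0", "marker_1"); // hypothetical marker name

        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("expr_0", "field");    // the rewrite that collapses expr_0 into field

        // The marker entry follows the rename instead of pointing at a vanished variable.
        System.out.println(remap(markers, renames));
    }
}
```

If the rename is applied to the plan but not to the marker mapping, the entry keeps pointing at expr_0, which no longer exists downstream.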

@xin-zhang2
Collaborator

Fixed the markerVariables issue.
A new issue now shows up in the last case of testInputPartitioning.

Caused by: com.facebook.presto.ExceededMemoryLimitException: Query exceeded per-node user memory limit of 1.42GB [Allocated: 1.42GB, Delta: 1.65MB (PartitionAndSort), Top Consumers: {PartitionAndSort=1.42GB, HashBuilderOperator=2.40MB, ScanFilterAndProjectOperator=41.19kB}, Details: [{"taskId":"2.0.0","reservation":"1.42GB","topConsumers":[{"type":"TableFunctionOperator","planNodeId":"9","reservations":["472.84MB","440.30MB","275.01MB","265.53MB"],"total":"1.42GB"},{"type":"HashBuilderOperator","planNodeId":"97","reservations":["1.75MB"],"total":"1.75MB","info":"LEFT;REPLICATED;"},{"type":"HashBuilderOperator","planNodeId":"99","reservations":["525.43kB"],"total":"525.43kB","info":"LEFT;REPLICATED;"}]}]]
	at com.facebook.presto.ExceededMemoryLimitException.exceededLocalUserMemoryLimit(ExceededMemoryLimitException.java:56)
	at com.facebook.presto.memory.QueryContext.enforceUserMemoryLimit(QueryContext.java:500)
	at com.facebook.presto.memory.QueryContext.updateUserMemory(QueryContext.java:196)
	at com.facebook.presto.memory.QueryContext$QueryMemoryReservationHandler.reserveMemory(QueryContext.java:469)
	at com.facebook.presto.memory.context.RootAggregatedMemoryContext.updateBytes(RootAggregatedMemoryContext.java:37)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.SimpleLocalMemoryContext.setBytes(SimpleLocalMemoryContext.java:82)
	at com.facebook.presto.operator.OperatorContext$InternalLocalMemoryContext.setBytes(OperatorContext.java:717)
	at com.facebook.presto.operator.OperatorContext$InternalLocalMemoryContext.setBytes(OperatorContext.java:711)
	at com.facebook.presto.operator.TableFunctionOperator$PartitionAndSort.updateMemoryUsage(TableFunctionOperator.java:464)
	at com.facebook.presto.operator.TableFunctionOperator$PartitionAndSort.process(TableFunctionOperator.java:444)
	at com.facebook.presto.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:262)
	at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
	at com.facebook.presto.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:249)
	at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
	at com.facebook.presto.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:249)
	at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
	at com.facebook.presto.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:249)
	at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
	at com.facebook.presto.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:249)
	at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
	at com.facebook.presto.operator.TableFunctionOperator.getOutput(TableFunctionOperator.java:341)
	at com.facebook.presto.operator.Driver.processInternal(Driver.java:441)
	at com.facebook.presto.operator.Driver.lambda$processFor$10(Driver.java:324)
	at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:750)
	at com.facebook.presto.operator.Driver.processFor(Driver.java:317)
	at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1079)
	at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:165)
	at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:621)
	at com.facebook.presto.$gen.Presto_null__testversion____20250414_193029_3.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

@mohsaka
Owner Author

mohsaka commented Apr 15, 2025

Was busy with meetings almost the whole day, but did a little investigating. Noticed that the list of source types is much longer in Presto than in Trino: Presto has 47 source types while Trino has only 2. Expected positions is the same. This seems to make the Presto pagesIndex much larger at the start, though I'm not sure this is the cause of the issue.

92,432 vs 324,496.

Increasing the query memory limit did not help either.

@mohsaka
Owner Author

mohsaka commented Apr 17, 2025

Current plan

EXPLAIN SELECT DISTINCT regionkey, nationkey FROM TABLE(system.test_inputs_function(input_1 => TABLE(tpch.tiny.nation),input_2 => TABLE(tpch.tiny.nation) PARTITION BY regionkey ORDER BY name,input_3 => TABLE(tpch.tiny.customer) PARTITION BY nationkey,input_4 => TABLE(tpch.tiny.customer)))
 actual column types:
 [varchar(26145)]
expected column types:
[bigint, bigint]

not equal
Actual rows (1 of 1 extra rows shown, 1 rows in total):
    [- Output[PlanNodeId 15][regionkey, nationkey] => [regionkey_2:bigint, nationkey_8:bigint]
        regionkey := regionkey_2 (1:25)
        nationkey := nationkey_8 (1:36)
    - RemoteStreamingExchange[PlanNodeId 1205][GATHER - COLUMNAR] => [regionkey_2:bigint, nationkey_8:bigint]
        - Project[PlanNodeId 1550][projectLocality = LOCAL] => [regionkey_2:bigint, nationkey_8:bigint]
            - Aggregate(FINAL)[regionkey_2, nationkey_8][$hashvalue][PlanNodeId 11] => [regionkey_2:bigint, nationkey_8:bigint, $hashvalue:bigint]
                - LocalExchange[PlanNodeId 1386][HASH][$hashvalue] (regionkey_2, nationkey_8) => [regionkey_2:bigint, nationkey_8:bigint, $hashvalue:bigint]
                    - RemoteStreamingExchange[PlanNodeId 1392][REPARTITION - COLUMNAR][$hashvalue_35] => [regionkey_2:bigint, nationkey_8:bigint, $hashvalue_35:bigint]
                        - Aggregate(PARTIAL)[regionkey_2, nationkey_8][$hashvalue_44][PlanNodeId 1390] => [regionkey_2:bigint, nationkey_8:bigint, $hashvalue_44:bigint]
                            - Project[PlanNodeId 10][projectLocality = LOCAL] => [regionkey_2:bigint, nationkey_8:bigint, $hashvalue_44:bigint]
                                    $hashvalue_44 := combine_hash(combine_hash(BIGINT'0', COALESCE($operator$hash_code(regionkey_2), BIGINT'0')), COALESCE($operator$hash_code(nationkey_8), BIGINT'0')) (1:169)
                                - TableFunctionDataProcessor{name=test_inputs_function, properOutputs=[boolean_result], partitionBy=[regionkey_2, nationkey_8], orderBy=[combined_row_number_30 ASC_NULLS_LAST]}[PlanNodeId 9] => [boolean_result:boolean, regionkey_2:bigint, nationkey_8:bigint]
                                    - LocalExchange[PlanNodeId 1343][HASH][$hashvalue_36] (regionkey_2, nationkey_8) => [marker:bigint, regionkey_2:bigint, input_2_partition_size:bigint, custkey_12:bigint, comment_9:varchar(117), marker_32:bigint, acctbal:double, phone:varchar(15), nationkey_0:bigint, marker_34:bigint, address:varchar(40), regionkey:bigint, comment_3:varchar(152), custkey:bigint, nationkey_8:bigint, address_14:varchar(40), name_13:varchar(25), nationkey:bigint, input_4_partition_size:bigint, row_number_4:bigint, combined_row_number_28:bigint, row_number:bigint, input_1_partition_size:bigint, combined_partition_size:bigint, acctbal_17:double, combined_row_number_30:bigint, comment_19:varchar(117), mktsegment_18:varchar(10), row_number_20:bigint, combined_row_number:bigint, marker_33:bigint, row_number_10:bigint, mktsegment:varchar(10), comment:varchar(152), input_4_row_number:bigint, input_3_partition_size:bigint, combined_partition_size_29:bigint, name_7:varchar(25), name:varchar(25), input_3_row_number:bigint, phone_16:varchar(15), combined_partition_size_31:bigint, input_2_row_number:bigint, name_1:varchar(25), input_1_row_number:bigint, nationkey_15:bigint, $hashvalue_36:bigint]
                                        - Project[PlanNodeId 101][projectLocality = LOCAL] => [marker:bigint, regionkey_2:bigint, input_2_partition_size:bigint, custkey_12:bigint, comment_9:varchar(117), marker_32:bigint, acctbal:double, phone:varchar(15), nationkey_0:bigint, marker_34:bigint, address:varchar(40), regionkey:bigint, comment_3:varchar(152), custkey:bigint, nationkey_8:bigint, address_14:varchar(40), name_13:varchar(25), nationkey:bigint, input_4_partition_size:bigint, row_number_4:bigint, combined_row_number_28:bigint, row_number:bigint, input_1_partition_size:bigint, combined_partition_size:bigint, acctbal_17:double, combined_row_number_30:bigint, comment_19:varchar(117), mktsegment_18:varchar(10), row_number_20:bigint, combined_row_number:bigint, marker_33:bigint, row_number_10:bigint, mktsegment:varchar(10), comment:varchar(152), input_4_row_number:bigint, input_3_partition_size:bigint, combined_partition_size_29:bigint, name_7:varchar(25), name:varchar(25), input_3_row_number:bigint, phone_16:varchar(15), combined_partition_size_31:bigint, input_2_row_number:bigint, name_1:varchar(25), input_1_row_number:bigint, nationkey_15:bigint, $hashvalue_43:bigint]
                                                marker := IF((input_1_row_number) = (combined_row_number_30), input_1_row_number, null)
                                                marker_32 := IF((input_2_row_number) = (combined_row_number_30), input_2_row_number, null)
                                                marker_34 := IF((input_4_row_number) = (combined_row_number_30), input_4_row_number, null)
                                                marker_33 := IF((input_3_row_number) = (combined_row_number_30), input_3_row_number, null)
                                                combined_partition_size_31 := IF((COALESCE(combined_partition_size_29, BIGINT'-1')) > (COALESCE(input_4_partition_size, BIGINT'-1')), combined_partition_size_29, input_4_partition_size)
                                            - Project[PlanNodeId 100][projectLocality = LOCAL] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint, combined_row_number:bigint, combined_partition_size:bigint, custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint, combined_row_number_28:bigint, combined_partition_size_29:bigint, custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint, input_4_row_number:bigint, input_4_partition_size:bigint, combined_row_number_30:bigint, $hashvalue_43:bigint]
                                                    combined_row_number_30 := IF((COALESCE(combined_row_number_28, BIGINT'-1')) > (COALESCE(input_4_row_number, BIGINT'-1')), combined_row_number_28, input_4_row_number)
                                                    $hashvalue_43 := combine_hash(combine_hash(BIGINT'0', COALESCE($operator$hash_code(regionkey_2), BIGINT'0')), COALESCE($operator$hash_code(nationkey_8), BIGINT'0')) (1:168)
                                                - LeftJoin[PlanNodeId 99][] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint, combined_row_number:bigint, combined_partition_size:bigint, custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint, combined_row_number_28:bigint, combined_partition_size_29:bigint, custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint, input_4_row_number:bigint, input_4_partition_size:bigint]
                                                        Distribution: REPLICATED
                                                    - Project[PlanNodeId 98][projectLocality = LOCAL] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint, combined_row_number:bigint, combined_partition_size:bigint, custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint, combined_row_number_28:bigint, combined_partition_size_29:bigint]
                                                            combined_row_number_28 := IF((COALESCE(combined_row_number, BIGINT'-1')) > (COALESCE(input_3_row_number, BIGINT'-1')), combined_row_number, input_3_row_number)
                                                            combined_partition_size_29 := IF((COALESCE(combined_partition_size, BIGINT'-1')) > (COALESCE(input_3_partition_size, BIGINT'-1')), combined_partition_size, input_3_partition_size)
                                                        - LeftJoin[PlanNodeId 97][] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint, combined_row_number:bigint, combined_partition_size:bigint, custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint]
                                                                Distribution: REPLICATED
                                                            - Project[PlanNodeId 96][projectLocality = LOCAL] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint, combined_row_number:bigint, combined_partition_size:bigint]
                                                                    combined_row_number := IF((COALESCE(input_1_row_number, BIGINT'-1')) > (COALESCE(input_2_row_number, BIGINT'-1')), input_1_row_number, input_2_row_number)
                                                                    combined_partition_size := IF((COALESCE(input_1_partition_size, BIGINT'-1')) > (COALESCE(input_2_partition_size, BIGINT'-1')), input_1_partition_size, input_2_partition_size)
                                                                - LeftJoin[PlanNodeId 95][] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint]
                                                                        Distribution: REPLICATED
                                                                    - LocalExchange[PlanNodeId 1337][ROUND_ROBIN] () => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint]
                                                                        - Window[PlanNodeId 91][] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint, input_1_row_number:bigint, input_1_partition_size:bigint]
                                                                                input_1_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                                input_1_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                            - LocalExchange[PlanNodeId 1336][SINGLE] () => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint]
                                                                                    Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 2,959.00, memory: 0.00, network: 2,959.00}
                                                                                - RemoteStreamingExchange[PlanNodeId 1198][GATHER - COLUMNAR] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint]
                                                                                        Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 2,959.00, memory: 0.00, network: 2,959.00}
                                                                                    - TableScan[PlanNodeId 0][TableHandle {connectorId='tpch', connectorHandle='nation:sf0.01', layout='Optional[nation:sf0.01]'}] => [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), row_number:bigint]
                                                                                            Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 2,959.00, memory: 0.00, network: 0.00}
                                                                                            row_number := tpch:row_number (1:96)
                                                                                            regionkey := tpch:regionkey (1:96)
                                                                                            nationkey := tpch:nationkey (1:96)
                                                                                            name := tpch:name (1:96)
                                                                                            comment := tpch:comment (1:96)
                                                                    - LocalExchange[PlanNodeId 1339][SINGLE] () => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint]
                                                                        - RemoteStreamingExchange[PlanNodeId 1200][GATHER - COLUMNAR] => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint]
                                                                            - Project[PlanNodeId 1547][projectLocality = LOCAL] => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, input_2_row_number:bigint, input_2_partition_size:bigint]
                                                                                - Window[PlanNodeId 92][partition by (regionkey_2), order by (name_1 ASC_NULLS_LAST)][$hashvalue_37] => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, $hashvalue_37:bigint, input_2_row_number:bigint, input_2_partition_size:bigint]
                                                                                        input_2_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                                        input_2_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                                    - LocalExchange[PlanNodeId 1338][HASH][$hashvalue_37] (regionkey_2) => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, $hashvalue_37:bigint]
                                                                                            Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 12,511.00, memory: 0.00, network: 3,184.00}
                                                                                        - RemoteStreamingExchange[PlanNodeId 1199][REPARTITION - COLUMNAR][$hashvalue_38] => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, $hashvalue_38:bigint]
                                                                                                Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 9,327.00, memory: 0.00, network: 3,184.00}
                                                                                            - ScanProject[PlanNodeId 1,1546][table = TableHandle {connectorId='tpch', connectorHandle='nation:sf0.01', layout='Optional[nation:sf0.01]'}, projectLocality = LOCAL] => [nationkey_0:bigint, name_1:varchar(25), regionkey_2:bigint, comment_3:varchar(152), row_number_4:bigint, $hashvalue_39:bigint]
                                                                                                    Estimates: {source: CostBasedSourceInfo, rows: 25 (450B), cpu: 2,959.00, memory: 0.00, network: 0.00}/{source: CostBasedSourceInfo, rows: 25 (450B), cpu: 6,143.00, memory: 0.00, network: 0.00}
                                                                                                    $hashvalue_39 := combine_hash(BIGINT'0', COALESCE($operator$hash_code(regionkey_2), BIGINT'0')) (1:131)
                                                                                                    comment_3 := tpch:comment (1:131)
                                                                                                    regionkey_2 := tpch:regionkey (1:131)
                                                                                                    row_number_4 := tpch:row_number (1:131)
                                                                                                    nationkey_0 := tpch:nationkey (1:131)
                                                                                                    name_1 := tpch:name (1:131)
                                                            - LocalExchange[PlanNodeId 1341][SINGLE] () => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint]
                                                                - RemoteStreamingExchange[PlanNodeId 1202][GATHER - COLUMNAR] => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint]
                                                                    - Project[PlanNodeId 1549][projectLocality = LOCAL] => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, input_3_row_number:bigint, input_3_partition_size:bigint]
                                                                        - Window[PlanNodeId 93][partition by (nationkey_8)][$hashvalue_40] => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, $hashvalue_40:bigint, input_3_row_number:bigint, input_3_partition_size:bigint]
                                                                                input_3_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                                input_3_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                                            - LocalExchange[PlanNodeId 1340][HASH][$hashvalue_40] (nationkey_8) => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, $hashvalue_40:bigint]
                                                                                    Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 1,245,920.00, memory: 0.00, network: 314,855.00}
                                                                                - RemoteStreamingExchange[PlanNodeId 1201][REPARTITION - COLUMNAR][$hashvalue_41] => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, $hashvalue_41:bigint]
                                                                                        Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 931,065.00, memory: 0.00, network: 314,855.00}
                                                                                    - ScanProject[PlanNodeId 5,1548][table = TableHandle {connectorId='tpch', connectorHandle='customer:sf0.01', layout='Optional[customer:sf0.01]'}, projectLocality = LOCAL] => [custkey:bigint, name_7:varchar(25), address:varchar(40), nationkey_8:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_9:varchar(117), row_number_10:bigint, $hashvalue_42:bigint]
                                                                                            Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 301,355.00, memory: 0.00, network: 0.00}/{source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 616,210.00, memory: 0.00, network: 0.00}
                                                                                            $hashvalue_42 := combine_hash(BIGINT'0', COALESCE($operator$hash_code(nationkey_8), BIGINT'0')) (1:203)
                                                                                            custkey := tpch:custkey (1:203)
                                                                                            name_7 := tpch:name (1:203)
                                                                                            nationkey_8 := tpch:nationkey (1:203)
                                                                                            comment_9 := tpch:comment (1:203)
                                                                                            phone := tpch:phone (1:203)
                                                                                            acctbal := tpch:acctbal (1:203)
                                                                                            mktsegment := tpch:mktsegment (1:203)
                                                                                            row_number_10 := tpch:row_number (1:203)
                                                                                            address := tpch:address (1:203)
                                                    - Window[PlanNodeId 94][] => [custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint, input_4_row_number:bigint, input_4_partition_size:bigint]
                                                            input_4_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                            input_4_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                                                        - LocalExchange[PlanNodeId 1342][SINGLE] () => [custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint]
                                                                Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 301,355.00, memory: 0.00, network: 301,355.00}
                                                            - RemoteStreamingExchange[PlanNodeId 1203][GATHER - COLUMNAR] => [custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint]
                                                                    Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 301,355.00, memory: 0.00, network: 301,355.00}
                                                                - TableScan[PlanNodeId 8][TableHandle {connectorId='tpch', connectorHandle='customer:sf0.01', layout='Optional[customer:sf0.01]'}] => [custkey_12:bigint, name_13:varchar(25), address_14:varchar(40), nationkey_15:bigint, phone_16:varchar(15), acctbal_17:double, mktsegment_18:varchar(10), comment_19:varchar(117), row_number_20:bigint]
                                                                        Estimates: {source: CostBasedSourceInfo, rows: 1,500 (26.37kB), cpu: 301,355.00, memory: 0.00, network: 0.00}
                                                                        acctbal_17 := tpch:acctbal (1:263)
                                                                        address_14 := tpch:address (1:263)
                                                                        phone_16 := tpch:phone (1:263)
                                                                        mktsegment_18 := tpch:mktsegment (1:263)
                                                                        comment_19 := tpch:comment (1:263)
                                                                        row_number_20 := tpch:row_number (1:263)
                                                                        custkey_12 := tpch:custkey (1:263)
                                                                        name_13 := tpch:name (1:263)
                                                                        nationkey_15 := tpch:nationkey (1:263)
]

@mohsaka
Owner Author

mohsaka commented Apr 17, 2025

Trino Explain

  "Trino version: testversion
Fragment 0 [HASH]
    Output layout: [regionkey_3, nationkey_7]
    Output partitioning: SINGLE []
    Output[columnNames = [regionkey, nationkey]]
    │   Layout: [regionkey_3:bigint, nationkey_7:bigint]
    │   Estimates: {rows: ? (?), cpu: 0, memory: 0B, network: 0B}
    │   regionkey := regionkey_3
    │   nationkey := nationkey_7
    └─ Aggregate[type = FINAL, keys = [regionkey_3, nationkey_7]]
       │   Layout: [regionkey_3:bigint, nationkey_7:bigint]
       │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: 0B}
       └─ LocalExchange[partitioning = HASH, arguments = [regionkey_3::bigint, nationkey_7::bigint]]
          │   Layout: [regionkey_3:bigint, nationkey_7:bigint]
          │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
          └─ RemoteSource[sourceFragmentIds = [1]]
                 Layout: [regionkey_3:bigint, nationkey_7:bigint]

Fragment 1 [SINGLE]
    Output layout: [regionkey_3, nationkey_7]
    Output partitioning: HASH [regionkey_3, nationkey_7]
    Aggregate[type = PARTIAL, keys = [regionkey_3, nationkey_7]]
    │   Layout: [regionkey_3:bigint, nationkey_7:bigint]
    └─ Project[]
       │   Layout: [regionkey_3:bigint, nationkey_7:bigint]
       │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
       └─ TableFunctionProcessor[name = test_inputs_function, properOutputs = [boolean_result], partitionBy = [regionkey_3, nationkey_7], orderBy = [combined_row_number_20 ASC NULLS LAST]]
          │   Layout: [boolean_result:boolean, regionkey_3:bigint, nationkey_7:bigint]
          └─ LocalExchange[partitioning = HASH, arguments = [regionkey_3::bigint, nationkey_7::bigint]]
             │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117), combined_row_number_20:bigint, marker:bigint, marker_22:bigint, marker_23:bigint, marker_24:bigint]
             │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
             └─ Project[]
                │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117), combined_row_number_20:bigint, marker:bigint, marker_22:bigint, marker_23:bigint, marker_24:bigint]
                │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
                │   marker := (CASE WHEN (input_1_row_number = combined_row_number_20) THEN input_1_row_number ELSE null::bigint END)
                │   marker_22 := (CASE WHEN (input_2_row_number = combined_row_number_20) THEN input_2_row_number ELSE null::bigint END)
                │   marker_23 := (CASE WHEN (input_3_row_number = combined_row_number_20) THEN input_3_row_number ELSE null::bigint END)
                │   marker_24 := (CASE WHEN (input_4_row_number = combined_row_number_20) THEN input_4_row_number ELSE null::bigint END)
                └─ Project[]
                   │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117), input_4_row_number:bigint, combined_row_number_20:bigint]
                   │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
                   │   combined_row_number_20 := (CASE WHEN (COALESCE(combined_row_number_18, bigint '-1') > COALESCE(input_4_row_number, bigint '-1')) THEN combined_row_number_18 ELSE input_4_row_number END)
                   └─ LeftJoin[filter = ((combined_row_number_18 = input_4_row_number) OR ((combined_row_number_18 > input_4_partition_size) AND (input_4_row_number = bigint '1')) OR ((input_4_row_number > combined_partition_size_19) AND (combined_row_number_18 = bigint '1'))), distribution = REPLICATED]
                      │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, combined_row_number_18:bigint, custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117), input_4_row_number:bigint]
                      │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: 0B}
                      │   Distribution: REPLICATED
                      ├─ Project[]
                      │  │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, combined_row_number_18:bigint, combined_partition_size_19:bigint]
                      │  │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
                      │  │   combined_row_number_18 := (CASE WHEN (COALESCE(combined_row_number, bigint '-1') > COALESCE(input_3_row_number, bigint '-1')) THEN combined_row_number ELSE input_3_row_number END)
                      │  │   combined_partition_size_19 := (CASE WHEN (COALESCE(combined_partition_size, bigint '-1') > COALESCE(input_3_partition_size, bigint '-1')) THEN combined_partition_size ELSE input_3_partition_size END)
                      │  └─ LeftJoin[filter = ((combined_row_number = input_3_row_number) OR ((combined_row_number > input_3_partition_size) AND (input_3_row_number = bigint '1')) OR ((input_3_row_number > combined_partition_size) AND (combined_row_number = bigint '1'))), distribution = REPLICATED]
                      │     │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, combined_row_number:bigint, combined_partition_size:bigint, custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, input_3_partition_size:bigint]
                      │     │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: 0B}
                      │     │   Distribution: REPLICATED
                      │     ├─ Project[]
                      │     │  │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, combined_row_number:bigint, combined_partition_size:bigint]
                      │     │  │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
                      │     │  │   combined_row_number := (CASE WHEN (COALESCE(input_1_row_number, bigint '-1') > COALESCE(input_2_row_number, bigint '-1')) THEN input_1_row_number ELSE input_2_row_number END)
                      │     │  │   combined_partition_size := (CASE WHEN (COALESCE(input_1_partition_size, bigint '-1') > COALESCE(input_2_partition_size, bigint '-1')) THEN input_1_partition_size ELSE input_2_partition_size END)
                      │     │  └─ LeftJoin[filter = ((input_1_row_number = input_2_row_number) OR ((input_1_row_number > input_2_partition_size) AND (input_2_row_number = bigint '1')) OR ((input_2_row_number > input_1_partition_size) AND (input_1_row_number = bigint '1'))), distribution = REPLICATED]
                      │     │     │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, input_1_partition_size:bigint, nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, input_2_partition_size:bigint]
                      │     │     │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: 0B}
                      │     │     │   Distribution: REPLICATED
                      │     │     ├─ LocalExchange[partitioning = ROUND_ROBIN]
                      │     │     │  │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, input_1_partition_size:bigint]
                      │     │     │  │   Estimates: {rows: ? (?), cpu: ?, memory: 0B, network: 0B}
                      │     │     │  └─ Window[]
                      │     │     │     │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152), input_1_row_number:bigint, input_1_partition_size:bigint]
                      │     │     │     │   input_1_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                      │     │     │     │   input_1_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                      │     │     │     └─ LocalExchange[partitioning = SINGLE]
                      │     │     │        │   Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
                      │     │     │        │   Estimates: {rows: 25 (2.67kB), cpu: 0, memory: 0B, network: 0B}
                      │     │     │        └─ RemoteSource[sourceFragmentIds = [2]]
                      │     │     │               Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
                      │     │     └─ LocalExchange[partitioning = SINGLE]
                      │     │        │   Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, input_2_partition_size:bigint]
                      │     │        │   Estimates: {rows: ? (?), cpu: 0, memory: 0B, network: 0B}
                      │     │        └─ RemoteSource[sourceFragmentIds = [3]]
                      │     │               Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, input_2_partition_size:bigint]
                      │     └─ LocalExchange[partitioning = SINGLE]
                      │        │   Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, input_3_partition_size:bigint]
                      │        │   Estimates: {rows: ? (?), cpu: 0, memory: 0B, network: 0B}
                      │        └─ RemoteSource[sourceFragmentIds = [5]]
                      │               Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, input_3_partition_size:bigint]
                      └─ Window[]
                         │   Layout: [custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117), input_4_row_number:bigint, input_4_partition_size:bigint]
                         │   input_4_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                         │   input_4_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
                         └─ LocalExchange[partitioning = SINGLE]
                            │   Layout: [custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117)]
                            │   Estimates: {rows: 1500 (268.00kB), cpu: 0, memory: 0B, network: 0B}
                            └─ RemoteSource[sourceFragmentIds = [7]]
                                   Layout: [custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117)]

Fragment 2 [SOURCE]
    Output layout: [nationkey, name, regionkey, comment]
    Output partitioning: SINGLE []
    TableScan[table = tpch:tiny:nation]
        Layout: [nationkey:bigint, name:varchar(25), regionkey:bigint, comment:varchar(152)]
        Estimates: {rows: 25 (2.67kB), cpu: 2.67k, memory: 0B, network: 0B}
        nationkey := tpch:nationkey
        name := tpch:name
        comment := tpch:comment
        regionkey := tpch:regionkey

Fragment 3 [HASH]
    Output layout: [nationkey_1, name_2, regionkey_3, comment_4, input_2_row_number, input_2_partition_size]
    Output partitioning: SINGLE []
    Window[partitionBy = [regionkey_3], orderBy = [name_2 ASC NULLS LAST]]
    │   Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152), input_2_row_number:bigint, input_2_partition_size:bigint]
    │   input_2_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
    │   input_2_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
    └─ LocalExchange[partitioning = HASH, arguments = [regionkey_3::bigint]]
       │   Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152)]
       │   Estimates: {rows: 25 (2.67kB), cpu: 2.67k, memory: 0B, network: 0B}
       └─ RemoteSource[sourceFragmentIds = [4]]
              Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152)]

Fragment 4 [SOURCE]
    Output layout: [nationkey_1, name_2, regionkey_3, comment_4]
    Output partitioning: HASH [regionkey_3]
    TableScan[table = tpch:tiny:nation]
        Layout: [nationkey_1:bigint, name_2:varchar(25), regionkey_3:bigint, comment_4:varchar(152)]
        Estimates: {rows: 25 (2.67kB), cpu: 2.67k, memory: 0B, network: 0B}
        nationkey_1 := tpch:nationkey
        regionkey_3 := tpch:regionkey
        name_2 := tpch:name
        comment_4 := tpch:comment

Fragment 5 [HASH]
    Output layout: [custkey, name_6, address, nationkey_7, phone, acctbal, mktsegment, comment_8, input_3_row_number, input_3_partition_size]
    Output partitioning: SINGLE []
    Window[partitionBy = [nationkey_7]]
    │   Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117), input_3_row_number:bigint, input_3_partition_size:bigint]
    │   input_3_row_number := row_number() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
    │   input_3_partition_size := count() ROWS UNBOUNDED_PRECEDING UNBOUNDED_FOLLOWING
    └─ LocalExchange[partitioning = HASH, arguments = [nationkey_7::bigint]]
       │   Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117)]
       │   Estimates: {rows: 1500 (268.00kB), cpu: 268.00k, memory: 0B, network: 0B}
       └─ RemoteSource[sourceFragmentIds = [6]]
              Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117)]

Fragment 6 [SOURCE]
    Output layout: [custkey, name_6, address, nationkey_7, phone, acctbal, mktsegment, comment_8]
    Output partitioning: HASH [nationkey_7]
    TableScan[table = tpch:tiny:customer]
        Layout: [custkey:bigint, name_6:varchar(25), address:varchar(40), nationkey_7:bigint, phone:varchar(15), acctbal:double, mktsegment:varchar(10), comment_8:varchar(117)]
        Estimates: {rows: 1500 (268.00kB), cpu: 268.00k, memory: 0B, network: 0B}
        address := tpch:address
        name_6 := tpch:name
        phone := tpch:phone
        nationkey_7 := tpch:nationkey
        comment_8 := tpch:comment
        mktsegment := tpch:mktsegment
        custkey := tpch:custkey
        acctbal := tpch:acctbal

Fragment 7 [SOURCE]
    Output layout: [custkey_10, name_11, address_12, nationkey_13, phone_14, acctbal_15, mktsegment_16, comment_17]
    Output partitioning: SINGLE []
    TableScan[table = tpch:tiny:customer]
        Layout: [custkey_10:bigint, name_11:varchar(25), address_12:varchar(40), nationkey_13:bigint, phone_14:varchar(15), acctbal_15:double, mktsegment_16:varchar(10), comment_17:varchar(117)]
        Estimates: {rows: 1500 (268.00kB), cpu: 268.00k, memory: 0B, network: 0B}
        nationkey_13 := tpch:nationkey
        comment_17 := tpch:comment
        phone_14 := tpch:phone
        acctbal_15 := tpch:acctbal
        mktsegment_16 := tpch:mktsegment
        custkey_10 := tpch:custkey
        address_12 := tpch:address
        name_11 := tpch:name

"
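
For anyone reading the plan above, here is a rough sketch of what the `LeftJoin` filter and the `combined_row_number` / `marker` projections appear to compute for a single pair of inputs. This is my reading of the Trino plan text, not code from either engine. The function names (`join_filter`, `combined`, `marker`, `pair_partitions`) are made up for illustration, and the sketch ignores the `partitionBy` columns and the case where one side is empty.

```python
from typing import List, Optional, Tuple


def join_filter(left_row: int, left_size: int, right_row: int, right_size: int) -> bool:
    # Mirrors the filter text in the plan:
    #   (l = r) OR (l > r_size AND r = 1) OR (r > l_size AND l = 1)
    return (
        left_row == right_row
        or (left_row > right_size and right_row == 1)
        or (right_row > left_size and left_row == 1)
    )


def combined(a: Optional[int], b: Optional[int]) -> Optional[int]:
    # Mirrors: CASE WHEN COALESCE(a, -1) > COALESCE(b, -1) THEN a ELSE b END
    a_key = -1 if a is None else a
    b_key = -1 if b is None else b
    return a if a_key > b_key else b


def marker(input_row: Optional[int], combined_row: Optional[int]) -> Optional[int]:
    # Mirrors: CASE WHEN input_row = combined_row THEN input_row ELSE NULL END
    # A comparison against NULL is not true in SQL, so it falls to the ELSE branch.
    if input_row is None or combined_row is None:
        return None
    return input_row if input_row == combined_row else None


def pair_partitions(left_size: int, right_size: int) -> List[Tuple[int, Optional[int], Optional[int]]]:
    # Hypothetical helper: enumerate row numbers 1..size on each side, keep the
    # pairs the filter accepts, and emit (combined_row_number, left_marker, right_marker).
    out = []
    for l in range(1, left_size + 1):
        for r in range(1, right_size + 1):
            if join_filter(l, left_size, r, right_size):
                c = combined(l, r)
                out.append((c, marker(l, c), marker(r, c)))
    return sorted(out, key=lambda t: t[0])


if __name__ == "__main__":
    for row in pair_partitions(3, 1):
        print(row)
```

With a 3-row left side and a 1-row right side, the extra left rows pair with right row 1, and the right marker is null for those filler rows. That seems to be how the processor can tell which inputs actually have a row at each combined position.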

@mohsaka mohsaka force-pushed the tvf_15575_new branch 2 times, most recently from 3d0016e to 49985fa Compare April 18, 2025 02:03
