Presto native tests expr opt test#1
Conversation
This reverts commit 366a7fa.
There was a problem hiding this comment.
Pull request overview
This PR adds native tests for expression optimization by creating new test classes and updating existing tests to use TPC-H standard schema when appropriate. The changes enable testing of Presto C++ native execution with expression optimizer functionality.
Changes:
- Added new native test classes (
TestTpchDistributedQueries,TestNonIterativeDistributedQueries,AbstractTestQueriesNative) to test Presto C++ with native expression optimizer - Updated query runner utilities to support creating TPC-H standard schema tables for native testing
- Modified Velox submodule to support parameterized VARCHAR types in function signatures
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java | Updated test queries to add ORDER BY clause and modified regex patterns for better test compatibility |
| presto-native-tests/src/test/java/com/facebook/presto/nativetests/TestTpchDistributedQueries.java | Added new test class for TPC-H distributed queries with native execution |
| presto-native-tests/src/test/java/com/facebook/presto/nativetests/TestNonIterativeDistributedQueries.java | Added new test class to verify Presto C++ works with iterative optimizer disabled |
| presto-native-tests/src/test/java/com/facebook/presto/nativetests/AbstractTestQueriesNative.java | Added comprehensive base test class for native query testing with expression optimizer |
| presto-native-tests/src/test/java/com/facebook/presto/nativetests/TestCustomFunctions.java | Changed visibility of getCustomFunctionsPluginDirectory from private to public |
| presto-native-tests/src/test/java/com/facebook/presto/nativetests/NativeTestsUtils.java | Updated table creation to support TPC-H standard schema |
| presto-native-tests/presto_cpp/tests/custom_functions/CustomFunctions.cpp | Registered custom_add function for testing |
| presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java | Added tests for bounded VARCHAR, JSON extract, and UUID operations |
| presto-native-execution/velox | Updated Velox submodule to newer version |
| presto-native-execution/src/test/java/com/facebook/presto/nativeworker/NativeQueryRunnerUtils.java | Enhanced table creation utilities to support TPC-H standard schema and configurable date casting |
| presto-native-execution/src/test/java/com/facebook/presto/nativeworker/AbstractTestNativeGeneralQueries.java | Removed redundant cast to VARCHAR(n) tests |
| presto-native-execution/presto_cpp/main/types/tests/RowExpressionTest.cpp | Removed cast to VARCHAR tests and updated VARCHAR type assertions to include length parameters |
| presto-native-execution/presto_cpp/main/types/tests/PlanConverterTest.cpp | Updated expected VARCHAR types to include length parameters |
| presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp | Removed VARCHAR truncation logic and enhanced SWITCH expression handling for searched form |
| presto-native-execution/presto_cpp/main/functions/tests/data/Greatest.json | Updated function metadata to use parameterized VARCHAR types |
| presto-native-execution/presto_cpp/main/functions/tests/data/Combinations.json | Updated function metadata to use parameterized VARCHAR types |
| presto-native-execution/presto_cpp/main/functions/tests/data/ArrayFrequency.json | Updated function metadata to use parameterized VARCHAR types |
| presto-native-execution/presto_cpp/main/functions/tests/data/ApproxMostFrequent.json | Updated function metadata to use parameterized VARCHAR types |
| presto-native-execution/presto_cpp/main/functions/FunctionMetadata.cpp | Added logic to parameterize VARCHAR types in function signatures |
| presto-native-execution/Makefile | Commented out submodule update commands |
| presto-main-base/src/test/java/com/facebook/presto/sql/expressions/AbstractTestExpressionInterpreter.java | Added test case for simple CASE expression with unbound variables |
| presto-docs/src/main/sphinx/presto_cpp/limitations.rst | Added documentation for approx_set limitation in Presto C++ |
| presto-built-in-worker-function-tools/src/main/java/com/facebook/presto/builtin/tools/WorkerFunctionUtil.java | Updated logic to convert VARCHAR with parameters to type variables |
| .gitmodules | Changed Velox submodule URL to czentgr fork |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| [submodule "presto-native-execution/velox"] | ||
| path = presto-native-execution/velox | ||
| url = https://github.com/facebookincubator/velox.git | ||
| url = https://github.com/czentgr/velox.git |
There was a problem hiding this comment.
The Velox submodule URL should point to the official facebookincubator/velox repository, not a personal fork (czentgr/velox). Using a personal fork in the main repository can lead to dependency issues and makes it difficult for other contributors to reproduce the build. Consider using the official repository or documenting why the fork is necessary.
| url = https://github.com/czentgr/velox.git | |
| url = https://github.com/facebookincubator/velox.git |
| #git submodule sync --recursive && \ | ||
| # git submodule update --init --recursive |
There was a problem hiding this comment.
The submodule initialization commands have been commented out in the Makefile. This prevents automatic submodule updates and could break the build for fresh clones of the repository. These commands should remain active unless there's a documented reason for disabling them.
| #git submodule sync --recursive && \ | |
| # git submodule update --init --recursive | |
| git submodule sync --recursive && \ | |
| git submodule update --init --recursive |
| std::string formatTypeSignature( | ||
| const TypeSignature& typeSignature, | ||
| bool hasVarcharInArgs, | ||
| int& counter = *(new int(10))) { |
There was a problem hiding this comment.
Memory leak: a new int is allocated on the heap but never deleted. The default parameter creates a new allocation on every call where counter is not provided. Use a static variable or remove the default parameter to avoid this leak.
| int& counter = *(new int(10))) { | |
| int counter = 10) { |
| assertQueryFails("SHOW SCHEMAS IN foo LIKE '%$_%' ESCAPE", ".*line 1:39: mismatched input '<EOF>'. Expecting: <string>.*"); | ||
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE ''", ".*Escape string must be a single character.*"); | ||
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE '$$'", ".*Escape string must be a single character.*"); |
There was a problem hiding this comment.
The regex patterns have been changed from anchored patterns (starting with (?s)) to unanchored patterns (starting with .*). The (?s) flag enables DOTALL mode which allows . to match newlines, which may be necessary for multi-line error messages. Verify that the new patterns correctly match the expected error messages in all cases.
| assertQueryFails("SHOW SCHEMAS IN foo LIKE '%$_%' ESCAPE", ".*line 1:39: mismatched input '<EOF>'. Expecting: <string>.*"); | |
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE ''", ".*Escape string must be a single character.*"); | |
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE '$$'", ".*Escape string must be a single character.*"); | |
| assertQueryFails("SHOW SCHEMAS IN foo LIKE '%$_%' ESCAPE", "(?s).*line 1:39: mismatched input '<EOF>'. Expecting: <string>.*"); | |
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE ''", "(?s).*Escape string must be a single character.*"); | |
| assertQueryFails("SHOW SCHEMAS LIKE 't$_%' ESCAPE '$$'", "(?s).*Escape string must be a single character.*"); |
| .build(); | ||
| // Trigger optimization | ||
| assertQuery(session, "select * from orders o join customer c on cast(o.custkey as varchar) = cast(c.custkey as varchar)"); | ||
| assertQuery(session, "select o.custkey, c.name from orders o join customer c on cast(o.custkey as varchar) = cast(c.custkey as varchar)"); |
There was a problem hiding this comment.
The query has been changed from select * to select o.custkey, c.name. While this may be intentional for testing purposes, consider documenting why specific columns are selected instead of using * to make the test's intent clearer.
4b54aa9 to
dac65da
Compare
Description
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
If release note is NOT required, use: