Skip to content

Test failures for s3#2

Draft
imsayari404 wants to merge 5 commits into
prestos3filesystemfrom
prestos3filesystem_testcases3
Draft

Test failures for s3#2
imsayari404 wants to merge 5 commits into
prestos3filesystemfrom
prestos3filesystem_testcases3

Conversation

@imsayari404
Copy link
Copy Markdown
Owner

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

…tream handling

Failures resolved :
1.TestHiveFileSystemS3SelectCsvPushdownWithSplits.testQueryPushdownWithSplitSizeForJson[3, 2, 15, 30]

2.TestHiveFileSystemS3SelectCsvPushdownWithSplits.testQueryPushdownWithSplitSizeForJson[50, 30, 2, 4]

3.TestHiveFileSystemS3SelectCsvPushdown.testFilterCommaDelimitedRecords

4.TestHiveFileSystemS3SelectCsvPushdown.testFilterPipeDelimitedRecords
…tadata retrieval

Failures resolved :

1.TestHiveFileSystemS3>AbstractTestHiveFileSystem.testGetFileStatus:328 » SdkClient Unable to marshall request to JSON: Key cannot be empty

2.TestHiveFileSystemS3SelectCsvPushdown>AbstractTestHiveFileSystem.testGetFileStatus:328 » SdkClient Unable to marshall request to JSON: Key cannot be empty.

After this fix getting :

a. TestHiveFileSystemS3>AbstractTestHiveFileSystem.testGetFileStatus:330 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/presto_test_external_fs/test1.csv

b. TestHiveFileSystemS3SelectCsvPushdown>AbstractTestHiveFileSystem.testGetFileStatus:330 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/presto_test_external_fs/test1.csv
Resolved :

1.TestHiveFileSystemS3>AbstractTestHiveFileSystem.testGetRecords:310->AbstractTestHiveFileSystem.readTable:296 » NoClassDefFound net/jpountz/lz4/LZ4Factory
2.TestHiveFileSystemS3SelectCsvPushdown>AbstractTestHiveFileSystem.testGetRecords:310->AbstractTestHiveFileSystem.readTable:296 » NoClassDefFound net/jpountz/lz4/LZ4Factory
Resolved:

1. TestHiveFileSystemS3>AbstractTestHiveFileSystem.testGetFileStatus:330 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/presto_test_external_fs/test1.csv
2.TestHiveFileSystemS3>AbstractTestHiveFileSystem.testRename:345 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/318c731f-89fd-4096-b274-90c4352943b1/foo.txt
3.TestHiveFileSystemS3SelectCsvPushdown>AbstractTestHiveFileSystem.testGetFileStatus:330 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/presto_test_external_fs/test1.csv
4.TestHiveFileSystemS3SelectCsvPushdown>AbstractTestHiveFileSystem.testRename:345 » IO unencrypted-content-length header is not set on an encrypted object: s3://presto-test-sayari/29d59ad6-4d24-41ca-8fd6-17cd5ea235c0/foo.txt
@imsayari404 imsayari404 force-pushed the prestos3filesystem branch 2 times, most recently from a15d3ae to b6a5c69 Compare November 25, 2025 06:40
@imsayari404 imsayari404 force-pushed the prestos3filesystem branch 3 times, most recently from 23af9c5 to 04c2628 Compare January 29, 2026 10:37
imsayari404 pushed a commit that referenced this pull request Mar 25, 2026
)

## Summary:
This reverts the following PR
GitHub Pull Request: prestodb#27199
GitHub Author: Deepak Mehra mehradeeeepak@gmail.com
Meta internal revision:D96106074

Fully reverts the HAVING alias expansion feature, which threw
7 failed queries (2 different errors) on meta internal test suites. I
have verified that these errors are all caused by this change by
attempting to fix them and testing that the revert clears the errors.

## Errors:
1. java.sql.SQLException: Query failed (#20260316_083518_04833_pwriw):
type of variable 'expr_189' is expected 2. to be varchar(25), but the
actual type is varchar
java.sql.SQLException: Query failed (#20260316_083713_05663_pwriw):
Cannot nest aggregations inside aggregation 'sum':
["sum"(CTX_pe_revenue)]

## Root cause:
The PR added SELECT alias resolution in HAVING by rewriting
the HAVING predicate with OrderByExpressionRewriter. This caused two
bugs:
1. Type mismatches (5 failures): (AI reasoning) When a column name
matches a SELECT alias
(e.g., SELECT CASE...END AS category), the rewriter expands the column
reference in HAVING to the full CASE expression, changing its type from
varchar(25) to unbounded varchar. QueryPlanner.filter() uses the
rewritten expression via analysis.getHaving(), creating plan nodes with
mismatched types that TypeValidator rejects.
2. Nested aggregation (1 failure): (confirmed) When a SELECT alias
references an
aggregation and HAVING uses that alias inside another aggregation
(e.g., HAVING sum(total) where total = sum(x)), expansion creates
invalid nested aggregations sum(sum(x))
3. bigint vs double (1 failure): Same mechanism as issue 1 but with
numeric
type coercion differences.

## Fix:
I tried to fix these issues, but only managed to clear the error for #2.
Reverting the change due to these errors.
According to AI:
The HAVING alias resolution feature needs a different implementation
that
doesn't change expression types in the plan — likely by resolving
aliases
during planning rather than during analysis.

## Reproduction:
I can't share the reproducing queries since they reference meta internal
data, but these should give you an idea of how to reproduce the issues

Bug 1: Type mismatch (varchar(25) vs varchar)

```
  -- The SELECT aliases `category` to a CASE expression returning varchar strings.
  -- HAVING references `category` which the rewriter expands to the CASE expression,
  -- changing its type from the subquery's varchar(25) to unbounded varchar.
  -- Triggers: "type of variable 'expr_N' is expected to be varchar(25), but the actual type is varchar"

  WITH data AS (
      SELECT
          CAST('cpu' AS varchar(25)) AS category,
          true AS is_cpu,
          100.0 AS value
      UNION ALL
      SELECT CAST('gpu' AS varchar(25)), false, 200.0
  )
  SELECT
      (CASE
          WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
          WHEN is_cpu THEN 'cpu_total'
          ELSE category
      END) category,
      sum(value) AS total_value
  FROM data
  GROUP BY GROUPING SETS ((), (category), (is_cpu))
  HAVING (CASE
      WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
      WHEN is_cpu THEN 'cpu_total'
      ELSE category
  END) IS NOT NULL
```

  Bug 2: Nested aggregation (SYNTAX_ERROR)

```
  -- SELECT defines `total` as sum(revenue). HAVING uses sum(total).
  -- The rewriter expands `total` to `sum(revenue)`, creating sum(sum(revenue)).
  -- Triggers: "Cannot nest aggregations inside aggregation 'sum': ["sum"(revenue)]"

  SELECT
      region,
      sum(revenue) AS total
  FROM (
      VALUES ('us', 100), ('eu', 200), ('us', 150), ('eu', 50)
  ) AS t(region, revenue)
  GROUP BY region
  HAVING sum(total) > 100
```

Differential Revision: D97181706

## Summary by Sourcery

Revert support for resolving SELECT output aliases in HAVING clauses and
restore previous validation behavior.

Enhancements:
- Simplify OrderByExpressionRewriter by removing clause-specific error
context now that it is only used for ORDER BY.

Tests:
- Update analyzer tests to again treat alias references in HAVING as
invalid and remove related positive and error-coverage cases.


## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Fix 2 bugs caused by Select Alias references in Having clause

Co-authored-by: mehradeeeepak@gmail.com <mehradeeeepak@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant