Support Athena managed query result storage #665

Open
laughingman7743 wants to merge 2 commits into master from fix/managed-query-result-storage

laughingman7743 commented Feb 18, 2026

Summary

Fixes #664

When a workgroup has managed query result storage enabled, GetQueryExecution does not return ResultConfiguration.OutputLocation. This caused file-based cursors (Pandas, Arrow, Polars, S3FS) to return empty results even though data was available via the GetQueryResults API.

Changes

API fallback for managed storage (result_set.py, pandas/result_set.py, arrow/result_set.py, polars/result_set.py, s3fs/result_set.py)

  • When output_location is None but the query succeeded, fall back to the GetQueryResults API to fetch the data
  • S3 file reading remains the primary path when output_location is available
  • Shared helpers: _fetch_all_rows(), _parse_result_rows(), _rows_to_columnar()
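
A minimal sketch of the fallback path described above, not the PR's actual implementation. It assumes a boto3-style Athena client exposing `get_query_results`; the function names mirror the helpers listed here but are written as hypothetical module-level functions, and values are left as raw strings rather than run through a type converter.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Tuple[Optional[str], ...]


def parse_result_rows(response: Dict[str, Any]) -> List[Row]:
    """Turn one GetQueryResults page into tuples of raw string values."""
    rows = response.get("ResultSet", {}).get("Rows", [])
    # A NULL cell has no VarCharValue key, so .get() yields None for it.
    return [
        tuple(cell.get("VarCharValue") for cell in row.get("Data", []))
        for row in rows
    ]


def fetch_all_rows(client: Any, query_id: str) -> Tuple[List[str], List[Row]]:
    """Page through GetQueryResults and return (column names, data rows)."""
    columns: List[str] = []
    rows: List[Row] = []
    next_token: Optional[str] = None
    first_page = True
    while True:
        kwargs: Dict[str, Any] = {"QueryExecutionId": query_id}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_query_results(**kwargs)
        page = parse_result_rows(response)
        if first_page:
            info = response["ResultSet"]["ResultSetMetadata"]["ColumnInfo"]
            columns = [column["Name"] for column in info]
            # Assumption: for SELECT queries the first row of the first page
            # repeats the column labels, so drop it when it matches.
            if page and list(page[0]) == columns:
                page = page[1:]
            first_page = False
        rows.extend(page)
        next_token = response.get("NextToken")
        if not next_token:
            break
    return columns, rows


def rows_to_columnar(columns: List[str], rows: List[Row]) -> Dict[str, list]:
    """Pivot row tuples into a {column name: list of values} mapping."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}
```

A columnar mapping of this shape can be handed to a DataFrame or table constructor, which is presumably why the row-to-columnar step is shared across the Pandas, Arrow, and Polars result sets.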

s3_staging_dir="" support (connection.py)

  • Pass an empty string to explicitly disable the AWS_ATHENA_S3_STAGING_DIR env var fallback
  • Required for managed workgroups where ResultConfiguration conflicts with ManagedQueryResultsConfiguration
  • Updated docstrings with usage notes
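
The distinction between "not specified" and "explicitly disabled" can be sketched as follows. `resolve_s3_staging_dir` is a hypothetical helper written for illustration; how connection.py actually structures this logic is not shown in this description.

```python
import os
from typing import Optional

ENV_S3_STAGING_DIR = "AWS_ATHENA_S3_STAGING_DIR"


def resolve_s3_staging_dir(s3_staging_dir: Optional[str] = None) -> Optional[str]:
    """Return the effective staging dir, or None when none should be sent."""
    if s3_staging_dir is None:
        # Argument omitted: fall back to the environment variable, if set.
        return os.getenv(ENV_S3_STAGING_DIR)
    # Argument given: never consult the environment. An empty string means
    # "explicitly disabled" and is normalized to None (an assumption here).
    return s3_staging_dir or None
```

Under this reading, a caller targeting a managed workgroup would pass `s3_staging_dir=""` together with the workgroup name when connecting, so that a staging directory exported in the environment does not leak into the request.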

assert → ProgrammingError (11 occurrences across 8 files)

  • Replaced all assert statements used for input validation with proper ProgrammingError exceptions
  • assert can be stripped by python -O, making validation unreliable
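
As an illustration of the pattern, here is a sketch of one such replacement. The `ProgrammingError` class below is a local stand-in for the library's exception, and `require_query_id` and its message are hypothetical, not taken from the diff.

```python
from typing import Optional


class ProgrammingError(Exception):
    """Local stand-in for the library's ProgrammingError (assumption)."""


def require_query_id(query_id: Optional[str]) -> str:
    # Before: `assert query_id, "..."`, which is removed entirely when
    # Python runs with -O, letting invalid input pass silently.
    # After: an explicit check that runs regardless of optimization flags.
    if not query_id:
        raise ProgrammingError("QueryExecutionId is none or empty.")
    return query_id
```

Callers can then catch a specific exception type instead of `AssertionError`, which also keeps validation failures distinct from genuine internal invariant violations.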

Omit empty ResultConfiguration (common.py)

  • Don't send empty ResultConfiguration dict in StartQueryExecution when no S3 staging dir or encryption is configured
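
A sketch of how the request might be assembled so that `ResultConfiguration` is only present when it has content. `build_start_query_request` is a hypothetical helper; the key names follow the Athena StartQueryExecution request shape, but the structure of common.py itself is an assumption.

```python
from typing import Any, Dict, Optional


def build_start_query_request(
    query: str,
    work_group: Optional[str] = None,
    s3_staging_dir: Optional[str] = None,
    encryption_option: Optional[str] = None,
    kms_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build StartQueryExecution kwargs, omitting empty ResultConfiguration."""
    request: Dict[str, Any] = {"QueryString": query}
    result_configuration: Dict[str, Any] = {}
    if s3_staging_dir:
        result_configuration["OutputLocation"] = s3_staging_dir
    if encryption_option:
        encryption: Dict[str, Any] = {"EncryptionOption": encryption_option}
        if kms_key:
            encryption["KmsKey"] = kms_key
        result_configuration["EncryptionConfiguration"] = encryption
    # Only attach the key when there is something to send; an empty dict
    # is left out of the request altogether.
    if result_configuration:
        request["ResultConfiguration"] = result_configuration
    if work_group:
        request["WorkGroup"] = work_group
    return request
```

With neither a staging dir nor encryption configured, the resulting dict contains only `QueryString` and, if given, `WorkGroup`.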

Tests (5 test files + tests/__init__.py)

  • test_fetch_all_rows added to all cursor types, parameterized by workgroup (default vs managed) via indirect fixture parametrization
  • AWS_ATHENA_MANAGED_WORKGROUP env var controls managed workgroup name; tests skip when not set

CI & Docs

  • GitHub Actions: added AWS_ATHENA_MANAGED_WORKGROUP env var
  • CloudFormation: added ManagedWorkGroup resource
  • docs/testing.md: documented managed workgroup setup

Test plan

  • All 10 test_fetch_all_rows tests pass (5 cursor types × 2 workgroup modes)
  • make fmt / make chk (ruff + mypy) pass
  • CI with managed workgroup (pyathena-managed)

🤖 Generated with Claude Code

laughingman7743 and others added 2 commits February 18, 2026 23:35
When a workgroup has managed query result storage enabled,
GetQueryExecution does not return ResultConfiguration.OutputLocation.
File-based cursors (Pandas, Arrow, Polars, S3FS) checked output_location
and returned empty results even though data was available via API.

Fall back to GetQueryResults API when output_location is None but the
query succeeded. S3 file reading remains the primary path when
output_location is available.

- Add _fetch_all_rows() to AthenaResultSet base class for paginated
  API fetching with DefaultTypeConverter
- Add _parse_result_rows() to share response parsing between
  _process_rows (normal path) and _fetch_all_rows (fallback path)
- Add _rows_to_columnar() for shared row-to-columnar conversion
- Extract __get_query_results() from __fetch() for reuse
- Add _as_*_from_api() fallback methods to each result set subclass
- Remove empty ResultConfiguration from StartQueryExecution request
  when no S3 staging dir or encryption is configured
- Emit warning log when falling back to API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…with ProgrammingError

- Add test_fetch_all_rows to all cursor test files (Cursor, Pandas, Arrow, Polars, S3FS)
  parameterized by workgroup (default vs managed) using fixture indirect
- Support s3_staging_dir="" to explicitly disable env var fallback, required
  for managed workgroups where ResultConfiguration conflicts with
  ManagedQueryResultsConfiguration
- Replace all assert statements with ProgrammingError exceptions across
  connection.py, cursor files, result_set files (11 occurrences)
- Add AWS_ATHENA_MANAGED_WORKGROUP env var to tests/Env, GitHub Actions, docs
- Add ManagedWorkGroup resource to CloudFormation template

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Fetching results does not work with Athena managed query result storage
