Handle integer-backed Arrow decimals via logical type metadata #653

Open
rahul-iyer wants to merge 2 commits into LadybugDB:main from rahul-iyer:issue-652-arrow-logical-type-info
Conversation

@rahul-iyer

Contributor

Summary

ADBC exposes Arrow schemas and arrays, but some Snowflake types are not fully recoverable from the Arrow
physical type alone. In particular, Snowflake NUMBER/DECIMAL columns may arrive with their semantics
encoded in field metadata rather than as a standard Arrow decimal type.

The Snowflake ADBC driver documents two relevant metadata channels:

  • query-result field metadata such as logicalType=FIXED, precision, and scale
  • table-schema metadata such as DATA_TYPE=NUMBER(p,s)

Without interpreting those annotations, Ladybug can bind Snowflake decimals as plain integer or
floating-point storage types, dropping the declared precision and scale.

What changed

  • Added Snowflake-specific decoding for decimal types from:

    • logicalType=FIXED metadata on Arrow fields
    • DATA_TYPE=NUMBER(...), NUMERIC(...), and DECIMAL(...) metadata on Arrow fields
  • Added decimal scanning support for Snowflake FIXED values when the Arrow batch is:

    • integer-backed
    • float32-backed
    • float64-backed
  • Refactored Arrow metadata handling into separate decoder components:

    • shared metadata utilities
    • Snowflake decoder
    • generic metadata decoder
    • a single orchestrator entrypoint
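
To make the DATA_TYPE decoding rule concrete, here is a minimal Python sketch of parsing NUMBER(...), NUMERIC(...), and DECIMAL(...) strings. It is an illustration only, not the C++ code in this PR: the function name `parse_snowflake_data_type`, the regular expression, and the bounds check (precision 1 to 38, scale no larger than precision) are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a NUMBER / NUMERIC / DECIMAL keyword followed by
# (precision) or (precision, scale). Case-insensitive, whitespace-tolerant.
_DECIMAL_RE = re.compile(
    r"^\s*(?:NUMBER|NUMERIC|DECIMAL)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$",
    re.IGNORECASE,
)


def parse_snowflake_data_type(raw: str) -> Optional[Tuple[int, int]]:
    """Return (precision, scale) for a decimal DATA_TYPE string, else None."""
    match = _DECIMAL_RE.match(raw)
    if match is None:
        return None  # not a decimal type, or malformed
    precision = int(match.group(1))
    # A missing scale defaults to 0, e.g. NUMBER(18) -> DECIMAL(18,0).
    scale = int(match.group(2)) if match.group(2) is not None else 0
    # Assumed bounds for this sketch; the real limits may differ.
    if not 1 <= precision <= 38 or scale > precision:
        return None
    return precision, scale
```

Returning `None` for anything unparseable lets a caller fall through to the next metadata source instead of failing the bind.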

Design

The refactor separates three concerns:

  • Arrow physical type parsing
  • logical type recovery from metadata
  • vendor-specific metadata decoding

The rest of the Arrow pipeline still consumes normalized logical type information only. This keeps
Snowflake-specific behavior out of the core binding and scan code paths except where the recovered
logical type is applied.

This structure is intended to make future support for other sources, such as Databricks/Spark-specific
metadata, straightforward to add as separate decoders rather than as scattered conditionals.
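
As a rough illustration of the decoder layering, the following Python sketch chains three decoders behind one entrypoint. It is not the PR's C++ implementation: every name here (`recover_decimal_type`, `DECODERS`, the individual decoder functions) is hypothetical, metadata is modelled as a plain string dictionary, and the Arrow physical type check that the real FIXED path depends on is left out for brevity.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

DecimalType = Tuple[int, int]  # (precision, scale)
Metadata = Dict[str, str]
Decoder = Callable[[Metadata], Optional[DecimalType]]

_RAW_DECIMAL = re.compile(
    r"^\s*(?:NUMBER|NUMERIC|DECIMAL)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$",
    re.IGNORECASE,
)


def _precision_scale(meta: Metadata) -> Optional[DecimalType]:
    """Shared utility: read the precision/scale keys, or None if unusable."""
    try:
        return int(meta["precision"]), int(meta.get("scale", "0"))
    except (KeyError, ValueError):
        return None


def decode_snowflake_data_type(meta: Metadata) -> Optional[DecimalType]:
    match = _RAW_DECIMAL.match(meta.get("DATA_TYPE", ""))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def decode_snowflake_logical_type(meta: Metadata) -> Optional[DecimalType]:
    if meta.get("logicalType", "").upper() != "FIXED":
        return None
    return _precision_scale(meta)


def decode_generic(meta: Metadata) -> Optional[DecimalType]:
    if meta.get("logicalType", "").upper() != "DECIMAL":
        return None
    return _precision_scale(meta)


# Order encodes precedence: vendor-specific decoders run before the generic one.
DECODERS: List[Decoder] = [
    decode_snowflake_data_type,
    decode_snowflake_logical_type,
    decode_generic,
]


def recover_decimal_type(meta: Metadata) -> Optional[DecimalType]:
    """Single entrypoint: the first decoder that recognises the metadata wins."""
    for decoder in DECODERS:
        result = decoder(meta)
        if result is not None:
            return result
    return None
```

In this shape, supporting another source means appending one more decoder to the list; the entrypoint and its callers stay unchanged.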

Snowflake ADBC driver documentation: https://arrow.apache.org/adbc/current/driver/snowflake.html

| Snowflake signal | Example metadata | Arrow physical storage | Ladybug result | Notes |
| --- | --- | --- | --- | --- |
| Raw Snowflake decimal type via DATA_TYPE | DATA_TYPE=NUMBER(12,4) | Any Arrow storage type | DECIMAL(12,4) | Parsed from Snowflake table-schema metadata. |
| Raw Snowflake decimal type via DATA_TYPE with implicit scale | DATA_TYPE=NUMBER(18) | Any Arrow storage type | DECIMAL(18,0) | Missing scale defaults to 0. |
| Raw Snowflake decimal aliases via DATA_TYPE | DATA_TYPE=NUMERIC(10,3) or DATA_TYPE=DECIMAL(10,3) | Any Arrow storage type | DECIMAL(10,3) | Matching is case-insensitive and whitespace-tolerant. |
| Snowflake logical decimal metadata | logicalType=FIXED, precision=7, scale=2 | Integer-backed Arrow (INT8/16/32/64, UINT8/16/32/64) | DECIMAL(7,2) | Used for query-result Arrow schemas. |
| Snowflake logical decimal metadata | logicalType=FIXED, precision=9, scale=2 | Float-backed Arrow (FLOAT, DOUBLE) | DECIMAL(9,2) | Values are cast into decimal backing storage during scan. |
| Snowflake raw type fallback to logical metadata | malformed DATA_TYPE plus valid logicalType=FIXED metadata | Integer-backed or float-backed Arrow | DECIMAL(p,s) from logicalType metadata | If raw DATA_TYPE parsing fails, Snowflake logicalType parsing is tried next. |
| Snowflake metadata precedence over generic metadata | DATA_TYPE=NUMBER(12,4) plus generic logicalType=DECIMAL, precision=9, scale=3 | Any Arrow storage type | DECIMAL(12,4) | Snowflake raw type metadata wins over generic metadata. |
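
For the float-backed row, the cast into decimal backing storage can be pictured with the Python sketch below. It rests on assumptions that this description does not confirm: that the backing storage holds an unscaled integer (value times 10^scale), that ties round half-up, and that values exceeding the declared precision are rejected. The function name `float_to_unscaled` is hypothetical, and integer-backed inputs are not covered here.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def float_to_unscaled(value: float, precision: int, scale: int) -> int:
    """Convert a float to the unscaled integer of a DECIMAL(precision, scale).

    Assumptions of this sketch: half-up rounding, and an error when the
    result needs more than `precision` digits.
    """
    if not math.isfinite(value):
        raise ValueError("NaN and infinity have no decimal representation")
    # Decimal(value) captures the float's exact binary value before scaling.
    scaled = Decimal(value) * (Decimal(10) ** scale)
    unscaled = int(scaled.to_integral_value(rounding=ROUND_HALF_UP))
    if abs(unscaled) >= 10 ** precision:
        raise OverflowError(f"{value!r} does not fit DECIMAL({precision},{scale})")
    return unscaled
```

Under these assumptions, 12.34 scanned as DECIMAL(9,2) is stored as the unscaled integer 1234. A float such as 1.005 lies slightly below its decimal spelling in binary, so it rounds down at scale 2; this loss is inherent to float-backed input, not an artifact of the sketch.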

@rahul-iyer rahul-iyer marked this pull request as draft July 4, 2026 10:01
Comment thread src/include/common/arrow/arrow_schema_metadata.h Outdated
@rahul-iyer rahul-iyer marked this pull request as ready for review July 4, 2026 18:12