Databricks Unity Catalog Metric View converter #152

Open
jackstein21 wants to merge 2 commits into open-semantic-interchange:main from jackstein21:pr/databricks-converter

@jackstein21

Summary

Adds a bidirectional Python converter between OSI YAML and Databricks Unity Catalog Metric View YAML (v1.1). This fills the Databricks spoke in the hub-and-spoke converter architecture: DATABRICKS is listed as a supported vendor in the spec and in converters/index.md, but had no converter implementation until now.

What's included

  • Import (Metric View → OSI): Parses Metric View YAML and produces a valid OSI document
  • Export (OSI → Metric View): Generates deployable Metric View YAML from OSI semantic models
  • CLI: osi-databricks import / osi-databricks export entry points
  • Tests: Unit tests + Hypothesis property-based tests for round-trip fidelity
  • Documentation: README with mapping table, usage examples, and known limitations

Mapping highlights

| Metric View | OSI |
| --- | --- |
| fields[].expr | field.expression.dialects[DATABRICKS] |
| measures[].expr | metric.expression.dialects[DATABRICKS] |
| joins[].on | relationship.from_columns / to_columns |
| synonyms, display_name | ai_context.synonyms |
| comment | description |
| filter, materialization, window, format | custom_extensions (vendor: DATABRICKS) |

Dialect selection follows the converter guide: prefer DATABRICKS, fall back to ANSI_SQL.
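
The dialect rule can be illustrated with a minimal sketch. It assumes an OSI expression carries a list of entries with `dialect` and `expression` keys; the function name `pick_expression` is hypothetical and is not taken from this PR or from osi-python.

```python
from typing import Optional


def pick_expression(
    dialects: list[dict],
    preferred: str = "DATABRICKS",
    fallback: str = "ANSI_SQL",
) -> Optional[str]:
    """Return the preferred dialect's expression, else the fallback, else None.

    Assumption: each entry looks like {"dialect": ..., "expression": ...}.
    """
    by_name = {entry["dialect"]: entry["expression"] for entry in dialects}
    for name in (preferred, fallback):
        if name in by_name:
            return by_name[name]
    return None


both = [
    {"dialect": "ANSI_SQL", "expression": "SUM(ss_net_paid)"},
    {"dialect": "DATABRICKS", "expression": "sum(ss_net_paid)"},
]
ansi_only = [{"dialect": "ANSI_SQL", "expression": "SUM(ss_net_paid)"}]

print(pick_expression(both))       # DATABRICKS entry wins
print(pick_expression(ansi_only))  # falls back to ANSI_SQL
```

Returning `None` when neither dialect is present leaves the caller to decide whether to skip the field or raise; the converter in this PR may handle that case differently.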

Conventions followed

  • Python package at converters/databricks/ with src/osi_databricks/ layout
  • Hatchling build backend, matching existing Python converters (GoodData, dbt)
  • Depends on osi-python>=0.2.0.dev0 and PyYAML>=6.0
  • Tests use TPC-DS-based fixtures as baseline
  • Round-trip fidelity for both directions (MV→OSI→MV and OSI→MV→OSI)
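
As a rough illustration of what MV→OSI→MV fidelity means for a single field, the sketch below uses two stand-in functions, `import_field` and `export_field`. Both names and the plain-dict shapes are assumptions made for this example, not the converter's actual API. `display_name` is left out on purpose, since the mapping table folds it into `ai_context.synonyms`.

```python
def import_field(mv_field: dict) -> dict:
    """Metric View field -> OSI-style field (hypothetical dict shape)."""
    osi = {
        "name": mv_field["name"],
        "expression": {
            "dialects": [
                {"dialect": "DATABRICKS", "expression": mv_field["expr"]}
            ]
        },
    }
    if "comment" in mv_field:
        osi["description"] = mv_field["comment"]
    if mv_field.get("synonyms"):
        osi["ai_context"] = {"synonyms": list(mv_field["synonyms"])}
    return osi


def export_field(osi_field: dict) -> dict:
    """OSI-style field -> Metric View field, preferring DATABRICKS over ANSI_SQL."""
    dialects = {
        d["dialect"]: d["expression"]
        for d in osi_field["expression"]["dialects"]
    }
    mv = {
        "name": osi_field["name"],
        "expr": dialects.get("DATABRICKS", dialects.get("ANSI_SQL")),
    }
    if "description" in osi_field:
        mv["comment"] = osi_field["description"]
    synonyms = osi_field.get("ai_context", {}).get("synonyms")
    if synonyms:
        mv["synonyms"] = list(synonyms)
    return mv


original = {
    "name": "store_name",
    "expr": "s_store_name",
    "comment": "Name of the store",
    "synonyms": ["shop", "outlet"],
}
# The round trip should reproduce the input exactly.
assert export_field(import_field(original)) == original
```

A property-based test would generate many such `original` dicts and assert the same equality for each one.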

Testing

cd converters/databricks
uv sync
uv run pytest

All tests pass locally on Python 3.14.

Notes

  • This is a tooling/converter contribution, not a spec change
  • The .gitignore commit adds standard entries (build artifacts, Hypothesis DB, IDE files) that benefit all contributors
  • Metric View materialization support is included but marked as preview (Databricks docs note it's in preview)

Add entries for Python build artifacts, Hypothesis test database,
IDE files, OS junk, and test coverage output.

Add bidirectional converter between OSI semantic model YAML and
Databricks Unity Catalog Metric View YAML (v1.1).

Import (Metric View → OSI):
- Maps fields to OSI fields with DATABRICKS dialect expressions
- Maps measures to OSI metrics
- Parses JOIN ON/USING clauses into OSI relationships
- Preserves synonyms, display_name, and comments as ai_context
- Stores Databricks-specific metadata (filter, materialization,
  window, format) in custom_extensions
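
To make the JOIN parsing step concrete, here is a minimal sketch of turning an ON or USING clause into paired column lists. It assumes ON clauses are simple equalities joined by AND, with the fact table aliased as `source` and the joined table referenced by the join's name. The helper names `parse_join_on` and `parse_join_using` are hypothetical, and the converter in this PR may accept a wider grammar.

```python
import re

# Matches "alias.column = alias.column" with optional surrounding whitespace.
_EQUALITY = re.compile(r"^\s*(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)\s*$")


def parse_join_on(on_clause: str, join_name: str, source_alias: str = "source"):
    """Split an ON clause into (from_columns, to_columns).

    from_columns belong to the source table, to_columns to the joined table.
    """
    from_columns, to_columns = [], []
    for part in re.split(r"\s+AND\s+", on_clause.strip(), flags=re.IGNORECASE):
        match = _EQUALITY.match(part)
        if match is None:
            raise ValueError(f"unsupported join condition: {part!r}")
        left_alias, left_col, right_alias, right_col = match.groups()
        if (left_alias, right_alias) == (source_alias, join_name):
            from_columns.append(left_col)
            to_columns.append(right_col)
        elif (left_alias, right_alias) == (join_name, source_alias):
            # Condition written with the joined table on the left; swap sides.
            from_columns.append(right_col)
            to_columns.append(left_col)
        else:
            raise ValueError(f"unexpected aliases in: {part!r}")
    return from_columns, to_columns


def parse_join_using(columns):
    """USING names the same columns on both sides of the join."""
    cols = list(columns)
    return cols, list(cols)


print(parse_join_on("source.ss_store_sk = store.s_store_sk", "store"))
```

Raising on anything that is not a plain equality keeps the sketch honest about its limits: non-equi joins have no obvious `from_columns` / `to_columns` representation.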

Export (OSI → Metric View):
- Generates valid Metric View YAML per dataset
- Selects DATABRICKS dialect with ANSI_SQL fallback
- Reconstructs JOIN clauses from OSI relationships
- Restores filter and materialization from custom_extensions
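
The restore step could look roughly like the sketch below. It assumes `custom_extensions` is a list of entries with a `vendor_name` key and a `data` payload that is either a mapping or a JSON string; those key names and the helper `restore_vendor_metadata` are assumptions for illustration, not confirmed by this PR.

```python
import json


def restore_vendor_metadata(
    custom_extensions,
    vendor: str = "DATABRICKS",
    keys=("filter", "materialization"),
) -> dict:
    """Collect selected keys from the matching vendor's extension entries."""
    restored = {}
    for ext in custom_extensions or []:
        if ext.get("vendor_name") != vendor:
            continue  # ignore metadata owned by other vendors
        data = ext.get("data", {})
        if isinstance(data, str):
            data = json.loads(data)  # assumption: string payloads are JSON
        for key in keys:
            if key in data:
                restored[key] = data[key]
    return restored


extensions = [
    {"vendor_name": "SNOWFLAKE", "data": {"filter": "ignored"}},
    {
        "vendor_name": "DATABRICKS",
        "data": json.dumps({"filter": "ss_quantity > 0", "format": "#,##0"}),
    },
]
print(restore_vendor_metadata(extensions))
```

Only the requested keys are copied back, so metadata such as `format` stays in the extension payload unless the caller asks for it.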

Also includes:
- CLI entry point (osi-databricks import/export)
- Comprehensive test suite (unit + Hypothesis property tests)
- TPC-DS-based test fixtures
- README with mapping table and usage examples