Hey! I was excited to see this project today on LinkedIn. I'm also a fan of the EarthMover platform and their ERA5 dataset. Great work, awesome repo!
This project is very similar to a demo that I've been fixing to make for some time but never got around to: alxmrs/xarray-sql#45. I've started reading some of your source code, and I like your approach.
My repo, https://github.com/alxmrs/xarray-sql, is still in an early phase of development, but I believe it fills a gap in the current standard set of PyData tools for working with weather data: namely, that it is difficult to join tabular and raster data at scale. To illustrate the kind of operation I am talking about, check out this demo notebook (the join is near the bottom).
(TL;DR: This joins monthly aggregates of ERA5 data with yearly energy production data in the Texas energy grid by treating the gridded raster data as a table).
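To make the "raster as a table" idea concrete without depending on xarray-sql itself, here is a minimal sketch that uses only Python's built-in `sqlite3` module. Everything in it is an assumption for illustration: the tiny grid, the `era5` and `production` table names, the column names, and all of the numbers are made up, and this is not how xarray-sql works internally. It only shows the shape of the operation: flatten a (time, lat, lon) grid into rows, aggregate it by year, and join it against a small tabular dataset.

```python
import sqlite3

# Hypothetical toy grid indexed as temperature[time][lat][lon].
# All values are made up for illustration.
times = ["2020-01", "2020-02", "2021-01", "2021-02"]
lats = [30.0, 31.0]
lons = [-100.0, -99.0]
temperature = [
    [[10.0, 11.0], [12.0, 13.0]],
    [[14.0, 15.0], [16.0, 17.0]],
    [[20.0, 21.0], [22.0, 23.0]],
    [[24.0, 25.0], [26.0, 27.0]],
]


def grid_to_rows(times, lats, lons, values):
    """Flatten a (time, lat, lon) grid into one row per grid cell."""
    for t, month in enumerate(times):
        for i, lat in enumerate(lats):
            for j, lon in enumerate(lons):
                yield (month, lat, lon, values[t][i][j])


conn = sqlite3.connect(":memory:")

# The gridded data becomes an ordinary table: one row per (time, lat, lon).
conn.execute(
    "CREATE TABLE era5 (month TEXT, lat REAL, lon REAL, temperature REAL)"
)
conn.executemany(
    "INSERT INTO era5 VALUES (?, ?, ?, ?)",
    grid_to_rows(times, lats, lons, temperature),
)

# Hypothetical yearly tabular data to join against.
conn.execute("CREATE TABLE production (year TEXT, energy_gwh REAL)")
conn.executemany(
    "INSERT INTO production VALUES (?, ?)",
    [("2020", 100.0), ("2021", 120.0)],
)

# Aggregate the raster rows per year and join them with the tabular data.
query = """
SELECT p.year, AVG(e.temperature) AS mean_temperature, p.energy_gwh
FROM era5 AS e
JOIN production AS p ON substr(e.month, 1, 4) = p.year
GROUP BY p.year, p.energy_gwh
ORDER BY p.year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With a real dataset the flattening step is the expensive part, which is why doing it lazily and chunk by chunk matters at ERA5 scale.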
I would love to see the Python REPL in the tools portion of the repo be equipped with an XQL context, e.g.:
import xarray_sql as xql
ctx = xql.XarrayContext()
ctx.from_dataset("era5", ds)  # the dataset needs to be chunked somehow
From there, your AI agent would be able to run queries and aggregations over ERA5 (or any Xarray dataset) via SQL.
Fair warning: my project is still in its early days and has sharp edges. But I believe that SQL is an interface that LLMs are well equipped to leverage to wield data well.