Bug
isp2025.hydrogen_demand_export() returns rows where isp_subregion contains multi-sentence footnote strings from the source Excel workbook (e.g. "This load is estimated based on ACIL Allen 2025..."), and rows where region = 'Region' (repeated header rows).
Impact
Downstream consumers must apply their own filter (e.g. WHERE region IN ('NSW','QLD','SA','TAS','VIC')) to get clean data. Without this, aggregation and dimension rendering are corrupted by footnote text appearing as dimension values.
Expected behaviour
hydrogen_demand_export() should return only valid data rows. Footnote rows and repeated header rows should be stripped during parsing (before the DataFrame is returned), consistent with how other read_timeseries-based functions behave.
Suggested fix
In _parse_timeseries_block (or specifically in hydrogen_demand_export), add a post-parse filter:
data = data.filter(pl.col("region").is_in(["NSW", "QLD", "SA", "TAS", "VIC"]))
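Or, more generically, drop rows where the first id column contains a string longer than a reasonable maximum for a region/subregion name. Note that a length check alone would not catch the repeated header rows (region = 'Region'), so those would still need an explicit exclusion.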
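The generic variant could look roughly like the sketch below. It is library-agnostic (plain dict rows rather than a polars DataFrame) so that it stands alone; the names `is_data_row`, `strip_non_data_rows`, `MAX_ID_LENGTH` and `HEADER_LABELS` are hypothetical and not part of isp2025, and the 40-character cut-off is an assumption that would need checking against real subregion names.

```python
# Hypothetical cut-off: longest id value still treated as a real name.
MAX_ID_LENGTH = 40
# Header text that reappears as a data value, as reported in this issue.
HEADER_LABELS = frozenset({"Region"})


def is_data_row(row, id_columns=("region", "isp_subregion")):
    """Return True if every id column holds a short, non-header string."""
    for column in id_columns:
        value = row.get(column)
        # Missing or non-string ids cannot be valid dimension values.
        if not isinstance(value, str):
            return False
        value = value.strip()
        if not value:
            return False
        # Repeated header rows carry the column title as their value.
        if value in HEADER_LABELS:
            return False
        # Footnotes are multi-sentence strings, far longer than any name.
        if len(value) > MAX_ID_LENGTH:
            return False
    return True


def strip_non_data_rows(rows, id_columns=("region", "isp_subregion")):
    """Keep only rows that pass is_data_row, preserving their order."""
    return [row for row in rows if is_data_row(row, id_columns)]
```

The same predicate should translate to a single polars filter expression inside the parser; the explicit allow-list on line 9 is stricter and probably the safer choice wherever the set of regions is fixed.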