
Improve mask performance by selecting only data within shape #36

@saschahofmann


Is your feature request related to a problem? Please describe.

I think the performance of the mask function could be greatly improved by only looking at data within the .total_bounds of the provided shape. Personally, I would actually prefer it if the returned data contained only the selected area. E.g. imagine I want to get the country-level data from a global dataset: most of the returned data array will be null. If that behaviour is seen as unexpected for a mask function, then maybe providing a clip function that does this would be awesome.

Describe the solution you'd like

Calling something like this before mask = shapes_to_mask(geodataframe, dataarray, **mask_kwargs) (spatial.py L299)

minx, miny, maxx, maxy = geodataframe.total_bounds
dataarray = dataarray.sel({lon_key: slice(minx, maxx), lat_key: slice(miny, maxy)})

should do the trick.
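One caveat with the snippet above: label-based slicing follows the order of the coordinate index, so slice(miny, maxy) only selects data when the latitude coordinate is ascending. Many gridded datasets store latitude from north to south, in which case the slice has to be reversed. Below is a minimal sketch of how the indexers could be built in an orientation-aware way. The helper name bbox_indexers and its arguments are hypothetical and not part of the library; it assumes 1-D, monotonic coordinates.

```python
def bbox_indexers(bounds, lon_values, lat_values, lon_key="lon", lat_key="lat"):
    """Build label-based slices covering a bounding box.

    bounds is (minx, miny, maxx, maxy), the order returned by
    geodataframe.total_bounds. lon_values and lat_values are the 1-D
    coordinate values of the data array (assumed monotonic).
    NOTE: hypothetical helper for illustration only.
    """
    minx, miny, maxx, maxy = bounds

    def ordered(lo, hi, values):
        # A descending coordinate (common for latitude) needs slice(hi, lo),
        # otherwise the label-based selection comes back empty.
        descending = len(values) > 1 and values[0] > values[-1]
        return slice(hi, lo) if descending else slice(lo, hi)

    return {
        lon_key: ordered(minx, maxx, lon_values),
        lat_key: ordered(miny, maxy, lat_values),
    }
```

The returned dictionary could then be passed straight to the selection, e.g. dataarray.sel(bbox_indexers(geodataframe.total_bounds, dataarray[lon_key].values, dataarray[lat_key].values, lon_key, lat_key)), before the mask is computed.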

Describe alternatives you've considered

All of this can easily be achieved by running the code above before calling the mask function, so it's no deal breaker. But I would consider it an enhancement. As I said in the description, maybe this would fit better in a clip function than in mask.

Additional context

No response

Organisation

Lobelia Earth


Labels

enhancement: New feature or request
