Make it possible to filter out all NaN values #65

@sferics

Is your feature request related to a problem? Please describe.

I tried to use the "filters" argument of the read_bufr function to filter out NaN values.
My filter was a very simple lambda function: filter = lambda x: pandas.notna(x)

When I used it to get rid of missing data for a single parameter, it worked fine. But when I applied it to many parameters, the returned pandas DataFrame shrank and no longer contained the desired data, or it was even empty.

I suspect that this is due to the way the filter conditions are combined. In the documentation, you mention that they are connected with logical AND, so a row is presumably kept only if every filtered parameter is non-missing at the same time: https://pdbufr.readthedocs.io/en/latest/read_bufr.html#combining-conditions
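
A minimal pandas-only sketch of the suspected effect. It does not call pdbufr, and the column names and values are made up for illustration; it only compares what an AND and an OR combination of per-parameter "not missing" conditions would keep.

```python
import numpy as np
import pandas as pd

# Hypothetical decoded observations; names and values are illustrative only.
df = pd.DataFrame(
    {
        "t2m": [280.1, np.nan, 279.5, np.nan],
        "pressure": [np.nan, 1012.0, 1009.5, np.nan],
        "wind_speed": [np.nan, np.nan, 3.2, np.nan],
    }
)

present = df.notna()  # boolean frame: True where a value exists

# AND: a row survives only if every parameter is present.
rows_and = df[present.all(axis=1)]

# OR: a row survives if at least one parameter is present.
rows_or = df[present.any(axis=1)]

print(len(rows_and), len(rows_or))
```

With sparse data, the AND variant keeps only rows that are complete in every parameter, which would explain a shrinking or empty result as more parameters are filtered.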

The problem for me is that without filtering I get a quite big DataFrame with many missing values which I have to get rid of afterwards. I've noticed that a lot of columns actually just contain NaN values.

Describe the solution you'd like

It would be nice to have the option to connect conditions with logical OR instead. Maybe that could already solve my problem.

Describe alternatives you've considered

Another solution I can imagine is having the option to use the equivalent of "df.loc[:, parameter].notna().any()" on each column (parameter) before returning the DataFrame. If this condition returns False for a column, i.e., the column consists only of missing values, the column gets dropped.

Ideally, this would be done before the DataFrame is created internally.
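
A sketch of that per-column check, applied to an already created DataFrame rather than inside pdbufr. The column names are hypothetical. A column is kept when notna().any() is True and dropped otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical result with one parameter that was never reported.
df = pd.DataFrame(
    {
        "t2m": [280.1, np.nan, 279.5],
        "snow_depth": [np.nan, np.nan, np.nan],
        "pressure": [np.nan, 1012.0, 1009.5],
    }
)

# One boolean per column: True if the column has at least one real value.
has_data = df.notna().any()

# Keep only the columns that carry data.
trimmed = df.loc[:, has_data]

print(list(trimmed.columns))
```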

Additional context

My solution for now is to call df.dropna(how="all") along both axes after I've created the DataFrame. But this is not a very efficient way to do it, especially for large amounts of data.
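
For reference, the workaround looks roughly like this. The data is made up, and in practice the DataFrame would come from read_bufr.

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame returned by read_bufr; values are illustrative.
df = pd.DataFrame(
    {
        "t2m": [280.1, np.nan, 279.5],
        "snow_depth": [np.nan, np.nan, np.nan],
        "pressure": [np.nan, np.nan, 1009.5],
    }
)

# Drop rows that are entirely NaN, then columns that are entirely NaN.
cleaned = df.dropna(how="all", axis=0).dropna(how="all", axis=1)

print(cleaned.shape)
```

Both calls run on the fully decoded data, which is why this gets expensive for large files.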

Organisation

Meteo Service weather research

Labels

enhancement (New feature or request)
