Description
From experience, we occasionally have to deal with missing data. For example, when converting measurements from our Bruker system, the files do not store the size of the delta sample, only its volume. Our current approach is to assign a nonsensical placeholder value such as 0x0x0 mm³ to the size. Would it not be better to have a standardized and documented way of dealing with missing data?
Where a whole data set is unknown or missing, we could use empty data sets, as proposed here https://docs.h5py.org/en/stable/high/dataset.html and here https://support.hdfgroup.org/HDF5/Tutor/crtdat.html.
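To make the empty data set idea concrete, here is a rough sketch using h5py. The group names, the data set name `size`, and the helpers `write_sample_size` and `is_missing` are made up for illustration and are not part of any existing format definition. The file is kept in memory so nothing is written to disk.

```python
import h5py


def write_sample_size(group, size_mm=None):
    """Store the sample size, or an empty data set if the size is unknown."""
    if size_mm is None:
        # h5py.Empty creates a data set with a dtype but no shape and no data.
        return group.create_dataset("size", data=h5py.Empty("f"))
    return group.create_dataset("size", data=size_mm, dtype="f")


def is_missing(dataset):
    """An empty data set reports its shape as None."""
    return dataset.shape is None


# In-memory file (core driver without backing store), for demonstration only.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    known = write_sample_size(f.create_group("sample_a"), (3.0, 3.0, 0.5))
    unknown = write_sample_size(f.create_group("sample_b"))
    known_missing = is_missing(known)
    unknown_missing = is_missing(unknown)
```

A reader can then check `dataset.shape is None` instead of guessing whether 0x0x0 mm³ is a real measurement.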
Where only single elements of a data set are unknown, we could use NaN, see https://stackoverflow.com/questions/33656043/hdf5-how-to-handle-empty-rows.
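A small sketch of the NaN approach with NumPy, using made-up values and a hypothetical `sizes_mm` array. Note that NaN only exists for floating-point types, so integer data sets would need a different marker.

```python
import numpy as np

# Hypothetical sizes in mm for three samples; NaN marks an unknown element.
sizes_mm = np.array([
    [3.0, 3.0, 0.5],
    [np.nan, np.nan, np.nan],  # size completely unknown
    [5.0, 5.0, np.nan],        # only the thickness is unknown
])

# Boolean mask of the missing elements.
missing = np.isnan(sizes_mm)

# Rows in which every element is known.
complete_rows = ~missing.any(axis=1)

# NaN-aware statistics skip the missing elements.
mean_thickness = np.nanmean(sizes_mm[:, 2])
```

Unlike a 0 placeholder, NaN propagates through ordinary arithmetic, so a forgotten check shows up as NaN in the result rather than as a plausible but wrong number.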
I think sooner or later every group using the format will run into this issue, and a documented convention would help keep the number of undocumented solutions at bay.