Pandas dataframe used where numpy array is expected in ArrayUtils.band_df

When ArrayUtils.band_df is called without a mask, there are cases where the input passed to it is a pandas DataFrame rather than the expected numpy array. The result is an Ellipsis error, because pandas does not support the numpy-style index used inside the function:

`[ ... , bn ]`

In the Lyzenga method guide you have:
`x_train, x_test, y_train, y_test = train_test_split(df[imrds.band_names], df.depth, train_size=20000, random_state=5)`

which returns pandas objects for x_train, x_test, y_train and y_test (DataFrames for the x values, Series for the y values).

Further along, these are passed to ArrayUtils.band_df while still pandas objects:
`traindf = ArrayUtils.band_df(x_train)`

even though the function expects:
`imarr : np.array or np.ma.MaskedArray`

This probably went unnoticed because ArrayUtils.band_df was always tested with a mask. In that case ArrayUtils.equalize_band_masks would have run first; it doesn’t have the indexing problem and returns:
`tuple of N np.ma.MaskedArray`

The error you get is:
`KeyError: (Ellipsis, 0)`
because you cannot subset a pandas DataFrame with `[ ... , ]`.
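
For anyone who wants to reproduce the difference outside the library, here is a minimal sketch. The array contents and the column names `b1` and `b2` are made up for illustration and are not taken from the guide.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for x_train: 4 pixels, 2 bands
arr = np.arange(8, dtype=float).reshape(4, 2)
df = pd.DataFrame(arr, columns=["b1", "b2"])

# numpy: Ellipsis covers all leading axes, 0 selects the first band
first_band = arr[..., 0]

# pandas: the tuple (Ellipsis, 0) is looked up as a column label,
# which does not exist, so a KeyError is raised
try:
    df[..., 0]
    raised = False
except KeyError:
    raised = True

print(first_band, raised)
```

The same index expression works on the array and fails on the DataFrame, which matches the traceback above.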

My quick fix was to do:
`x_train = x_train.as_matrix()`
`x_test = x_test.as_matrix()`
`y_train = y_train.as_matrix()`
`y_test = y_test.as_matrix()`

which seems to work, but I don't know if it will cause issues later on.
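
Note that `as_matrix()` was deprecated and then removed in pandas 1.0, so on a newer pandas the same workaround would need `to_numpy()` (or `.values`). Below is a rough sketch of a conversion step that could run before calling ArrayUtils.band_df. The helper name `to_band_array` is hypothetical and not part of the library, and this is only one possible way to guard the input.

```python
import numpy as np
import pandas as pd

def to_band_array(data):
    """Hypothetical helper: convert pandas input to a numpy array.

    Arrays (including np.ma.MaskedArray) are passed through unchanged.
    """
    if isinstance(data, (pd.DataFrame, pd.Series)):
        return data.to_numpy()
    # asanyarray keeps ndarray subclasses such as MaskedArray intact
    return np.asanyarray(data)

# Made-up data standing in for x_train
df = pd.DataFrame({"b1": [1.0, 2.0], "b2": [3.0, 4.0]})
x_arr = to_band_array(df)
band0 = x_arr[..., 0]  # numpy-style indexing now works

masked = np.ma.masked_array([1.0, 2.0], mask=[False, True])
still_masked = to_band_array(masked)
```

Passing arrays through untouched means the masked code path should behave as before; only pandas input is converted.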
