Problem
epf.py uses groupby epoch_id and time operations, for instance in QC and center_eeg.
The groupby operations are too slow for experiment-sized datasets and need to be replaced, probably with numpy operations.
Solution
TBD. Centering operates only on floats, so it needs only the underlying numpy arrays.
Maybe vectorize; something like this pseudocode for center_eeg:
- look up rows in each epoch in the centering interval
idxs = np.where((epochs.time >= start) & (epochs.time < stop))[0]
- slice out the np array of (n_epochs * n_center_times, n_eeg_streams) for the centering interval
center_data = epochs[eeg_streams].to_numpy()[idxs]
- unstack/reshape the center_data 2D (n_epochs * n_center_times, n_eeg_streams) to 3D (n_epochs, n_center_times, n_eeg_streams)
- compute epoch mean across times (axis 1) = a 2D array of interval means (n_epochs, n_eeg_streams)
- np.repeat/tile/broadcast the interval means for each epoch by the number of times per epoch, back to the original dimensions (n_epochs * n_times, n_eeg_streams)
This gives a new 2D array (n_epochs * n_times, n_eeg_streams) where each epoch has the value of the mean in the centering interval for that epoch at that eeg_stream
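# np.repeat with axis=0 (not np.tile) keeps each epoch's means next to that epoch's rows, assuming rows are ordered by epoch, then time
center_mns = np.repeat(center_data.reshape(n_epochs, n_center_times, n_eeg_streams).mean(axis=1), n_times, axis=0)
assert center_mns.shape == epochs[eeg_streams].shape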
Centering the epochs by the mean of the centering interval is then a one-line subtraction:
epochs[eeg_streams] = epochs[eeg_streams] - center_mns
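The steps above can be put together as a minimal numpy-only sketch. It is not the epf.py implementation: the function name `center_on_interval` and its arguments are hypothetical, and it assumes rows are ordered by epoch then time and that every epoch has the same time stamps, so the reshape is valid.

```python
import numpy as np


def center_on_interval(data, times, n_epochs, start, stop):
    """Subtract each epoch's mean over start <= time < stop from that epoch.

    Hypothetical helper, not the epf.py code. Assumes rows are sorted by
    epoch then time and every epoch has the same time stamps.

    data  : float array, shape (n_epochs * n_times, n_eeg_streams)
    times : array, shape (n_epochs * n_times,), time stamp of each row
    """
    n_rows, n_eeg_streams = data.shape
    n_times = n_rows // n_epochs
    assert n_times * n_epochs == n_rows, "epochs must all have the same length"

    # look up the rows in the centering interval, all epochs at once
    idxs = np.where((times >= start) & (times < stop))[0]
    n_center_times = len(idxs) // n_epochs
    assert n_center_times > 0, "no samples in the centering interval"
    assert n_center_times * n_epochs == len(idxs)

    # 2D (n_epochs * n_center_times, n_eeg_streams)
    # -> 3D (n_epochs, n_center_times, n_eeg_streams)
    center_data = data[idxs].reshape(n_epochs, n_center_times, n_eeg_streams)

    # mean across times (axis 1) -> (n_epochs, n_eeg_streams)
    interval_means = center_data.mean(axis=1)

    # repeat each epoch's row of means n_times times, back to the data shape
    center_mns = np.repeat(interval_means, n_times, axis=0)
    assert center_mns.shape == data.shape

    return data - center_mns


# toy data: 3 epochs, 5 time points per epoch, 2 streams
n_epochs, n_eeg_streams = 3, 2
epoch_times = np.array([-2, -1, 0, 1, 2])
times = np.tile(epoch_times, n_epochs)
rng = np.random.default_rng(0)
data = rng.normal(size=(n_epochs * len(epoch_times), n_eeg_streams))

centered = center_on_interval(data, times, n_epochs, start=-2, stop=0)
```

After centering, the mean of each epoch over the centering interval should be zero in every stream, which is a cheap sanity check to run on real data.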
Run %%timeit to see if this helps; if not, find something that does.
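One way to run that comparison outside a notebook is sketched below, using the standard-library `timeit` module in place of `%%timeit`. The toy table, the column names `ch0`..`ch2`, and both functions are assumptions for illustration; `center_groupby` is a stand-in per-epoch baseline, not the actual epf.py code. The sketch first checks that both approaches give the same result, then times them.

```python
import timeit

import numpy as np
import pandas as pd

# hypothetical toy epochs table, rows ordered by epoch_id then time
n_epochs, n_times = 200, 50
eeg_streams = ["ch0", "ch1", "ch2"]
rng = np.random.default_rng(0)
epochs = pd.DataFrame(
    rng.normal(size=(n_epochs * n_times, len(eeg_streams))), columns=eeg_streams
)
epochs.insert(0, "time", np.tile(np.arange(-10, n_times - 10), n_epochs))
epochs.insert(0, "epoch_id", np.repeat(np.arange(n_epochs), n_times))
start, stop = -10, 0


def center_groupby(epochs):
    """Stand-in baseline: center one epoch at a time via groupby."""
    parts = []
    for _, epoch in epochs.groupby("epoch_id", sort=False):
        in_interval = (epoch["time"] >= start) & (epoch["time"] < stop)
        parts.append(epoch[eeg_streams] - epoch.loc[in_interval, eeg_streams].mean())
    return pd.concat(parts).to_numpy()


def center_vectorized(epochs):
    """Vectorized version: reshape, mean over the interval, repeat, subtract."""
    data = epochs[eeg_streams].to_numpy()
    times = epochs["time"].to_numpy()
    idxs = np.where((times >= start) & (times < stop))[0]
    center_data = data[idxs].reshape(n_epochs, -1, data.shape[1])
    center_mns = np.repeat(center_data.mean(axis=1), n_times, axis=0)
    return data - center_mns


# both approaches must agree before the timings mean anything
assert np.allclose(center_groupby(epochs), center_vectorized(epochs))

t_groupby = timeit.timeit(lambda: center_groupby(epochs), number=3)
t_vectorized = timeit.timeit(lambda: center_vectorized(epochs), number=3)
print(f"groupby: {t_groupby:.4f} s, vectorized: {t_vectorized:.4f} s")
```

The timings depend on machine and data size, so they should be rerun on an experiment-sized dataset before drawing conclusions.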