I have recently been running CELLEX on a huge dataset with ~1.3 million cells. To my surprise, I encountered the following error:
MemoryError: Unable to allocate array with shape (1331984, 26182) and data type float64
So the server does not have enough memory to complete the task when the expression matrix is stored as float64 (the default). CELLEX consumes > 50% of RAM (more than 1 TB) and then the analysis inevitably stops.
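For scale, here is a rough estimate of what a single dense array of that shape costs in each data type. This is only back-of-the-envelope arithmetic; the helper name dense_bytes is mine and not part of CELLEX.

```python
import numpy as np


def dense_bytes(shape, dtype):
    """Bytes needed for one dense array of the given shape and dtype."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * np.dtype(dtype).itemsize


# Shape taken from the MemoryError message above
shape = (1331984, 26182)

for dtype in (np.float64, np.float32):
    size = dense_bytes(shape, dtype)
    print(f"{np.dtype(dtype).name}: {size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")
```

A single float64 copy is about 279 GB, so hitting more than 1 TB presumably means several arrays of this size are alive at the same time. With float32 each of them is half as large.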
To developers: is it really necessary to store the expression matrix as float64? Is such high precision relevant? Are you sure that float32 is not sufficient?
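To make the precision question concrete, this is what NumPy itself reports for the two types. It only shows the general limits of float32; whether those limits matter for the CELLEX computations is exactly what I am asking, so take it as background and not as an answer.

```python
import numpy as np

# Machine epsilon and approximate number of reliable decimal digits per type
for dtype in (np.float32, np.float64):
    info = np.finfo(dtype)
    print(f"{info.dtype}: eps={float(info.eps):.3g}, ~{info.precision} decimal digits")

# Integer-valued counts are stored exactly in float32 only up to 2**24
largest_exact = 2**24
exact_roundtrip = int(np.float32(largest_exact)) == largest_exact
next_roundtrip = int(np.float32(largest_exact + 1)) == largest_exact + 1
print(f"2**24 survives float32: {exact_roundtrip}, 2**24 + 1 survives: {next_roundtrip}")
```

So float32 keeps roughly 6 to 7 significant decimal digits, compared with about 15 for float64.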
To users: I was able to solve the problem by converting my gene expression matrix (the variable data in the tutorial) from the default data type float64 to float32 before creating the ESObject, as follows:
data_float32 = data.astype(np.float32)
Don’t forget to delete the variables afterwards (we need to save Yggdrasil’s RAM):
del data
del data_float32
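If you want to check the effect before committing a big run, here is a small self-contained sketch of the same conversion on a made-up matrix. The toy DataFrame, its size, its orientation and its row/column labels are my assumptions for illustration; only the astype call is what I actually used.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the tutorial's `data`; values and labels are made up
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.poisson(2.0, size=(1000, 50)).astype(np.float64),
    index=[f"gene_{i}" for i in range(1000)],
    columns=[f"cell_{j}" for j in range(50)],
)

# The conversion from the post
data_float32 = data.astype(np.float32)

# Memory used by the underlying values before and after
bytes_64 = data.to_numpy().nbytes
bytes_32 = data_float32.to_numpy().nbytes
print(f"float64: {bytes_64} bytes, float32: {bytes_32} bytes")

# Small integer counts are exactly representable in float32
max_abs_diff = np.abs(data.to_numpy() - data_float32.to_numpy()).max()
print(f"max abs difference: {max_abs_diff}")
```

Note that right after astype both copies exist at once, so on a large matrix it may help to run del data as soon as the float32 version has been created.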