Memory usage needs to be optimized #14

Description

@liubovpashkova

I recently ran CELLEX on a huge dataset with ~1.3 million cells. To my surprise, I encountered the following error:

MemoryError: Unable to allocate array with shape (1331984, 26182) and data type float64

Thus, the server does not have enough memory to complete the task if the expression matrix is stored as float64 (the default). CELLEX consumes > 50% of RAM (more than 1 TB) and then the analysis inevitably stops.
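For reference, here is a quick back-of-the-envelope calculation of what a single dense array of the shape in the error message costs. This is only a sketch: the helper name `dense_bytes` is made up for illustration, and it counts one array only, not any intermediate copies CELLEX may create along the way.

```python
from math import prod

# Bytes per element for the two dtypes discussed in this issue.
ITEMSIZE = {"float64": 8, "float32": 4}

def dense_bytes(shape, dtype):
    """Size in bytes of one dense array with the given shape and dtype."""
    return prod(shape) * ITEMSIZE[dtype]

shape = (1331984, 26182)  # shape taken from the MemoryError above
for dtype in ITEMSIZE:
    size = dense_bytes(shape, dtype)
    print(f"{dtype}: {size / 2**30:.1f} GiB")
# float64: 259.8 GiB
# float32: 129.9 GiB
```

So every float64 array of this shape needs roughly 260 GiB, and a handful of such arrays alive at the same time is enough to exceed 1 TB.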

To developers: is it really necessary to store the expression matrix as float64? Is such high precision relevant here? Are you sure that float32 is not sufficient?

To users: I was able to solve the problem by converting my gene expression matrix (the variable data in the tutorial) from the default data type float64 to float32 before creating the ESObject, as follows:

data_float32 = data.astype(np.float32)

Don’t forget to delete the variables once you no longer need them (we need to save Yggdrasil’s RAM):

del data
del data_float32
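A small self-contained sketch of the same workaround, in case it helps others. It uses a toy NumPy array as a stand-in for the expression matrix (an assumption on my side; a pandas DataFrame has an `astype` method that is called the same way), and the helper name `downcast_to_float32` is hypothetical. Note that `astype` creates a copy, so the float64 and float32 versions coexist until the last reference to the original is dropped.

```python
import numpy as np

def downcast_to_float32(matrix):
    """Return a float32 copy of the matrix (the input itself is not modified)."""
    return matrix.astype(np.float32)

# Toy stand-in for the expression matrix (1000 cells x 50 genes).
rng = np.random.default_rng(0)
data = rng.random((1000, 50))          # float64 by default
bytes_before = data.nbytes

# Rebinding the same name drops the only reference to the float64 array,
# so its memory can be released without a separate `del`.
data = downcast_to_float32(data)
bytes_after = data.nbytes

print(data.dtype, bytes_before, bytes_after)
```

The float32 copy takes half the memory of the original. Whether float32 precision is sufficient for the downstream CELLEX calculations is exactly the open question for the developers above.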
