
NGFF perf testing #687

@will-moore


Compare formats (on disk)

To compare the performance of NGFF data (read via ZarrReader) with other formats, both on disk, we want the NGFF version of the data alongside the same data in its original format on the same server.

Choose some data to work with: idr0003 is not too big, at 2.3G for a plate. Summary (more details below):

  • Use bioformats2raw to convert a plate from idr0003 to NGFF.
  • Zip, copy to idr-testing, unzip and perform a regular import (not in-place)
  • Update the Plate name and place it in the idr0003 Screen
  • With the preview panel enabled, click on 25 Wells of both plates (original and NGFF copy), recording the time taken by render_image to load the initial plane. Plot the average over the 25 Wells. Times are in milliseconds; error bars are 1 std dev.
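
The averaging step above can be sketched roughly as follows. This is only an illustration: the helper name `summarize` is hypothetical, the sample values are placeholders rather than measurements from this test, and it assumes the sample (n-1) standard deviation, which may not match what the spreadsheet used.

```python
import statistics


def summarize(times_ms):
    """Return (mean, sample standard deviation) for a list of timings."""
    return statistics.mean(times_ms), statistics.stdev(times_ms)


# Placeholder values for illustration only; NOT measurements from this test.
example_times_ms = [100.0, 110.0, 90.0, 105.0, 95.0]

mean_ms, std_ms = summarize(example_times_ms)
print(f"mean = {mean_ms:.1f} ms, std dev = {std_ms:.1f} ms")
```

Running the same function on the 25 recorded render_image times for each plate would give the bar height and the error bar for that plate.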

Screenshot 2024-03-05 at 11 14 59

Conclusion: NGFF is no slower than the original format, and may be faster.

Compare disk vs S3

We want to test the performance of loading data from S3 compared with loading the same data from local disk.
Use idr0010 data, since all plates are identical in terms of size etc.:

  • Download the plate.ome.zarr.zip data previously uploaded to BioStudies
  • Unzip and place in the /ngff dir on each idr-testing server
  • For a plate, replace the symlink from ManagedRepository -> mounted S3 directory with a symlink ManagedRepository -> /ngff/plate.ome.zarr
  • Compare the time to load the initial plane for 25 Wells of the plate on disk with 25 Wells from an identical plate using S3 data. Times are in seconds. Std deviation is 0.267 for S3 and 0.096 for Disk (can't seem to plot different error bars on each column in Numbers)!
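
The symlink swap in the steps above could be scripted along these lines. This is a minimal sketch, not the procedure actually used here: the function name `repoint_symlink` is hypothetical, and the location of the link inside ManagedRepository is not given in this issue, so it is left as a parameter.

```python
import os
from pathlib import Path


def repoint_symlink(link: Path, new_target: Path) -> None:
    """Point an existing symlink at new_target by renaming a new link over it."""
    if not link.is_symlink():
        raise ValueError(f"{link} is not a symlink; refusing to replace it")
    # Build the new link under a temporary name next to the old one.
    tmp = link.with_name(link.name + ".tmp")
    if tmp.is_symlink():
        tmp.unlink()
    os.symlink(new_target, tmp)
    # Rename over the old link; rename acts on the link itself, not its target.
    os.replace(tmp, link)


# Hypothetical usage; the ManagedRepository path is an assumption:
# repoint_symlink(Path("<path in ManagedRepository>"), Path("/ngff/plate.ome.zarr"))
```

Building the new link first and renaming it over the old one avoids a window in which the path does not exist, which may matter if the server is reading from it at the time.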

Screenshot 2024-03-05 at 11 43 10

Conclusion: Data access via S3 is slower than on disk.
