Thank you for the great tool!
While testing it, I noticed that `python ECOLE_call.py` takes a very long time, so I ran some experiments and collected the following statistics:
| Dataset | Sample | CPU Duration | GPU Duration | Speedup |
| --- | --- | --- | --- | --- |
| test | HG001_part | 56m 35s | 47m 46s | 1.18x |
| test | HG006_part | 59m 25s | 47m 14s | 1.26x |
| GIAB | HG001 | 53m 05s | 51m 16s | 1.04x |
| GIAB | HG002 | 1h 01m 10s | 50m 59s | 1.20x |
| GIAB | HG003 | 55m 55s | 48m 26s | 1.15x |
| GIAB | HG004 | 51m 32s | 48m 43s | 1.06x |
| GIAB | HG005 | 59m 00s | 50m 59s | 1.16x |
| GIAB | HG006 | 1h 00m 00s | 48m 39s | 1.23x |
| GIAB | HG007 | 57m 57s | 49m 15s | 1.18x |
`HG001_part` and `HG006_part` are subsets of the original data that contain about 2% of the reads.
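For reference, the Speedup column is the CPU duration divided by the GPU duration. Below is a minimal sketch of that calculation, assuming durations are written as in the table (for example `56m 35s` or `1h 01m 10s`). The helper names `parse_duration` and `speedup` are made up for this illustration and are not part of ECOLE.

```python
import re

# Seconds per unit for the duration format used in the table above.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '1h 01m 10s' or '56m 35s' to seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def speedup(cpu: str, gpu: str) -> float:
    """Return CPU time divided by GPU time (hypothetical helper)."""
    return parse_duration(cpu) / parse_duration(gpu)


if __name__ == "__main__":
    # HG001_part row: 3395 s on CPU vs 2866 s on GPU.
    print(f"{speedup('56m 35s', '47m 46s'):.2f}x")
```

Running it on the `HG001_part` row prints a value that matches the 1.18x in the table.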
I have two surprising observations:
- Overall, the GPU gave little speedup over the CPU (1.04x to 1.26x).
- The subset data took almost the same time as the full data (`HG001_part` even took longer than `HG001` on CPU).
Is this expected? If so, why is that happening?