Running time is barely affected by data size or GPU

Thank you for the great tool!

While testing it, I noticed that `python ECOLE_call.py` takes a very long time, so I ran some experiments and collected the following timings:

| Dataset | Sample | CPU Duration | GPU Duration | Speedup |
|---------|--------|-------------|-------------|---------|
| test | HG001_part | 56m 35s | 47m 46s | 1.18x |
| test | HG006_part | 59m 25s | 47m 14s | 1.26x |
| GIAB | HG001 | 53m 05s | 51m 16s | 1.04x |
| GIAB | HG002 | 1h 01m 10s | 50m 59s | 1.20x |
| GIAB | HG003 | 55m 55s | 48m 26s | 1.15x |
| GIAB | HG004 | 51m 32s | 48m 43s | 1.06x |
| GIAB | HG005 | 59m 00s | 50m 59s | 1.16x |
| GIAB | HG006 | 1h 00m 00s | 48m 39s | 1.23x |
| GIAB | HG007 | 57m 57s | 49m 15s | 1.18x |

`HG001_part` and `HG006_part` are subsets of the original data containing about 2% of the reads.

I have two surprising observations:
1. Overall, the GPU was not much faster than the CPU (speedups of only 1.04x to 1.26x).
2. The subset data took almost as long as the full data (on CPU, `HG001_part` even took longer than `HG001`).
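
To make both observations easy to check, here is a minimal sketch that recomputes them from the timings in the table. Speedup is taken here as CPU duration divided by GPU duration, which matches the values I reported. The helper names (`to_seconds`, `speedups`, `subset_to_full_ratio`) are my own for illustration and are not part of ECOLE.

```python
import re

# Durations copied from the table in this issue: sample -> (CPU, GPU).
RUNS = {
    "HG001_part": ("56m 35s", "47m 46s"),
    "HG006_part": ("59m 25s", "47m 14s"),
    "HG001": ("53m 05s", "51m 16s"),
    "HG002": ("1h 01m 10s", "50m 59s"),
    "HG003": ("55m 55s", "48m 26s"),
    "HG004": ("51m 32s", "48m 43s"),
    "HG005": ("59m 00s", "50m 59s"),
    "HG006": ("1h 00m 00s", "48m 39s"),
    "HG007": ("57m 57s", "49m 15s"),
}

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '1h 01m 10s' to a number of seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def speedups(runs=RUNS):
    """Return sample -> CPU time / GPU time, rounded to two decimals."""
    return {
        sample: round(to_seconds(cpu) / to_seconds(gpu), 2)
        for sample, (cpu, gpu) in runs.items()
    }


def subset_to_full_ratio(sample, device, runs=RUNS):
    """Return runtime of '<sample>_part' divided by runtime of '<sample>'.

    device is 'cpu' or 'gpu'. A value near 1.0 means the ~2% subset
    took about as long as the full data.
    """
    column = {"cpu": 0, "gpu": 1}[device]
    part = to_seconds(runs[sample + "_part"][column])
    full = to_seconds(runs[sample][column])
    return part / full


if __name__ == "__main__":
    for sample, value in speedups().items():
        print(f"{sample}: {value:.2f}x")
    for sample in ("HG001", "HG006"):
        for device in ("cpu", "gpu"):
            ratio = subset_to_full_ratio(sample, device)
            print(f"{sample}_part / {sample} on {device}: {ratio:.2f}")
```

If runtime scaled with the number of reads, the subset-to-full ratios would be far below 1. With these timings they all stay close to 1, and for `HG001` on CPU the ratio is slightly above 1.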

Is this expected? If so, why is that happening?

