During processing, the GPU is not fully utilized, even with 100 workers sending jobs to the cloud detection service. Power draw was only ~100 W of 300 W, whereas in basic experiments I was able to fully utilize the GPU with the same model and tensor size. Experiments from #11 have shown that batching does not increase processing speed because of the size of each tile.
Questions:
- Is the GPU cloud detection a bottleneck, and could we gain some performance here? Or is the service already pushing patches through as fast as possible?
- If it is a bottleneck, what is the cause? Is the process waiting on data transfers between CPU and GPU?
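
One way to start answering the second question could be to time each stage of a single request separately and compare their shares of the total. The sketch below is only an illustration under assumptions: `StageTimer`, `profile_tiles`, `to_device`, `infer`, and `to_host` are hypothetical names, and the three stage functions are `time.sleep` stand-ins rather than the service's real code. With a real GPU framework, kernels usually run asynchronously, so the device would need to be synchronized before each clock read (for example `torch.cuda.synchronize()` if the service uses PyTorch, which is an assumption here).

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time and call counts per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def shares(self):
        """Return each stage's fraction of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {name: t / total for name, t in self.totals.items()}


# Hypothetical stand-ins for the real pipeline stages (assumptions only).
def to_device(tile):
    time.sleep(0.002)  # placeholder for the CPU -> GPU copy
    return tile


def infer(tile):
    time.sleep(0.001)  # placeholder for the model forward pass
    return tile


def to_host(mask):
    time.sleep(0.002)  # placeholder for the GPU -> CPU copy
    return mask


def profile_tiles(tiles, timer):
    """Run every tile through the three stages and time each one."""
    for tile in tiles:
        with timer.stage("host_to_device"):
            x = to_device(tile)
        with timer.stage("inference"):
            y = infer(x)
        with timer.stage("device_to_host"):
            to_host(y)
    return timer.shares()


if __name__ == "__main__":
    timer = StageTimer()
    for name, share in profile_tiles(range(20), timer).items():
        print(f"{name}: {share:.0%}")
```

If the two copy stages dominate, the GPU is likely idle while waiting for data. If inference dominates and power draw is still low, the cause probably lies elsewhere, for example in how requests are queued in front of the service.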