I just ran the inference pipeline on a larger system for the first time, and noticed that it is completely CPU-bound (i.e. the GPU sits idle for the vast majority of the inference).
Inference on our larger GPU systems is 6x slower than on a typical desktop. There's clearly a problem with the dask-distributed configuration.
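To pin down where the time is going, one option is dask-distributed's `performance_report` context manager, which dumps task timings, worker occupancy, and transfer costs to an HTML file. A minimal sketch (assuming `dask[distributed]` and `bokeh` are installed; the random-array workload is a stand-in for the actual inference call, and in practice you'd connect `Client` to the real scheduler):

```python
from dask.distributed import Client, performance_report
import dask.array as da

# In practice: Client("tcp://<scheduler-address>"); an in-process client
# here keeps the sketch self-contained.
client = Client(processes=False)

with performance_report(filename="dask-report.html"):
    # Stand-in workload; substitute the real inference call here.
    x = da.random.random((4_000, 4_000), chunks=(1_000, 1_000))
    x.mean().compute()

client.close()
```

Opening `dask-report.html` afterwards shows a task-stream timeline; large white gaps or long serialization bars there would support the CPU-bound diagnosis and point at which part of the configuration (e.g. worker/thread counts, chunking) is responsible.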