I am using PyTorch in my Python code to measure how long it takes to move different layers of ResNet-152 to a device (GPU, V100). However, I cannot get a stable result.
Here is my code:
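import time
import torch
import torchvision
import torch.nn as nn

device = torch.device('cuda:3' if torch.cuda.is_available() else 'cpu')
model = torchvision.models.resnet152(pretrained=True)

def todevice(_model_, _device_=device):
    # Time the host-to-device copy of the model's parameters and buffers
    T0 = time.perf_counter()
    _model_.to(_device_)
    # Synchronize the target device; with no argument only the current device is synchronized
    torch.cuda.synchronize(_device_)
    T1 = time.perf_counter()
    print("model to device %s cost:%s ms" % (_device_, ((T1 - T0) * 1000)))

# First six child modules of ResNet-152 (conv1 through layer2)
model1 = nn.Sequential(*list(model.children())[:6])
todevice(model1)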
When I run this code at different times, I get a different result each time, and some of them are unreasonably large, up to 200 ms.
Also, there are 4 GPUs in my lab, and I don't know whether the other GPUs affect my result.
Could you tell me how to time model.to(device) correctly?
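For reference, here is a rough sketch of a more controlled measurement, not a definitive method. It assumes that part of the variance comes from one-time costs such as CUDA context creation on the first call, so it touches the device once before timing, discards a few warm-up runs, and reports the median over several repeats. The helper names time_repeated and time_model_to_device are made up for this illustration. Because model.to() moves the module in place, each repeat gets a fresh CPU copy made outside the timed region.

```python
import copy
import statistics
import time


def time_repeated(setup, run, sync=lambda: None, warmup=2, repeats=10):
    """Time run(setup()) `repeats` times after `warmup` untimed runs.

    Returns the list of per-run durations in milliseconds.
    """
    samples_ms = []
    for i in range(warmup + repeats):
        arg = setup()  # fresh input for every run, built outside the timed region
        sync()         # drain pending work before starting the clock
        t0 = time.perf_counter()
        run(arg)
        sync()         # wait for any asynchronous work to finish
        t1 = time.perf_counter()
        if i >= warmup:  # discard warm-up runs
            samples_ms.append((t1 - t0) * 1000.0)
    return samples_ms


def time_model_to_device(cpu_model, device_str="cuda:0", warmup=2, repeats=10):
    """Hypothetical wrapper: (median, min, max) in ms for cpu_model.to(device)."""
    import torch  # imported here so time_repeated itself does not depend on torch

    device = torch.device(device_str if torch.cuda.is_available() else "cpu")
    on_gpu = device.type == "cuda"
    if on_gpu:
        # Assumption: the first CUDA call pays for context creation, so do it untimed.
        torch.zeros(1, device=device)
    sync = (lambda: torch.cuda.synchronize(device)) if on_gpu else (lambda: None)
    samples = time_repeated(
        setup=lambda: copy.deepcopy(cpu_model),  # .to() is in place, so copy first
        run=lambda m: m.to(device),
        sync=sync,
        warmup=warmup,
        repeats=repeats,
    )
    return statistics.median(samples), min(samples), max(samples)
```

With the code above this would be called as time_model_to_device(model1, "cuda:3"). To rule out interference from the other GPUs, the process can also be restricted to a single card by setting the CUDA_VISIBLE_DEVICES environment variable before launching Python, in which case that card appears as cuda:0.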