Model Selection

The classification code appears to evaluate on the test set after every epoch and print the test accuracy.
Which of these accuracies do you report?
How is model selection done?
 
I understand each experiment is repeated 5 times. But for each run, which value goes into the mean: the accuracy at the last epoch, or the maximum accuracy over all epochs?
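
To make the question concrete, here is a minimal sketch of the two aggregations I have in mind. The accuracy values are made up for illustration, and the function names are hypothetical, not taken from the repository.

```python
from statistics import mean

# Hypothetical per-epoch test accuracies for 5 repeated runs (made-up numbers).
runs = [
    [0.70, 0.78, 0.76],
    [0.68, 0.75, 0.77],
    [0.72, 0.80, 0.79],
    [0.69, 0.74, 0.74],
    [0.71, 0.79, 0.75],
]

def mean_last_epoch(runs):
    """Option A: average the accuracy at the final epoch of each run."""
    return mean(run[-1] for run in runs)

def mean_best_epoch(runs):
    """Option B: average the maximum accuracy reached in each run."""
    return mean(max(run) for run in runs)

print("mean of last-epoch accuracy:", round(mean_last_epoch(runs), 4))
print("mean of best-epoch accuracy:", round(mean_best_epoch(runs), 4))
```

Option B can never be lower than Option A, since each run's maximum is at least its last-epoch value, so the two choices can lead to different reported numbers.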