Experimental Protocols

Experiments can be structured in many ways, and a range of metrics is used to compare results.

For now, the results we present follow the simplest protocol: we do all training and validation on the default train split and evaluate once on the test set.
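
The single-split protocol described above can be sketched as follows. This is a minimal illustration, not the code used to produce the results: the names `split_train_validation` and `run_protocol`, the 20% validation fraction, the `fit` and `score` callables, and the choice to refit the selected candidate on the full train split before testing are all assumptions made for the example.

```python
import random


def split_train_validation(n_train, validation_fraction=0.2, seed=0):
    """Partition range(n_train) into (fit_idx, val_idx).

    The 20% default is an assumption for illustration only.
    """
    indices = list(range(n_train))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_train * validation_fraction))
    return sorted(indices[n_val:]), sorted(indices[:n_val])


def run_protocol(train_set, test_set, candidates, fit, score, seed=0):
    """Select a candidate using only the train split, then test once.

    fit(candidate, data) returns a model and score(model, data) returns a
    number where higher is better. Both are hypothetical callables supplied
    by the caller.
    """
    fit_idx, val_idx = split_train_validation(len(train_set), seed=seed)
    fit_part = [train_set[i] for i in fit_idx]
    val_part = [train_set[i] for i in val_idx]

    # Model selection never touches the test set.
    best = max(candidates, key=lambda c: score(fit(c, fit_part), val_part))

    # Assumption: refit the chosen candidate on the whole train split.
    final_model = fit(best, train_set)

    # The test set is scored exactly once, for the selected candidate only.
    return best, score(final_model, test_set)
```

The point of the design is that the test set appears in exactly one call to `score`, so the reported number is not influenced by model selection.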

There are alternatives: we could instead use stratified resampling or cross-validation.
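
As a rough sketch of what such an alternative might look like, the function below builds stratified k-fold cross-validation splits using only the standard library. The name `stratified_k_fold`, the default of five folds, and the round-robin assignment are assumptions for illustration; they are not a protocol used for the results presented here.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, held_out_idx) pairs for k stratified folds.

    Indices of each class are shuffled and dealt round-robin into the
    folds, so every fold keeps roughly the overall class proportions.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Deal each class's shuffled indices across the k folds.
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, i in enumerate(class_indices):
            folds[position % k].append(i)

    # Each fold is held out once; the remaining folds form the training set.
    for f in range(k):
        held_out = sorted(folds[f])
        training = sorted(i for g in range(k) if g != f for i in folds[g])
        yield training, held_out
```

Calling the function again with a different `seed` gives a fresh stratified partition, which is one simple way to obtain repeated resamples.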