The PhysioNet 2017 Challenge uses F1 as its evaluation criterion, but the code reports accuracy. Is that OK?
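To make the difference concrete, here is a minimal sketch of the two metrics side by side. It assumes, as I understand the challenge scoring, that the final score is the mean of the per-class F1 for Normal, AF, and Other, with the Noisy class left out of the average. The label encoding, the sample labels, and the helper name `per_class_f1` are hypothetical and not taken from the repository.

```python
def per_class_f1(y_true, y_pred, n_classes):
    """Per-class F1 from integer labels: 2*TP / (actual count + predicted count)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        actual = sum(1 for t in y_true if t == c)
        predicted = sum(1 for p in y_pred if p == c)
        denom = actual + predicted
        # A class that never appears in either list gets 0.0 by convention here.
        scores.append(2.0 * tp / denom if denom else 0.0)
    return scores


# Hypothetical encoding: 0 = Normal, 1 = AF, 2 = Other, 3 = Noisy
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 3, 0]
y_pred = [0, 0, 0, 0, 0, 1, 2, 0, 3, 0]

# Plain accuracy: fraction of records labelled correctly.
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Challenge-style score (assumed): mean F1 over Normal, AF, Other only.
f1 = per_class_f1(y_true, y_pred, n_classes=4)
challenge_f1 = sum(f1[:3]) / 3

print(f"accuracy     = {accuracy:.4f}")
print(f"challenge F1 = {challenge_f1:.4f}")
```

With imbalanced classes the two numbers can diverge, because misclassified AF and Other records pull the averaged F1 down more than they pull accuracy down. So accuracy from the code may not be directly comparable to scores reported for the challenge.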