Variance between runs is way too high.
I guess we could wrap the function in a for loop and take the lowest measurement; that would certainly improve things.
But I think we'd better use performance counters instead.
Epiphany isn't affected by this since we already use CTIMERs there.
PAPI seems to have the cross-platform support we need:
http://icl.cs.utk.edu/papi/
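
For reference, the loop-and-take-the-minimum idea could look roughly like the sketch below. This is only an illustration, not code from the repo: `now_ns`, `bench_min_ns` and `sum_workload` are hypothetical names, and it assumes a POSIX host where `clock_gettime(CLOCK_MONOTONIC, ...)` is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds (assumes POSIX clock_gettime). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: call fn(arg) `runs` times and return the fastest
 * single run in nanoseconds. Taking the minimum discards runs that were
 * slowed down by scheduling, interrupts or cold caches. */
static uint64_t bench_min_ns(void (*fn)(void *), void *arg, int runs)
{
    uint64_t best = UINT64_MAX;

    assert(runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t start = now_ns();
        fn(arg);
        uint64_t elapsed = now_ns() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Hypothetical workload: sums 0..n-1 into a volatile sink so the
 * compiler cannot optimize the loop away. */
static void sum_workload(void *arg)
{
    volatile uint64_t sink = 0;
    uint64_t n = *(uint64_t *)arg;

    for (uint64_t i = 0; i < n; i++)
        sink += i;
}
```

With `uint64_t n = 100000;`, a call like `bench_min_ns(sum_workload, &n, 20)` would return the best of 20 runs. The minimum only filters out noise from outside the function; it still measures wall-clock time, so swapping `now_ns` for a hardware counter read (for example PAPI's `PAPI_get_real_cyc()`) would be the next step.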