The `perf` tool collects a `perf-archive` as part of the tool's `stop` operation. This archive collection can produce a huge amount of repetitive data.

In one run, a single `perf.data.archive.bz2` file was 147 MB in size. With two nodes collecting `perf` tool data, across 20 iterations, with 3 samples in each iteration, the combined size came to more than 17 GB (147 * 2 * 3 * 20 = 17,640 MB).

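The arithmetic above can be reproduced with a small helper, which also makes it easy to try other node, iteration, or sample counts. This is only a sketch: the function name and parameters are made up for illustration, and it assumes every archive is about the same size as the 147 MB one observed.

```python
def perf_archive_total_mb(archive_mb, nodes, iterations, samples):
    """Estimate the total perf-archive size in MB for one benchmark run.

    Assumes (hypothetically) that every archive is the same size and that
    one archive is produced per node, per iteration, per sample.
    """
    return archive_mb * nodes * iterations * samples


# The observed run: 147 MB archives, 2 nodes, 20 iterations, 3 samples each.
observed_total_mb = perf_archive_total_mb(147, nodes=2, iterations=20, samples=3)
print(f"{observed_total_mb:,} MB")  # 17,640 MB
```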
There are a few options we might want to consider:

- Add a `perf` tool option to not capture the `perf-archive` output
  - and turn off the capturing of the `perf-archive` output by default
- Add a way to allow the user to remove the `perf.data.archive.bz2` files after a pbench benchmark finishes, but before the user executes `pbench-move-results`
  - This would give a savvy user the chance to pick which iteration(s)/sample(s) to keep, and for which nodes
- Use multiple tool groups, collecting samples for each
  - Perhaps `pbench-uperf --tool-groups=light,heavy --samples=4,1` would run 4 samples using the `light` tool group and one sample using the `heavy` tool group, where the `perf` tool might be in `heavy` but absent from `light`
  - In the observed run above, if each iteration had only 1 sample collected using the `perf` tool, you'd have just under 6 GB (147 * 2 * 20 = 5,880 MB) of data collected
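
Until something like the second option exists, the cleanup could be done by hand between the benchmark finishing and `pbench-move-results`. The sketch below is one way that might look. It is not part of pbench: the function name, the `keep` parameter, and the idea of matching on path fragments are all assumptions, and it relies only on the archive file name given above, not on any particular result directory layout.

```python
from pathlib import Path

ARCHIVE_NAME = "perf.data.archive.bz2"


def prune_perf_archives(result_dir, keep=(), dry_run=True):
    """Remove perf archives under result_dir, except the ones to keep.

    `keep` is a collection of path fragments (hypothetical convention):
    an archive survives if its path relative to result_dir contains any
    of them. With dry_run=True nothing is deleted; the function only
    reports what it would remove.
    """
    removed = []
    for archive in sorted(Path(result_dir).rglob(ARCHIVE_NAME)):
        rel = archive.relative_to(result_dir).as_posix()
        if any(fragment in rel for fragment in keep):
            continue  # the user chose to keep this iteration/sample/node
        if not dry_run:
            archive.unlink()
        removed.append(rel)
    return removed
```

Running it first with the default `dry_run=True` and checking the returned list before deleting anything is the safer order, since the fragments in `keep` depend on how the result directories happen to be named.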