Move to the more automated gem5-bench benchmarking script.

Enable everything in the toolchain in preparation for future benchmarking,
notably C++, Fortran and LTO support, to prevent rebuilds later.

Document compiler optimizations for benchmarking.

Document graph-build for monitoring build times.
Ciro Santilli
2018-02-24 05:27:03 +00:00
parent 8a5c310535
commit 42d86576cd
4 changed files with 66 additions and 12 deletions


@@ -1533,24 +1533,24 @@ https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-cloc
Let's benchmark https://en.wikipedia.org/wiki/Dhrystone[Dhrystone] which Buildroot provides:
....
./run -a arm -e 'init=/eval.sh - lkmc_eval="m5 checkpoint;m5 dumpstats;dhrystone 1000;m5 exit"' -g
./gem5-cycles
./gem5-bench dhrystone 1000
....
`./gem5-cycles` outputs the approximate number of CPU cycles it took Dhrystone to run. A few possible problems are:
This initial run generates a <<gem5-checkpoint,checkpoint>> after the kernel boots and before running the benchmark.
Then we can speed up further benchmark runs by skipping the Linux kernel boot:
....
./gem5-bench -r dhrystone 1000
....
These commands output the approximate number of CPU cycles it took Dhrystone to run. A few possible problems are:
* when we do `m5 dumpstats`, some time passes before the `exec` system call returns and the actual benchmark starts
* the benchmark outputs to stdout, which costs some extra cycles in addition to the actual computation. But TODO: how to get the output to check that it is correct without such IO cycles?
Those problems should be insignificant if the benchmark runs for long enough, however.
We can then speed up further benchmark runs by skipping the Linux kernel boot:
....
./run -a arm -e 'init=/eval.sh - lkmc_eval="m5 dumpstats;dhrystone 1000;m5 exit"' -g -- -r 1
./gem5-cycles
....
TODO: the cycle counts on the original run and the one with checkpoint restore differ slightly. Why? Multiple checkpoint restores give the same results however.
Now you can play a fun little game with your friends:
@@ -1575,6 +1575,26 @@ Each time we run `m5 dumpstats`, a section with the following format is added to
TODO: diff out all the stats, not just `system.cpu.numCycles`.
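One rough way to approach this TODO is to compare every statistic between two saved copies of `m5out/stats.txt`, for example one from the initial run and one from a checkpoint restore. The sketch below is not part of this repository: `stats_diff` is a hypothetical helper, and it assumes that each stat line has the form `name value # description` and that each file holds a single dump section.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repository.
# Print every statistic whose value differs between two gem5 stats files.
# Assumption: stat lines look like "name value # description".
stats_diff() {
  awk '
    # First file: remember each stat value by name, skipping separator lines.
    NR == FNR { if (NF >= 2 && $1 !~ /^-/) old[$1] = $2; next }
    # Second file: print name, old value and new value when they differ.
    NF >= 2 && ($1 in old) && old[$1] != $2 { print $1, old[$1], $2 }
  ' "$1" "$2"
}
```

Usage would look like `stats_diff stats.orig.txt stats.restore.txt`, where both file names are placeholders for copies of `m5out/stats.txt` saved after each run.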
====== Enable compiler optimizations
If you are benchmarking compiled programs instead of hand written assembly, remember that we configure Buildroot to disable optimizations by default with:
....
BR2_OPTIMIZE_0=y
....
to improve the debugging experience.
You will likely want to change that to:
....
BR2_OPTIMIZE_3=y
....
and do a full rebuild.
TODO: is it possible to compile a single package with optimizations enabled? In any case, this wouldn't be very representative, since calls to an unoptimized libc will also have an impact on performance. Kernel-wise it should be fine though, since the kernel requires `-O2`.
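As a minimal sketch of switching the optimization level, the function below rewrites a Buildroot config fragment so that it contains exactly one `BR2_OPTIMIZE_*` line. `set_br2_optimize` is a hypothetical helper that does not exist in this repository, and the fragment path is whatever file you pass in. The full rebuild mentioned above is still needed afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repository.
# Usage: set_br2_optimize <config-fragment> <level>
set_br2_optimize() {
  local file="$1" level="$2"
  # Copy every line except existing BR2_OPTIMIZE_* settings.
  # grep exits with 1 when nothing is left to print, which is fine here.
  grep -v -E '^BR2_OPTIMIZE_[0-9A-Za-z]+=y$' "$file" > "$file.tmp" || true
  # Append the requested optimization level and replace the original file.
  printf 'BR2_OPTIMIZE_%s=y\n' "$level" >> "$file.tmp"
  mv "$file.tmp" "$file"
}
```

For example, `set_br2_optimize buildroot_config_fragment 3` would leave `BR2_OPTIMIZE_3=y` as the only optimization setting in that file.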
===== GEM5 kernel boot command line arguments
Analogous <<kernel-boot-command-line-arguments,to QEMU>>:
@@ -2231,6 +2251,19 @@ diff .config.olg .config
Copy and paste the diff additions to `buildroot_config_fragment`.
==== What is making my build so slow?
....
cd buildroot/output.x86_64~
make graph-build
xdg-open graphs/build.pie-packages.pdf
....
Our philosophy is:
* if something adds little to the build time, build it in by default
* otherwise, make it optional
=== About
This project is for people who want to learn and modify low level system components:


@@ -1,3 +1,7 @@
BR2_ENABLE_LOCALE=y
BR2_GCC_ENABLE_GRAPHITE=y
BR2_GCC_ENABLE_LTO=y
BR2_GCC_ENABLE_OPENMP=y
BR2_GLOBAL_PATCH_DIR="../global_patch_dir"
BR2_PACKAGE_BUSYBOX_CONFIG_FRAGMENT_FILES="../busybox_config_fragment"
BR2_PACKAGE_DHRYSTONE=y
@@ -12,6 +16,8 @@ BR2_ROOTFS_POST_IMAGE_SCRIPT="../rootfs_post_image_script"
BR2_ROOTFS_USERS_TABLES="../user_table"
BR2_TARGET_ROOTFS_CPIO=y
BR2_TARGET_ROOTFS_EXT2=y
BR2_TOOLCHAIN_BUILDROOT_CXX=y
BR2_TOOLCHAIN_BUILDROOT_FORTRAN=y
BR2_TOOLCHAIN_BUILDROOT_WCHAR=y
# Host GDB

17
gem5-bench Executable file

@@ -0,0 +1,17 @@
#!/usr/bin/env bash
replay=false
while getopts r OPT; do
  case "$OPT" in
    r)
      replay=true
      ;;
  esac
done
shift "$((OPTIND - 1))"
bench="$*"
if "$replay"; then
  ./run -a arm -e 'init=/eval.sh - lkmc_eval="m5 resetstats;'"$bench"';m5 exit"' -g -- -r 1
else
  ./run -a arm -e 'init=/eval.sh - lkmc_eval="m5 checkpoint;m5 resetstats;'"$bench"';m5 exit"' -g
fi
awk '/^system\.cpu\.numCycles /{ print $2 }' m5out/stats.txt


@@ -1,2 +0,0 @@
#!/usr/bin/env bash
grep numCycles m5out/stats.txt | awk '{t0 = $2; getline; print $2 - t0; exit;}'