gem5.sh: simplify, and do m5 resetstats and m5 exit

This covers the most common use case of running a benchmark after restore.
This commit is contained in:
Ciro Santilli
2018-08-25 06:13:09 +01:00
parent 865d065d3d
commit eb4f4ba1d7
3 changed files with 45 additions and 45 deletions

View File

@@ -6876,7 +6876,7 @@ gem5 full system:
.... ....
printf 'm5 exit' > data/readfile printf 'm5 exit' > data/readfile
./run -a a -g -F '/gem5.sh' ./run -a a -g -F '/gem5.sh'
printf 'm5 resetstats;dhrystone 100000;m5 exit' > data/readfile printf 'dhrystone 100000' > data/readfile
time ./run -a a -l 1 -g time ./run -a a -l 1 -g
.... ....
@@ -7410,33 +7410,37 @@ OK, this is why we used gem5 in the first place, performance measurements!
Let's benchmark https://en.wikipedia.org/wiki/Dhrystone[Dhrystone] which Buildroot provides. Let's benchmark https://en.wikipedia.org/wiki/Dhrystone[Dhrystone] which Buildroot provides.
The most flexible way is to do: A flexible setup is:
.... ....
arch=aarch64 arch=aarch64
cmd="./run -a '$arch' -g -F '/gem5.sh'"
restore='-l 1 -- --cpu-type=HPI --restore-with-cpu=HPI --caches --l2cache --l1d_size=1024kB --l1i_size=1024kB --l2_size=1024kB --l3_size=1024kB'
# Generate a checkpoint after Linux boots. # Generate a checkpoint after Linux boots, using the faster and less detailed CPU.
# The boot takes a while, be patient young Padawan. # The boot takes a while, be patient young Padawan.
printf 'm5 exit' > data/readfile
./run -a "$arch" -g -F '/gem5.sh'
# Restore the most recent checkpoint taken, and run the benchmark
# with parameter 1.000. We skip the boot completely, saving time!
printf 'm5 resetstats;dhrystone 1000;m5 exit' > data/readfile
./run -a "$arch" -g -l 1
./gem5-stat -a "$arch"
# Now with another parameter 10.000.
printf 'm5 resetstats;dhrystone 10000;m5 exit' > data/readfile
./run -a "$arch" -g -l 1
./gem5-stat -a "$arch"
# Get an interactive shell at the end of the restore.
printf '' > data/readfile printf '' > data/readfile
./run -a "$arch" -g -l 1 eval "$cmd"
# Restore the most recent checkpoint taken with the more detailed and slower HPI CPU,
# and run the benchmark with parameter 1.000. We skip the boot completely, saving time!
printf 'dhrystone 1000' > data/readfile
eval "${cmd} ${restore}"
./gem5-stat -a "$arch"
# Now run again with another parameter 10.000.
# This one should take more cycles!
printf 'dhrystone 10000' > data/readfile
eval "${cmd} ${restore}"
./gem5-stat -a "$arch"
# Get an interactive shell at the end of the restore
# if you need to debug something more interactively.
printf 'sh' > data/readfile
eval "${cmd} ${restore}"
.... ....
The commands output the approximate number of CPU cycles it took Dhrystone to run. The `gem5-stats` commands output the approximate number of CPU cycles it took Dhrystone to run.
For more serious tests, you will likely want to automate logging the commands ran and results to files, a good example is: link:gem5-bench-cache[]. For more serious tests, you will likely want to automate logging the commands ran and results to files, a good example is: link:gem5-bench-cache[].
@@ -7448,20 +7452,6 @@ A more naive and simpler to understand approach would be a direct:
but the problem is that this method does not allow to easily run a different script without running the boot again, see: <<gem5-restore-new-scrip>> but the problem is that this method does not allow to easily run a different script without running the boot again, see: <<gem5-restore-new-scrip>>
A few imperfections of our benchmarking method are:
* when we do `m5 resetstats` and `m5 exit`, there is some time passed before the `exec` system call returns and the actual benchmark starts and ends
* the benchmark outputs to stdout, which means so extra cycles in addition to the actual computation. But TODO: how to get the output to check that it is correct without such IO cycles?
Solutions to these problems include:
* modify benchmark code with instrumentation directly, see <<m5ops-instructions>> for an example.
* monitor known addresses TODO possible? Create an example.
Discussion at: https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-clock-cycles-between-the-start-and-end-of-a-bench/48944588#48944588
Those problems should be insignificant if the benchmark runs for long enough however.
Now you can play a fun little game with your friends: Now you can play a fun little game with your friends:
* pick a computational problem * pick a computational problem
@@ -7482,6 +7472,22 @@ Whenever we run `m5 dumpstats` or `m5 exit`, a section with the following format
---------- End Simulation Statistics ---------- ---------- End Simulation Statistics ----------
.... ....
==== Skip extra benchmark instructions
A few imperfections of our <<gem5-run-benchmark,benchmarking method>> are:
* when we do `m5 resetstats` and `m5 exit`, there is some time passed before the `exec` system call returns and the actual benchmark starts and ends
* the benchmark outputs to stdout, which means so extra cycles in addition to the actual computation. But TODO: how to get the output to check that it is correct without such IO cycles?
Solutions to these problems include:
* modify benchmark code with instrumentation directly, see <<m5ops-instructions>> for an example.
* monitor known addresses TODO possible? Create an example.
Discussion at: https://stackoverflow.com/questions/48944587/how-to-count-the-number-of-cpu-clock-cycles-between-the-start-and-end-of-a-bench/48944588#48944588
Those problems should be insignificant if the benchmark runs for long enough however.
==== gem5 system parameters ==== gem5 system parameters
Besides optimizing a program for a given CPU setup, chip developers can also do the inverse, and optimize the chip for a given benchmark! Besides optimizing a program for a given CPU setup, chip developers can also do the inverse, and optimize the chip for a given benchmark!

View File

@@ -83,14 +83,9 @@ fi
# Restore and run benchmarks. # Restore and run benchmarks.
rm -f "$results_file" rm -f "$results_file"
printf '#!/bin/sh
m5 resetstats
dhrystone XXX
m5 exit
' >"${common_gem5_readfile_file}"
for n in 1000 10000 100000; do for n in 1000 10000 100000; do
printf "n ${n}\n" >> "$results_file" printf "n ${n}\n" >> "$results_file"
sed -Ei "s/^dhrystone .*/dhrystone ${n}/" "${common_gem5_readfile_file}" printf "dhrystone ${n}" > "${common_gem5_readfile_file}"
bench-all bench-all
printf "\n" >> "$results_file" printf "\n" >> "$results_file"
done done

View File

@@ -1,7 +1,6 @@
#!/bin/sh #!/bin/sh
# This covers the most common setup to run a benchmark in gem5 and exit.
m5 checkpoint m5 checkpoint
script=/tmp/readfile m5 resetstats
m5 readfile > "$script" m5 readfile | sh
if [ -s "$script" ]; then m5 exit
sh "$script"
fi