From 305deb8d7be5313ddcb1fe88ef66e5af87435af6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?= =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?= Date: Mon, 5 Aug 2019 00:00:00 +0000 Subject: [PATCH] 71735a3a15515a56e0fe50393f69871e20aad3e4 --- index.html | 272 +++++++++++++++++++++++++++++++++++------------------ 1 file changed, 180 insertions(+), 92 deletions(-) diff --git a/index.html b/index.html index 61617e0..0084a5d 100644 --- a/index.html +++ b/index.html @@ -1126,60 +1126,63 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 18.6. Pass extra options to gem5
  • -
  • 18.7. gem5 exit after a number of instructions
  • -
  • 18.8. m5ops +
  • 18.7. m5ops
  • -
  • 18.9. gem5 arm Linux kernel patches +
  • 18.8. gem5 arm Linux kernel patches
  • -
  • 18.10. m5out directory +
  • 18.9. m5out directory
  • -
  • 18.11. m5term
  • -
  • 18.12. gem5 Python scripts without rebuild
  • -
  • 18.13. gem5 fs_bigLITTLE
  • -
  • 18.14. gem5 unit tests
  • -
  • 18.15. gem5 simulate() limit reached
  • -
  • 18.16. gem5 build options +
  • 18.10. m5term
  • +
  • 18.11. gem5 Python scripts without rebuild
  • +
  • 18.12. gem5 fs_bigLITTLE
  • +
  • 18.13. gem5 unit tests
  • +
  • 18.14. gem5 simulate() limit reached
  • +
  • 18.15. gem5 build options
  • 19. Buildroot @@ -17369,7 +17401,7 @@ cat out/gem5-bench-dhrystone.txt
    -

    To find out why your program is slow, a good first step is to have a look at gem5 stats.txt file.

    +

    To find out why your program is slow, a good first step is to have a look at the gem5 m5out/stats.txt file.
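
As a minimal sketch of such a first look, the helper below prints the value column of a single statistic. It assumes the usual stats.txt layout of one "name value # description" entry per line. The gem5_stat name is made up for this example, and the path to m5out depends on your run.

```shell
# Hypothetical helper, not part of this repository.
# Print the value of every stats.txt line whose first column equals the
# requested stat name. One value is printed per matching line, so a file
# that contains several stat dumps yields several values.
gem5_stat() {
  stat_name="$1"
  stats_file="$2"
  awk -v name="$stat_name" '$1 == name { print $2 }' "$stats_file"
}

# Example: gem5_stat system.cpu.numCycles path/to/m5out/stats.txt
```

Matching on the exact first column rather than using a plain grep avoids picking up other stats that merely contain the same substring.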

    18.2.1. Skip extra benchmark instructions

    @@ -17939,7 +17971,7 @@ xdg-open bst_vs_heap_vs_hashmap_gem5.tmp.png

    The cache sizes were chosen to match the host P51 to improve the comparison. Ideally we should also use the same standard library.

    -

    Note that this will take a long time, and will produce a humongous ~40Gb stats file as explained at: Section 18.10.2.1, “gem5 only dump selected stats”

    +

    Note that this will take a long time, and will produce a humongous ~40 GB stats file as explained at: Section 18.9.2.1, “gem5 only dump selected stats”

    Sources:

    @@ -18645,7 +18677,23 @@ expect eof

    18.6. Pass extra options to gem5

    -

    Pass options to the fs.py script:

    +

    Remember that in the gem5 command line, we can either pass options to the script being run as in:

    +
    +
    +
    +
    build/X86/gem5.opt configs/example/fs.py --some-option
    +
    +
    +
    +

    or to the gem5 executable itself:

    +
    +
    +
    +
    build/X86/gem5.opt --some-option configs/example/fs.py
    +
    +
    +
    +

    To pass options to the script in our setup, use:

      @@ -18668,7 +18716,7 @@ expect eof
    -

    Pass options to the gem5 executable itself:

    +

    To pass options to the gem5 executable we expose the --gem5-exe-args option:

      @@ -18684,32 +18732,7 @@ expect eof
    -

    18.7. gem5 exit after a number of instructions

    -
    -

    Quit the simulation after 1024 instructions:

    -
    -
    -
    -
    ./run --emulator gem5 -- -I 1024
    -
    -
    -
    -

    Can be nicely checked with gem5 tracing.

    -
    -
    -

    Cycles instead of instructions:

    -
    -
    -
    -
    ./run --emulator gem5 -- --memory 1024
    -
    -
    -
    -

    Otherwise the simulation runs forever by default.

    -
    -
    -
    -

    18.8. m5ops

    +

    18.7. m5ops

    m5ops are magic instructions which lead gem5 to do magic things, like quitting or dumping stats.

    @@ -18749,7 +18772,7 @@ expect eof
    -

    18.8.1. m5

    +

    18.7.1. m5

    m5 is a guest command line utility that is installed and run on the guest, that serves as a CLI front-end for the m5ops

    @@ -18760,7 +18783,7 @@ expect eof

    It is possible to guess what most tools do from the corresponding m5ops, but let’s at least document the less obvious ones here.

    -
    18.8.1.1. m5 exit
    +
    18.7.1.1. m5 exit

    End the simulation.

    @@ -18769,7 +18792,7 @@ expect eof
    -
    18.8.1.2. m5 fail
    +
    18.7.1.2. m5 fail

    End the simulation with a failure exit event:

    @@ -18808,7 +18831,7 @@ expect eof
    -
    18.8.1.3. m5 writefile
    +
    18.7.1.3. m5 writefile

    Send a guest file to the host. 9P is a more advanced alternative.

    @@ -18839,7 +18862,7 @@ m5 writefile myfileguest myfilehost
    -
    18.8.1.4. m5 readfile
    +
    18.7.1.4. m5 readfile

    Read a host file pointed to by the fs.py --script option to stdout.

    @@ -18867,7 +18890,7 @@ m5 writefile myfileguest myfilehost
    -
    18.8.1.5. m5 initparam
    +
    18.7.1.5. m5 initparam

    Ermm, just another m5 readfile that only takes integers and only from CLI options? Is this software so redundant?

    @@ -18893,7 +18916,7 @@ m5 writefile myfileguest myfilehost
    -
    18.8.1.6. m5 execfile
    +
    18.7.1.6. m5 execfile

    Trivial combination of m5 readfile + execute the script.

    @@ -18928,7 +18951,7 @@ m5 execfile
    -

    18.8.2. m5ops instructions

    +

    18.7.2. m5ops instructions

    gem5 allocates some magic instructions on unused instruction encodings for convenient guest instrumentation.

    @@ -19007,7 +19030,7 @@ m5 execfile
    -
    18.8.2.1. m5ops instructions interface
    +
    18.7.2.1. m5ops instructions interface

    Let’s study how m5 uses them:

    @@ -19121,7 +19144,7 @@ m5_fail(ints[1], ints[0]);
    -
    18.8.2.2. m5op annotations
    +
    18.7.2.2. m5op annotations

    include/gem5/asm/generic/m5ops.h also describes some annotation instructions.

    @@ -19132,7 +19155,7 @@ m5_fail(ints[1], ints[0]);
    -

    18.9. gem5 arm Linux kernel patches

    +

    18.8. gem5 arm Linux kernel patches

    https://gem5.googlesource.com/arm/linux/ contains ARM Linux kernel forks that carry a few gem5-specific Linux kernel patches, created by ARM Holdings on top of a few upstream kernel releases.

    @@ -19217,7 +19240,7 @@ git -C "$(./getvar linux_source_dir)" checkout -

    Tested on 649d06d6758cefd080d04dc47fd6a5a26a620874 + 1.

    -

    18.9.1. gem5 arm Linux kernel patches boot speedup

    +

    18.8.1. gem5 arm Linux kernel patches boot speedup

    We have observed that with the kernel patches, boot is 2x faster, falling from 1m40s to 50s.

    @@ -19235,7 +19258,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
    -

    18.10. m5out directory

    +

    18.9. m5out directory

    When you run gem5, it generates an m5out directory at:

    @@ -19251,7 +19274,7 @@ git -C "$(./getvar linux_source_dir)" checkout -

    The files in that directory contain some very important information about the run, and you should become familiar with every one of them.

    -

    18.10.1. system.terminal

    +

    18.9.1. gem5 m5out/system.terminal file

    Contains UART output, either from the Linux kernel or from the baremetal system.

    @@ -19260,7 +19283,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
    -

    18.10.2. gem5 stats.txt

    +

    18.9.2. gem5 m5out/stats.txt file

    This file contains important statistics about the run:

    @@ -19293,7 +19316,7 @@ system.cpu.dtb.inst_hits

    For x86, it is interesting to try and correlate numCycles with:

    -
    18.10.2.1. gem5 only dump selected stats
    +
    18.9.2.1. gem5 only dump selected stats

    TODO

    @@ -19306,9 +19329,9 @@ system.cpu.dtb.inst_hits
    -

    18.10.3. config.ini

    +

    18.9.3. gem5 config.ini

    -

    The config.ini file, contains a very good high level description of the system:

    +

    The m5out/config.ini file contains a very good high-level description of the system:

    @@ -19389,7 +19412,7 @@ clock=500
    -

    18.11. m5term

    +

    18.10. m5term

    We use the m5term in-tree executable to connect to the terminal instead of a direct telnet.

    @@ -19414,7 +19437,7 @@ clock=500
    -

    18.12. gem5 Python scripts without rebuild

    +

    18.11. gem5 Python scripts without rebuild

    We have made a crazy setup that allows you to just cd into submodules/gem5, and edit Python scripts directly there.

    @@ -19448,7 +19471,7 @@ clock=500
    -

    18.13. gem5 fs_bigLITTLE

    +

    18.12. gem5 fs_bigLITTLE

    By default, we use the configs/example/fs.py script.

    @@ -19498,7 +19521,7 @@ clock=500
    -

    We setup 2 big and 2 small CPUs, but cat /proc/cpuinfo shows 4 identical CPUs instead of 2 of two different types, likely because gem5 does not expose some informational register much like the caches: https://www.mail-archive.com/gem5-users@gem5.org/msg15426.html config.ini does show that the two big ones are DerivO3CPU and the small ones are MinorCPU.

    +

    We set up 2 big and 2 small CPUs, but cat /proc/cpuinfo shows 4 identical CPUs instead of 2 of two different types, likely because gem5 does not expose some informational register, much like the caches (https://www.mail-archive.com/gem5-users@gem5.org/msg15426.html). gem5 config.ini does show that the two big ones are DerivO3CPU and the small ones are MinorCPU.

    TODO: why is the --dtb required despite fs_bigLITTLE.py having a DTB generation capability? Without it, nothing shows on terminal, and the simulation terminates with simulate() limit reached @ 18446744073709551615. The magic vmlinux.vexpress_gem5_v1.20170616 works however without a DTB.

    @@ -19508,7 +19531,7 @@ clock=500
    -

    18.15. gem5 simulate() limit reached

    +

    18.14. gem5 simulate() limit reached

    This error happens when the following instruction limits are reached:

    @@ -19628,6 +19651,21 @@ Exiting @ tick 3000 because all threads reached the max instruction count
    +

    The exact same can be achieved with the older hardcoded --maxinsts mechanism present in se.py and fs.py:

    +
    +
    +
    +
    ./run \
    +  --emulator gem5 \
    +  --static \
    +  --userland userland/arch/x86_64/freestanding/linux/hello.S \
    +  --trace-insts-stdout \
    +  -- \
    +  --maxinsts 3
    +;
    +
    +
    +

    The message also shows on User mode simulation deadlocks, for example in userland/posix/pthread_deadlock.c:

    @@ -19691,12 +19729,12 @@ Exiting @ tick 18446744073709551615 because simulate() limit reached
    -

    18.16. gem5 build options

    +

    18.15. gem5 build options

    In order to use different build options, you might also want to use gem5 build variants to keep the build outputs separate from one another.

    -

    18.16.1. gem5 debug build

    +

    18.15.1. gem5 debug build

    The gem5.debug executable has optimizations turned off unlike the default gem5.opt, and provides a much better debug experience:

    @@ -19726,7 +19764,7 @@ Exiting @ tick 18446744073709551615 because simulate() limit reached
    -

    18.16.2. gem5 clang build

    +

    18.15.2. gem5 clang build

    TODO test properly, benchmark vs GCC.

    @@ -19739,7 +19777,7 @@ Exiting @ tick 18446744073709551615 because simulate() limit reached
    -

    18.16.3. gem5 sanitation build

    +

    18.15.3. gem5 sanitation build

    If gem5 appears to have a C++ undefined behaviour bug, which is often very difficult to track down, you can try to build it with the following extra SCons options:

    @@ -19866,6 +19904,56 @@ qemu-system-aarch64 -M virt -cpu cortex-a57 -nographic -smp 1 -kernel output/ima
  • +
    +

    19.1.1. gem5 Ruby build

    +
    +

    Ruby is a system that includes the SLICC domain specific language to describe memory systems: http://gem5.org/Ruby

    +
    +
    +

    It seems to have usage outside of gem5, but the naming overload with the Ruby programming language, which also has domain specific languages as a concept, makes it impossible to google anything about it!

    +
    +
    +

    Ruby is activated at compile time with the PROTOCOL flag, which specifies the desired memory system type.

    +
    +
    +

    For example, to use a two level MESI cache coherence protocol, we can do:

    +
    +
    +
    +
    ./build-gem5 --arch aarch64 --gem5-build-id ruby -- PROTOCOL=MESI_Two_Level
    +
    +
    +
    +

    and during build we see a humongous line of type:

    +
    +
    +
    +
    [   SLICC] src/mem/protocol/MESI_Two_Level.slicc -> ARM/mem/protocol/AccessPermission.cc, ARM/mem/protocol/AccessPermission.hh, ...
    +
    +
    +
    +

    which shows that dozens of C++ files are being generated from Ruby SLICC.
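
To put a rough number on "dozens", something like the following sketch could count the generated sources. The count_slicc_outputs name is invented here, and the directory layout is only assumed from the build log line above and from the getvar invocation used in this section.

```shell
# Hypothetical helper: count the .cc and .hh files that sit directly
# under a generated protocol directory.
count_slicc_outputs() {
  find "$1" -maxdepth 1 -type f \( -name '*.cc' -o -name '*.hh' \) | wc -l | tr -d ' '
}

# Example, assuming the same build id as the build command above:
# count_slicc_outputs "$(./getvar --arch aarch64 --gem5-build-id ruby gem5_build_build_dir)/ARM/mem/protocol"
```

The tr call only strips the padding that some wc implementations add, so that the result is a bare number.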

    +
    +
    +

    TODO observe it doing something during a run.

    +
    +
    +

    The relevant source files live in the source tree under:

    +
    +
    +
    +
    src/mem/protocol/MESI_Two_Level*
    +
    +
    +
    +

    We already pass the SLICC_HTML flag by default to the build, which generates an HTML summary of each memory protocol under:

    +
    +
    +
    +
    xdg-open "$(./getvar --arch aarch64 --gem5-build-id ruby gem5_build_build_dir)/ARM/mem/protocol/html/index.html"
    +
    +
    +

    19.2. Custom Buildroot configs

    @@ -23372,7 +23460,7 @@ pop %rbp

    TODO: review this section, make a more controlled userland experiment with m5ops instrumentation.

    -

    Let’s have some fun and try to correlate the gem5 gem5 stats.txt system.cpu.numCycles cycle count with the x86 RDTSC instruction that is supposed to do the same thing:

    +

    Let’s have some fun and try to correlate the gem5 m5out/stats.txt file system.cpu.numCycles cycle count with the x86 RDTSC instruction that is supposed to do the same thing:

    @@ -28699,7 +28787,7 @@ less "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 termout_file)"
  • gem5 automatically increments ports until it finds a free one.

    -

    gem5 60600f09c25255b3c8f72da7fb49100e2682093a does not seem to expose a way to set the terminal and VNC ports from fs.py, so we just let gem5 assign the ports itself, and use -n only to match what it assigned. Those ports both appear on config.ini.

    +

    gem5 60600f09c25255b3c8f72da7fb49100e2682093a does not seem to expose a way to set the terminal and VNC ports from fs.py, so we just let gem5 assign the ports itself, and use -n only to match what it assigned. Those ports both appear on gem5 config.ini.

    The GDB port can be assigned on gem5.opt --remote-gdb-port, but it does not appear on config.ini.
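
To see which terminal and VNC ports gem5 picked, one option is to list every port entry of config.ini together with its section. This is only a sketch: the ini_ports name is hypothetical, and it assumes that the ports are stored as plain port=... keys under [section] headers, which should be checked against your own config.ini.

```shell
# Hypothetical helper: print "section port" for every port= entry
# found in an INI-style file.
ini_ports() {
  awk -F= '
    # Remember the most recent [section] header, without the brackets.
    /^\[.*\]$/ { section = substr($0, 2, length($0) - 2) }
    # Print the section name next to the value of each port key.
    $1 == "port" { print section, $2 }
  ' "$1"
}

# Example: ini_ports path/to/m5out/config.ini
```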