From 02dbccd9e7183b33077104d6271164e4f62ec6c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?= =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?=
Date: Fri, 8 Nov 2019 23:00:01 +0000
Subject: [PATCH] 762cd8d601b7db06aa289c0fca7b40696299a868

---
 index.html | 1322 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 1001 insertions(+), 321 deletions(-)

diff --git a/index.html b/index.html
index d7bb433..f267190 100644
--- a/index.html
+++ b/index.html
@@ -477,43 +477,44 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 1.1.3. About the QEMU Buildroot setup
  • -
  • 1.2. gem5 Buildroot setup +
  • 1.2. Dry run to get commands for your project
  • +
  • 1.3. gem5 Buildroot setup
  • -
  • 1.3. Docker host setup
  • -
  • 1.4. Prebuilt setup +
  • 1.4. Docker host setup
  • +
  • 1.5. Prebuilt setup
  • -
  • 1.5. Host kernel module setup +
  • 1.6. Host kernel module setup
  • -
  • 1.6. Userland setup +
  • 1.7. Userland setup
  • -
  • 1.7. Baremetal setup +
  • 1.8. Baremetal setup
  • -
  • 1.8. Build the documentation
  • +
  • 1.9. Build the documentation
  • 2. GDB step debug @@ -685,8 +686,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 10.7. QEMU user mode quirks @@ -1062,13 +1062,14 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 18.8.6. QEMU trace multicore
  • -
  • 18.8.7. gem5 tracing +
  • 18.8.7. QEMU get guest instruction count
  • +
  • 18.8.8. gem5 tracing
  • @@ -1740,17 +1741,22 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 29.2.1.2. gem5 x86_64 DerivO3CPU boot panics
  • -
  • 29.2.2. Benchmark builds +
  • 29.2.2. Benchmark emulators on userland executables @@ -1904,14 +1922,14 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b

    If you don’t know which one to go for, start with QEMU Buildroot setup getting started.

    -

    Design goals of this project are documented at: Section 32.18.1, “Design goals”.

    +

    Design goals of this project are documented at: Section 33.18.1, “Design goals”.

    1.1. QEMU Buildroot setup

    1.1.1. QEMU Buildroot setup getting started

    -

    This setup has been mostly tested on Ubuntu. For other host operating systems see: Section 32.1, “Supported hosts”. For greater stability, consider using the latest release instead of master: https://github.com/cirosantilli/linux-kernel-module-cheat/releases

    +

    This setup has been mostly tested on Ubuntu. For other host operating systems see: Section 33.1, “Supported hosts”. For greater stability, consider using the latest release instead of master: https://github.com/cirosantilli/linux-kernel-module-cheat/releases

    Reserve 12Gb of disk and run:

    @@ -1928,7 +1946,7 @@ cd linux-kernel-module-cheat

    You don’t need to clone recursively even though we have .git submodules: download-dependencies fetches just the submodules that you need for this build to save time.

    The initial build will take a while (30 minutes to 2 hours) to clone and build, see Benchmark builds for more details.

    @@ -2011,7 +2029,7 @@ hello2 cleanup
    -

    To avoid typing --arch aarch64 many times, you can set the default arch as explained at: Section 32.4, “Default command line arguments”

    +

    To avoid typing --arch aarch64 many times, you can set the default arch as explained at: Section 33.4, “Default command line arguments”

    I now urge you to read the following sections which contain widely applicable information:

    @@ -2315,7 +2333,7 @@ hello /root/.profile

    If you really want to develop semiconductors, your only choice is to join a university or a semiconductor company that has the EDA licenses.

    While hacking QEMU, you will likely want to GDB step its source. That is trivial since QEMU is just another userland program like any other, but our setup has a shortcut to make it even more convenient, see: Section 18.7, “Debug the emulator”.

    @@ -2622,9 +2640,81 @@ j = 0
    -

    1.2. gem5 Buildroot setup

    +

    1.2. Dry run to get commands for your project

    +
    +

    One of the major features of this repository is that we try to support the --dry-run option really well for all scripts.

    +
    +
    +

    This option, as the name suggests, outputs the external commands that would be run (or more precisely: equivalent commands), without actually running them.

    +
    +
    +

    This allows you to just clone this repository and get full working commands to integrate into your project, without having to build or use this setup further!

    +
    +
    +

    For example, we can obtain a QEMU run for the file userland/c/hello.c in User mode simulation by adding --dry-run to the normal command:

    +
    +
    +
    +
    ./run --dry-run --userland userland/c/hello.c
    +
    +
    +
    +

    which as of LKMC a18f28e263c91362519ef550150b5c9d75fa3679 + 1 outputs:

    +
    +
    +
    +
    + /path/to/linux-kernel-module-cheat/out/qemu/default/opt/x86_64-linux-user/qemu-x86_64 \
    +  -L /path/to/linux-kernel-module-cheat/out/buildroot/build/default/x86_64/target \
    +  -r 5.2.1 \
    +  -seed 0 \
    +  -trace enable=load_file,file=/path/to/linux-kernel-module-cheat/out/run/qemu/x86_64/0/trace.bin \
    +  -cpu max \
    +  /path/to/linux-kernel-module-cheat/out/userland/default/x86_64/c/hello.out \
    +;
    +
    +
    +
    +

    So observe that the command contains:

    +
    +
    +
      +
    • +

      +: sign to differentiate it from program stdout, much like bash -x output. This is not a valid part of the generated Bash command however.

      +
    • +
    • +

      the actual command, nicely indented and with arguments broken one per line, but with continuing backslashes so you can just copy paste into a terminal

      +
    • +
    • +

      ;: both a valid part of the Bash command, and a visual mark of the end of the command

      +
    • +
    +
    +
    +
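Since the leading + is not part of the command, it has to be dropped before the output is reused in a script. The following is only a minimal sketch, assuming the marker always shows up as "+ " at the very start of a line as in the sample output above. strip_dry_run_marker is a hypothetical helper name and is not part of this repository:

```shell
# Hypothetical helper: remove the leading "+ " marker from --dry-run style output.
# Assumption: the marker only appears at the very start of a line, and
# continuation lines are indented, so they never start with "+ ".
strip_dry_run_marker() {
  sed 's/^+ //'
}

# Example with a made-up command laid out like the output above.
printf '%s\n' '+ echo \' '  hello \' ';' | strip_dry_run_marker
```

The result keeps the continuing backslashes and the trailing ;, so it can be pasted into a terminal or a script unchanged.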

    For the specific case of running emulators such as QEMU, the last command is also automatically placed in a file for your convenience and later inspection:

    +
    +
    +
    +
    cat "$(./getvar run_dir)/run.sh"
    +
    +
    +
    +

    Furthermore, --dry-run also automatically specifies, in valid Bash shell syntax:

    +
    +
    +
      +
    • +

      environment variables used to run the command with syntax + ENV_VAR_1=abc ENV_VAR_2=def ./some/command

      +
    • +
    • +

      change in working directory with + cd /some/new/path && ./some/command

      +
    • +
    +
    +
    +
    +

    1.3. gem5 Buildroot setup

    -

    1.2.1. About the gem5 Buildroot setup

    +

    1.3.1. About the gem5 Buildroot setup

    This setup is like the QEMU Buildroot setup, but it uses gem5 instead of QEMU as a system simulator.

    @@ -2661,7 +2751,7 @@ j = 0
    -

    1.2.2. gem5 Buildroot setup getting started

    +

    1.3.2. gem5 Buildroot setup getting started

    For the most part, if you just add the --emulator gem5 option or *-gem5 suffix to all commands, everything should magically work.

    @@ -2749,12 +2839,12 @@ j = 0
    -

    1.3. Docker host setup

    +

    1.4. Docker host setup

    This repository has been tested inside clean Docker containers.

    -

    This is a good option if you are on a Linux host, but the native setup failed due to your weird host distribution, and you have better things to do with your life than to debug it. See also: Section 32.1, “Supported hosts”.

    +

    This is a good option if you are on a Linux host, but the native setup failed due to your weird host distribution, and you have better things to do with your life than to debug it. See also: Section 33.1, “Supported hosts”.

    For example, to do a QEMU Buildroot setup inside Docker, run:

    @@ -2898,9 +2988,9 @@ j = 0
    -

    1.4. Prebuilt setup

    +

    1.5. Prebuilt setup

    -

    1.4.1. About the prebuilt setup

    +

    1.5.1. About the prebuilt setup

    This setup uses prebuilt binaries that we upload to GitHub from time to time.

    @@ -2942,7 +3032,7 @@ j = 0
    -

    1.4.2. Prebuilt setup getting started

    +

    1.5.2. Prebuilt setup getting started

    Checkout to the latest tag and use the Ubuntu packaged QEMU to boot Linux:

    @@ -3058,7 +3148,7 @@ unzip lkmc-*.zip
    -

    1.5. Host kernel module setup

    +

    1.6. Host kernel module setup

    THIS IS DANGEROUS (AND FUN), YOU HAVE BEEN WARNED

    @@ -3163,7 +3253,7 @@ sudo lsmod | grep hello
    -

    1.5.1. Hello host

    +

    1.6.1. Hello host

    Minimal host build system example:

    @@ -3180,9 +3270,9 @@ dmesg
    -

    1.6. Userland setup

    +

    1.7. Userland setup

    -

    1.6.1. About the userland setup

    +

    1.7.1. About the userland setup

    In order to test the kernel and emulators, userland content in the form of executables and scripts is of course required, and we store it mostly under:

    @@ -3232,14 +3322,14 @@ dmesg
    -

    1.6.2. Userland setup getting started

    +

    1.7.2. Userland setup getting started

    There are several ways to run our Userland content, notably:

    -
    1.6.2.2. Userland setup getting started with prebuilt toolchain and QEMU user mode
    +
    1.7.2.2. Userland setup getting started with prebuilt toolchain and QEMU user mode

    If you are too lazy to build the Buildroot toolchain and QEMU, but want to run e.g. ARM Userland assembly in User mode simulation, you can get away on Ubuntu 18.04 with just:

    @@ -3469,7 +3559,7 @@ cd userland
    -

    This present the usual trade-offs of using prebuilts as mentioned at: Section 1.4, “Prebuilt setup”.

    +

    This presents the usual trade-offs of using prebuilts as mentioned at: Section 1.5, “Prebuilt setup”.

    Other functionality is analogous, e.g. testing:

    @@ -3502,7 +3592,7 @@ cd userland
    -

    1.7. Baremetal setup

    +

    1.8. Baremetal setup

    -

    1.7.1. About the baremetal setup

    +

    1.8.1. About the baremetal setup

    This setup does not use the Linux kernel nor Buildroot at all: it just runs your very own minimal OS.

    @@ -3574,7 +3664,7 @@ cd userland
    -

    1.7.2. Baremetal setup getting started

    +

    1.8.2. Baremetal setup getting started

    Every .c file inside baremetal/ and .S file inside baremetal/arch/<arch>/ generates a separate baremetal image.

    @@ -3785,7 +3875,7 @@ echo "$(./getvar --arch aarch64 --baremetal userland/c/hello.c --emulator gem5 -
    -

    1.8. Build the documentation

    +

    1.9. Build the documentation

    You don’t need to depend on GitHub.

    @@ -3832,7 +3922,7 @@ xdg-open README.html
    -

    More information about our documentation internals can be found at: Section 32.5, “Documentation”

    +

    More information about our documentation internals can be found at: Section 33.5, “Documentation”

    @@ -7115,7 +7205,7 @@ qw er

    ./run --userland path resolution is analogous to that of ./run --baremetal.

    -

    ./build user-mode-qemu first builds Buildroot, and then runs ./build-userland, which is further documented at: Section 1.6, “Userland setup”. It also builds QEMU. If you ahve already done a QEMU Buildroot setup previously, this will be very fast.

    +

    ./build user-mode-qemu first builds Buildroot, and then runs ./build-userland, which is further documented at: Section 1.7, “Userland setup”. It also builds QEMU. If you have already done a QEMU Buildroot setup previously, this will be very fast.

    If you modify the userland programs, rebuild simply with:

    @@ -7219,7 +7309,7 @@ qw er

    The gem5 tests require building statically with build id static, see also: Section 10.6, “gem5 syscall emulation mode”. TODO automate this better.

    -

    See: Section 32.13, “Test this repo” for more useful testing tips.

    +

    See: Section 33.13, “Test this repo” for more useful testing tips.

    @@ -7405,6 +7495,32 @@ qemu: uncaught target signal 6 (Aborted) - core dumped
  • +
    +

    Running statically linked executables sometimes makes things break:

    +
    +
    + +

    10.5.1. User mode static executables with dynamic libraries

    @@ -7609,93 +7725,7 @@ qemu-x86_64: /path/to/linux-kernel-module-cheat/submodules/qemu/accel/tcg/cpu-ex
    -

    10.6.3. User mode vs full system benchmark

    -
    -

    Let’s see if user mode runs considerably faster than full system or not.

    -
    -
    -

    First we build Dhrystone manually statically since dynamic linking is broken in gem5 as explained at: Section 10.6, “gem5 syscall emulation mode”.

    -
    -
    -

    TODO: move this section to our new custom dhrystone setup: Section 19.2.3.1, “Dhrystone”.

    -
    -
    -

    gem5 user mode:

    -
    -
    -
    -
    ./build-buildroot --arch arm --config 'BR2_PACKAGE_DHRYSTONE=y'
    -make \
    -  -B \
    -  -C "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2" \
    -  CC="$(./run-toolchain --arch arm --print-tool gcc)" \
    -  CFLAGS=-static \
    -;
    -time \
    -  ./run \
    -  --arch arm \
    -  --emulator gem5 \
    -  --userland "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" \
    -  --userland-args 'asdf qwer' \
    -;
    -
    -
    -
    -

    gem5 full system:

    -
    -
    -
    -
    time \
    -  ./run \
    -  --arch arm \
    -  --eval-after './gem5.sh' \
    -  --emulator gem5
    -  --gem5-readfile 'dhrystone 100000' \
    -;
    -
    -
    -
    -

    QEMU user mode:

    -
    -
    -
    -
    time qemu-arm "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" 100000000
    -
    -
    -
    -

    QEMU full system:

    -
    -
    -
    -
    time \
    -  ./run \
    -  --arch arm \
    -  --eval-after 'time dhrystone 100000000;./linux/poweroff.out' \
    -;
    -
    -
    -
    -

    Result on P51 at bad30f513c46c1b0995d3a10c0d9bc2a33dc4fa0:

    -
    -
    -
      -
    • -

      gem5 user: 33 seconds

      -
    • -
    • -

      gem5 full system: 51 seconds

      -
    • -
    • -

      QEMU user: 45 seconds

      -
    • -
    • -

      QEMU full system: 223 seconds

      -
    • -
    -
    -
    -
    -

    10.6.4. gem5 syscall emulation mode syscall tracing

    +

    10.6.3. gem5 syscall emulation mode syscall tracing

    Since gem5 has to implement syscalls itself in syscall emulation mode, it can of course clearly see which syscalls are being made, and we can log them for debug purposes with gem5 tracing, e.g.:

    @@ -7900,7 +7930,7 @@ hello
    -
    18.8.7.5. gem5 tracing internals
    +
    18.8.8.5. gem5 tracing internals

    As of gem5 16eeee5356585441a49d05c78abc328ef09f7ace the default tracer is ExeTracer. It is set at:

    @@ -17459,7 +17495,7 @@ root

    19. gem5

    19.1. gem5 vs QEMU

    @@ -18281,6 +18317,16 @@ m5 dumpstats
    +

    Open source but not in Buildroot:

    +
    +
    + +
    +

    These are not yet enabled, but it should be easy to do so, see: Section 20.5, “Add new Buildroot packages”

    @@ -18305,7 +18351,7 @@ m5 dumpstats
    -

    Build and run on gem5 use mode:

    +

    Build and run on gem5 user mode:

    @@ -21508,7 +21554,7 @@ build/ARM/config/the_isa.hh

    The clean is necessary because the source files didn’t change, so make would just check the timestamps and not build anything.

    -

    You will then likely want to make those more permanent as explained at: Section 32.4, “Default command line arguments”.

    +

    You will then likely want to make those more permanent as explained at: Section 33.4, “Default command line arguments”.

    20.2.1. Enable Buildroot compiler optimizations

    @@ -21683,7 +21729,7 @@ make menuconfig

    If none of those methods are flexible enough for you, you can just fork or hack up the sample package at buildroot_packages/sample_package to do what you want.

    -

    For how to use that package, see: Section 32.12.2, “buildroot_packages directory”.

    +

    For how to use that package, see: Section 33.12.2, “buildroot_packages directory”.

    Then iterate trying to do what you want and reading the manual until it works: https://buildroot.org/downloads/manual/manual.html

    @@ -21868,7 +21914,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -

    Then, you will also want to do a Bisection to pinpoint the exact commit to blame, and CC that developer.

    -

    Finally, give the images you used save upstream developers' time as shown at: Section 32.17.2, “release-zip”.

    +

    Finally, give the images you used to save upstream developers' time as shown at: Section 33.17.2, “release-zip”.

    For Buildroot problems, you should either provide the config you have:

    @@ -22176,7 +22222,7 @@ cd ../..

    This section documents our test and educational userland content, such as C, C++ and POSIX examples, present mostly under userland/.

    -

    Getting started at: Section 1.6, “Userland setup”

    +

    Getting started at: Section 1.7, “Userland setup”

    Userland assembly content is located at: Section 22, “Userland assembly”. It was split from this section basically because we were hitting the HTML h6 limit, stupid web :-)

    @@ -22185,7 +22231,7 @@ cd ../..

    This content makes up the bulk of the userland/ directory.

    -

    The quickest way to run the arch agnostic examples, which comprise the majority of the examples, is natively as shown at: Section 1.6.2.1, “Userland setup getting started natively”

    +

    The quickest way to run the arch agnostic examples, which comprise the majority of the examples, is natively as shown at: Section 1.7.2.1, “Userland setup getting started natively”

    This section was originally moved in here from: https://github.com/cirosantilli/cpp-cheat

    @@ -22981,7 +23027,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
    -

    Like other userland programs, these programs can be run as explained at: Section 1.6, “Userland setup”.

    +

    Like other userland programs, these programs can be run as explained at: Section 1.7, “Userland setup”.

    As a quick reminder, the fastest setups to get started are:

    @@ -23514,7 +23560,7 @@ When instructions do not interpret this operand encoding as the zero register, u

    Userland assembly is generally simpler, and a pre-requisite for Baremetal setup.

    -

    System-land assembly cheats will be put under: Section 1.7, “Baremetal setup”.

    +

    System-land assembly cheats will be put under: Section 1.8, “Baremetal setup”.

    @@ -27893,7 +27939,7 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.

    27. Baremetal

    27.1. Baremetal GDB step debug

    @@ -29795,7 +29841,7 @@ ISB

    In baremetal, we detect if tests failed by parsing logs for the Magic failure string.

    -

    See: Section 32.13, “Test this repo” for more useful testing tips.

    +

    See: Section 33.13, “Test this repo” for more useful testing tips.

    @@ -30276,6 +30322,7 @@ instructions 318486337 cmd ./run --arch arm --eval './linux/poweroff.out' time 6.62 exit_status 0 + cmd ./run --arch arm --eval './linux/poweroff.out' --trace exec_tb time 6.90 exit_status 0 @@ -30363,7 +30410,270 @@ instructions 124346081
    -

    29.2.2. Benchmark builds

    +

    29.2.2. Benchmark emulators on userland executables

    +
    +

    Let’s see how fast our simulators are running some well known or easy to understand userland benchmarks!

    +
    +
    +

    TODO: would be amazing to have an automated guest instructions per second count, but I’m not sure how to do that nicely for QEMU: QEMU get guest instruction count.

    +
    +
    +

    TODO: automate this further, produce the results table automatically, possibly by generalizing test-executables.

    +
    +
    +

    For now we can just run on gem5 to estimate the instruction count per input size and extrapolate?

    +
    +
    +

    For example, the simplest scalable CPU content would be a busy loop: userland/gcc/busy_loop.c, so let’s focus on that for now.

    +
    +
    +

    Summary of manually collected results on P51 at LKMC a18f28e263c91362519ef550150b5c9d75fa3679 + 1: Table 7, “Busy loop DMIPS for different simulator setups”. As expected, the less native / more detailed / more complex simulations are slower!

    +
    Table 7. Busy loop DMIPS for different simulator setups

    Simulator                                              | Loops | Time (s) | Instruction count    | Approximate MIPS
    qemu --arch aarch64                                    | 10^10 | 68       | 1.1 * 10^11 (approx) | 2000
    gem5 --arch aarch64                                    | 10^7  | 100      | 1.10018162 * 10^8    | 1
    gem5 --arch aarch64 -- --cpu-type MinorCPU --caches    | 10^6  | 31       | 1.1018152 * 10^7     | 0.4
    gem5 --arch aarch64 -- --cpu-type DerivO3CPU --caches  | 10^6  | 52       | 1.1018128 * 10^7     | 0.2

    +
    +

    The first step is to determine a number of loops that will run long enough to have meaningful results, but not so long that we will get bored.

    +
    +
    +

    On our P51 machine, we found 10^7 (10 million == 1000 times 10000) loops to be a good number:

    +
    +
    +
    +
    ./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --userland-args '1000 10000' --static
    +./get-stat sim_insts
    +
    +
    +
    +

    as it gives:

    +
    +
    +
      +
    • +

      time: 00:01:40

      +
    • +
    • +

      instructions: 110018162 ~ 110 million

      +
    • +
    +
    +
    +

    so ~ 110 million instructions / 100 seconds makes ~ 1 MIPS (million instructions per second).
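
The division above can be scripted when more runs need to be compared. This is only a sketch: mips is a hypothetical helper name that does not exist in this repository, and the two numbers are the ones reported for the gem5 run above:

```shell
# Hypothetical helper: millions of instructions per second.
# Usage: mips INSTRUCTION_COUNT SECONDS
mips() {
  # LC_ALL=C keeps "." as the decimal separator regardless of the host locale.
  LC_ALL=C awk -v insts="$1" -v secs="$2" \
    'BEGIN { printf "%.2f\n", insts / secs / 1e6 }'
}

# gem5 run above: 110018162 instructions in 100 seconds.
mips 110018162 100
```

For those inputs the helper prints 1.10, which matches the ~ 1 MIPS estimate.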

    +
    +
    +

    This experiment also suggests that each loop is about 11 instructions long (110M instructions / 10M loops), so we look at the disassembly:

    +
    +
    +
    +
    ./run-toolchain --arch aarch64 gdb -- -batch -ex 'disas busy_loop' "$(./getvar --arch aarch64 userland_build_dir)/gcc/busy_loop.out"
    +
    +
    +
    +

    which contains:

    +
    +
    +
    +
    8       ) {
    +   0x0000000000400698 <+0>:     ff 83 00 d1     sub     sp, sp, #0x20
    +   0x000000000040069c <+4>:     e0 07 00 f9     str     x0, [sp, #8]
    +   0x00000000004006a0 <+8>:     e1 03 00 f9     str     x1, [sp]
    +
    +9           for (unsigned i = 0; i < max; i++) {
    +   0x00000000004006a4 <+12>:    ff 1f 00 b9     str     wzr, [sp, #28]
    +   0x00000000004006a8 <+16>:    11 00 00 14     b       0x4006ec <busy_loop+84>
    +
    +10              for (unsigned j = 0; j < max2; j++) {
    +   0x00000000004006ac <+20>:    ff 1b 00 b9     str     wzr, [sp, #24]
    +   0x00000000004006b0 <+24>:    08 00 00 14     b       0x4006d0 <busy_loop+56>
    +
    +11                  __asm__ __volatile__ ("" : "+g" (j), "+g" (j) : :);
    +   0x00000000004006b4 <+28>:    e1 1b 40 b9     ldr     w1, [sp, #24]
    +   0x00000000004006b8 <+32>:    e0 1b 40 b9     ldr     w0, [sp, #24]
    +   0x00000000004006bc <+36>:    e1 1b 00 b9     str     w1, [sp, #24]
    +   0x00000000004006c0 <+40>:    e0 17 00 b9     str     w0, [sp, #20]
    +
    +10              for (unsigned j = 0; j < max2; j++) {
    +   0x00000000004006c4 <+44>:    e0 17 40 b9     ldr     w0, [sp, #20]
    +   0x00000000004006c8 <+48>:    00 04 00 11     add     w0, w0, #0x1
    +   0x00000000004006cc <+52>:    e0 1b 00 b9     str     w0, [sp, #24]
    +   0x00000000004006d0 <+56>:    e0 1b 40 b9     ldr     w0, [sp, #24]
    +   0x00000000004006d4 <+60>:    e1 03 40 f9     ldr     x1, [sp]
    +   0x00000000004006d8 <+64>:    3f 00 00 eb     cmp     x1, x0
    +   0x00000000004006dc <+68>:    c8 fe ff 54     b.hi    0x4006b4 <busy_loop+28>  // b.pmore
    +
    +9           for (unsigned i = 0; i < max; i++) {
    +   0x00000000004006e0 <+72>:    e0 1f 40 b9     ldr     w0, [sp, #28]
    +   0x00000000004006e4 <+76>:    00 04 00 11     add     w0, w0, #0x1
    +   0x00000000004006e8 <+80>:    e0 1f 00 b9     str     w0, [sp, #28]
    +   0x00000000004006ec <+84>:    e0 1f 40 b9     ldr     w0, [sp, #28]
    +   0x00000000004006f0 <+88>:    e1 07 40 f9     ldr     x1, [sp, #8]
    +   0x00000000004006f4 <+92>:    3f 00 00 eb     cmp     x1, x0
    +   0x00000000004006f8 <+96>:    a8 fd ff 54     b.hi    0x4006ac <busy_loop+20>  // b.pmore
    +
    +12              }
    +13          }
    +14      }
    +   0x00000000004006fc <+100>:   1f 20 03 d5     nop
    +   0x0000000000400700 <+104>:   ff 83 00 91     add     sp, sp, #0x20
    +   0x0000000000400704 <+108>:   c0 03 5f d6     ret
    +
    +
    +
    +

    We look for the internal backwards jumps, and we find two:

    +
    +
    +
    +
       0x00000000004006dc <+68>:    c8 fe ff 54     b.hi    0x4006b4 <busy_loop+28>  // b.pmore
    +   0x00000000004006f8 <+96>:    a8 fd ff 54     b.hi    0x4006ac <busy_loop+20>  // b.pmore
    +
    +
    +
    +

    The one at 0x4006dc comes first and jumps back to a larger address than the other one, so the inner loop must span 0x4006b4 to 0x4006dc inclusive, which contains exactly 11 instructions! Bingo!
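
The count of 11 can be double checked with shell arithmetic instead of counting lines by eye. A minimal sketch, assuming every instruction is 4 bytes wide as in the AArch64 listing above; count_instructions is a hypothetical helper name:

```shell
# Hypothetical helper: number of fixed-size instructions in [START, END] inclusive.
# Assumption: every instruction is 4 bytes wide, as in the AArch64 listing above.
count_instructions() {
  echo $(( ($2 - $1) / 4 + 1 ))
}

# Inner loop: from the backwards jump target to the jump itself.
count_instructions 0x4006b4 0x4006dc  # prints 11

# Cross-check against the measurement: instructions per inner loop iteration.
echo $(( 110018162 / 10000000 ))      # prints 11
```

Both numbers agree, which is what makes the extrapolation from loop count to instruction count reasonable for this program.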

    +
    +
    +

    Oh my God, unoptimized code is so horrendously inefficient, even I can’t stand all those useless loads and stores to memory variables!!!

    +
    +
    +

    Then for QEMU, we experimentally turn the number of loops up to 10^10 (100000 100000), which gives an expected 11 * 10^10 instructions, and the runtime is 00:01:08, so we have 1.1 * 10^11 instructions / 68 seconds ~ 1.6 * 10^9 instructions per second, which we round to 2000 MIPS!
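
Because QEMU does not give us an instruction count here, the estimate relies on the 11 instructions per inner loop iteration found above. A rough sketch of that extrapolation, with estimate_mips being a hypothetical helper name, and with the setup and teardown instructions outside the loop ignored:

```shell
# Hypothetical helper: extrapolate MIPS from a loop count, assuming a fixed
# number of instructions per inner loop iteration and ignoring everything else.
# Usage: estimate_mips LOOPS INSTRUCTIONS_PER_LOOP SECONDS
estimate_mips() {
  LC_ALL=C awk -v loops="$1" -v per_loop="$2" -v secs="$3" \
    'BEGIN { printf "%.0f\n", loops * per_loop / secs / 1e6 }'
}

# QEMU run above: 10^10 loops of 11 instructions in 68 seconds.
estimate_mips 10000000000 11 68
```

This prints 1618, so the 2000 MIPS figure is an order of magnitude rounding rather than a precise value.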

    +
    +
    +

    We can then repeat the experiment for other gem5 CPUs to see how they compare.

    +
    +
    +
    29.2.2.1. User mode vs full system benchmark
    +
    +

    Let’s see if user mode runs considerably faster than full system or not, ignoring the kernel boot.

    +
    +
    +

    First we build Dhrystone manually statically since dynamic linking is broken in gem5 as explained at: Section 10.6, “gem5 syscall emulation mode”.

    +
    +
    +

    TODO: move this section to our new custom dhrystone setup: Section 19.2.3.1, “Dhrystone”.

    +
    +
    +

    gem5 user mode:

    +
    +
    +
    +
    ./build-buildroot --arch arm --config 'BR2_PACKAGE_DHRYSTONE=y'
    +make \
    +  -B \
    +  -C "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2" \
    +  CC="$(./run-toolchain --arch arm --print-tool gcc)" \
    +  CFLAGS=-static \
    +;
    +time \
    +  ./run \
    +  --arch arm \
    +  --emulator gem5 \
    +  --userland "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" \
    +  --userland-args 'asdf qwer' \
    +;
    +
    +
    +
    +

    gem5 full system:

    +
    +
    +
    +
    time \
    +  ./run \
    +  --arch arm \
    +  --eval-after './gem5.sh' \
    +  --emulator gem5 \
    +  --gem5-readfile 'dhrystone 100000' \
    +;
    +
    +
    +
    +

    QEMU user mode:

    +
    +
    +
    +
    time qemu-arm "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" 100000000
    +
    +
    +
    +

    QEMU full system:

    +
    +
    +
    +
    time \
    +  ./run \
    +  --arch arm \
    +  --eval-after 'time dhrystone 100000000;./linux/poweroff.out' \
    +;
    +
    +
    +
    +

    Result on P51 at bad30f513c46c1b0995d3a10c0d9bc2a33dc4fa0:

    +
    +
    +
      +
    • +

      gem5 user: 33 seconds

      +
    • +
    • +

      gem5 full system: 51 seconds

      +
    • +
    • +

      QEMU user: 45 seconds

      +
    • +
    • +

      QEMU full system: 223 seconds

      +
    • +
    +
    +
    +
    +
    +

    29.2.3. Benchmark builds

    The build times are calculated after doing ./configure and make source, which downloads the sources, and basically benchmarks the Internet.

    @@ -30388,7 +30698,7 @@ cat ../linux-kernel-module-cheat-regression/*/build-time.log
    -
    29.2.2.1. Find which Buildroot packages are making the build slow and big
    +
    29.2.3.1. Find which Buildroot packages are making the build slow and big
    ./build-buildroot -- graph-build graph-size graph-depends
    @@ -30399,14 +30709,14 @@ xdg-open graph-size.pdf
    -
    29.2.2.2. Benchmark Buildroot build baseline
    +
    29.2.3.2. Benchmark Buildroot build baseline

    This is the minimal build we could expect to get away with.

    @@ -30474,7 +30784,7 @@ xdg-open graph-size.pdf
    -
    29.2.2.3. Benchmark gem5 build
    +
    29.2.3.3. Benchmark gem5 build

    How long it takes to build gem5 itself.

    @@ -30494,7 +30804,7 @@ tail -n+1 ../linux-kernel-module-cheat-regression/*/gem5-bench-build-*.txt
    -
    29.2.2.3.1. Benchmark gem5 single file change rebuild time
    +
    29.2.3.3.1. Benchmark gem5 single file change rebuild time

    This is the critical development parameter, and is dominated by the link time of huge binaries.

    @@ -30712,10 +31022,371 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32. About this repo

    +

    32. Computer architecture

    -

    32.1. Supported hosts

    +

    32.1. Cache coherence

    + +
    +

    Algorithms to keep the caches of different cores of a system coherent.

    +
    +
    +

    E.g.: if one processor writes to the cache, other processors have to know about it before they read from that address.

    +
    +
    +

    32.1.1. MSI protocol

    + +
    +

    This is the most basic non-trivial coherency protocol, and therefore the first one you should learn.

    +
    +
    +

    Helpful video: https://www.youtube.com/watch?v=gAUVAel-2Fg "MSI Coherence - Georgia Tech - HPCA: Part 5" by Udacity.

    +
    +
    +

    Let’s focus on a single cache line representing a given memory address.

    +
    +
    +

    The system looks like this:

    +
    +
    +
    +
    +----+
    +|DRAM|
    ++----+
    +^
    +|
    +v
    ++--------+
    +| BUS    |
    ++--------+
    +^        ^
    +|        |
    +v        v
    ++------+ +------+
    +|CACHE1| |CACHE2|
    ++------+ +------+
    +^        ^
    +|        |
    +|        |
    ++----+   +----+
    +|CPU1|   |CPU2|
    ++----+   +----+
    +
    +
    +
    +

    MSI is an acronym for the states that each cache can be in for a given cache line. The states are:

    +
    +
    +
      +
    • +

      Modified: a single cache has the valid data and it has been modified from DRAM.

      +
      +

      Both reads and writes are free, because we don’t have to worry about other processors.

      +
      +
    • +
    • +

      Shared: the data is synchronized with DRAM, and may be present in multiple caches.

      +
      +

      Reads are free, but writes need to do extra work.

      +
      +
      +

      This is the "most interesting" state of the protocol, as it allows for those free reads, even when multiple processors are using some address.

      +
      +
    • +
    • +

      Invalid: the cache does not have the data, so both CPU reads and writes need to do extra work

      +
    • +
    +
    +
    +

    The above allowed states can be summarized in the following table:

    +
    +
    +
    +
             CACHE1
    +         MSI
    +       M nny
    +CACHE2 S nyy
    +       I yyy
    +
    +
    +
    +
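The table above can also be read as a predicate on a pair of states. The following is only an illustrative sketch of that table, not code from this repository or from any simulator; msi_compatible is a hypothetical name:

```shell
# Hypothetical helper: succeeds if two caches may hold the same line in the
# given states (M, S or I) at the same time, following the table above.
msi_compatible() {
  case "$1$2" in
    MM|MS|SM) return 1 ;;     # a Modified line excludes any other valid copy
    [MSI][MSI]) return 0 ;;   # every other combination is allowed
    *) echo "unknown state: $1 $2" >&2; return 2 ;;
  esac
}

# Print the full table, one pair of states per line.
for a in M S I; do
  for b in M S I; do
    if msi_compatible "$a" "$b"; then echo "$a $b y"; else echo "$a $b n"; fi
  done
done
```

Only three of the nine combinations are rejected, all of which involve a Modified line next to another valid copy.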

    The whole goal of the protocol is to maintain that state at all times, so that we can get those free reads when in shared state!

    +
    +
    +

    To do so, the caches have to pass messages between themselves! This means generating bus traffic, which has a cost and must be kept to a minimum.

    +
    +
    +

    The system components can receive and send the following messages:

    +
    +
    +
      +
    • +

      CPUn can send to CACHEn:

      +
      +
        +
      • +

        "Local read": CPU reads from cache

        +
      • +
      • +

        "Local write": CPU writes to cache

        +
      • +
      +
      +
    • +
    • +

      CACHEn to itself:

      +
      +
        +
      • +

        "Evict": the cache is running out of space due to another request

        +
      • +
      +
      +
    • +
    • +

      CACHEn can send the following message to the bus.

      +
      +
        +
      • +

        "Bus read": the cache needs to get the data. The reply will contain the full data line. It can come either from another cache that has the data, or from DRAM if none do.

        +
      • +
      • +

        "Bus write": the cache wants to modify some data, and it does not have the line.

        +
        +

        The reply must contain the full data line, because maybe the processor just wants to change one byte, but the line is much larger.

        +
        +
        +

        That’s why this request can also be called "Read Exclusive", as it is basically a "Bus Read" + "Invalidate" in one message.

        +
        +
      • +
      • +

        "Invalidate": the cache wants to modify some data, but it knows that all other caches are up to date, because it is in shared state.

        +
        +

        Therefore, it does not need to fetch the data, which saves bus traffic compared to "Bus write" since the data itself does not need to be sent.

        +
        +
      • +
      • +

        "Write back": send the data on the bus and tell someone to pick it up: either DRAM or another cache

        +
      • +
      +
      +
    • +
    +
    +
    +
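The message list above can be sketched as a small enumeration. This is only an illustration: the identifiers `Message`, `SENDER` and `REPLY_HAS_LINE` are hypothetical and do not come from gem5 or any other simulator.

```python
from enum import Enum


class Message(Enum):
    LOCAL_READ = "Local read"
    LOCAL_WRITE = "Local write"
    EVICT = "Evict"
    BUS_READ = "Bus read"
    BUS_WRITE = "Bus write"
    INVALIDATE = "Invalidate"
    WRITE_BACK = "Write back"


# Who originates each message, following the list above.
SENDER = {
    Message.LOCAL_READ: "CPU",
    Message.LOCAL_WRITE: "CPU",
    Message.EVICT: "CACHE",       # the cache sends this to itself
    Message.BUS_READ: "CACHE",
    Message.BUS_WRITE: "CACHE",
    Message.INVALIDATE: "CACHE",
    Message.WRITE_BACK: "CACHE",
}

# Bus requests whose reply carries the full data line.
# "Invalidate" is cheaper precisely because it is not in this set.
REPLY_HAS_LINE = {Message.BUS_READ, Message.BUS_WRITE}


def goes_on_bus(message):
    """Return True for the messages that a cache puts on the bus."""
    return message in {
        Message.BUS_READ,
        Message.BUS_WRITE,
        Message.INVALIDATE,
        Message.WRITE_BACK,
    }
```

Keeping `REPLY_HAS_LINE` separate makes the bus traffic argument explicit: only "Bus read" and "Bus write" require a data reply.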

    When a message is sent to the bus:

    +
    +
    +
      +
    • +

      all other caches and the DRAM will see it; this is called "snooping"

      +
    • +
    • +

      either caches or DRAM can reply if a reply is needed, but other caches get priority to reply earlier if they can, e.g. to serve a cache request from other caches rather than going all the way to DRAM

      +
    • +
    +
    +
    +
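The reply priority described above can be sketched as follows. The function name `pick_responder` and the representation of caches as strings are assumptions made for this example; real hardware arbitrates on the bus rather than iterating over a list.

```python
def pick_responder(requester, holders):
    """Return the component that answers a bus request.

    `requester` is the name of the cache that sent the request.
    `holders` is a list of cache names that currently hold valid data
    for the line. Other caches get priority; DRAM is the fallback.
    """
    for cache in holders:
        if cache != requester:
            return cache
    return "DRAM"
```

With this sketch, `pick_responder("CACHE1", ["CACHE2"])` returns `"CACHE2"`, and `pick_responder("CACHE1", [])` falls back to `"DRAM"`.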

    When a cache receives a message, it does one or both of:

    +
    +
    +
      +
    • +

      change to another MSI state

      +
    • +
    • +

      send a message to the bus

      +
    • +
    +
    +
    +

    And finally, the transitions are:

    +
    +
    +
      +
    • +

      Modified:

      +
      +
        +
      • +

        "Local read": don’t need to do anything because only the current cache holds the data

        +
      • +
      • +

        "Local write": don’t need to do anything because only the current cache holds the data

        +
      • +
      • +

        "Evict": have to save data to DRAM so that our local modifications won’t be lost

        +
        +
          +
        • +

          Move to: Invalid

          +
        • +
        • +

          Send message: "Write back"

          +
        • +
        +
        +
      • +
      • +

        "Bus read": another cache is trying to read the address which we owned exclusively.

        +
        +

        Since we know what the latest data is, we can move to "Shared" rather than "Invalid" to possibly save time on future reads.

        +
        +
        +

        But to do that, we need to write the data back to DRAM to keep the shared state consistent. The MESI protocol avoids that extra write back in some cases.

        +
        +
        +

        The write back has to happen either before the other cache gets its data from DRAM or, better, the other cache can take its data directly from our write back, just as the DRAM does.

        +
        +
        +
          +
        • +

          Move to: Shared

          +
        • +
        • +

          Send message: "Write back"

          +
        • +
        +
        +
      • +
      • +

        "Bus write": someone else will write to our address.

        +
        +

        We don’t know what they will write, so the best bet is to move to invalid.

        +
        +
        +

        The writer can get the cache line directly from us without going to DRAM at all! This is fine because the writer becomes the new sole owner of the line, so DRAM can remain stale without problems.

        +
        +
        +
          +
        • +

          Move to: Invalid

          +
        • +
        • +

          Send message: "Write back"

          +
        • +
        +
        +
      • +
      +
      +
    • +
    • +

      Shared: TODO

      +
      +
        +
      • +

        "Local read":

        +
      • +
      • +

        "Local write":

        +
      • +
      • +

        "Evict":

        +
      • +
      • +

        "Bus read":

        +
      • +
      • +

        "Bus write":

        +
      • +
      +
      +
    • +
    • +

      Invalid: TODO

      +
      +
        +
      • +

        "Local read":

        +
      • +
      • +

        "Local write":

        +
      • +
      • +

        "Evict":

        +
      • +
      • +

        "Bus read":

        +
      • +
      • +

        "Bus write":

        +
      • +
      +
      +
    • +
    +
    +
    +
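As a sketch, the Modified transitions listed above fit in a small table. Only the Modified state is encoded, because the Shared and Invalid entries are still TODO in the text; the names `MODIFIED_TRANSITIONS` and `step_modified` are hypothetical.

```python
# event -> (next state, message to send or None), for a line in Modified.
# Copied from the "Modified" transitions above.
MODIFIED_TRANSITIONS = {
    "Local read": ("Modified", None),
    "Local write": ("Modified", None),
    "Evict": ("Invalid", "Write back"),
    "Bus read": ("Shared", "Write back"),
    "Bus write": ("Invalid", "Write back"),
}


def step_modified(event):
    """Return (next_state, message) for a Modified line receiving `event`.

    Raises KeyError for events that the list above does not cover.
    """
    return MODIFIED_TRANSITIONS[event]
```

For instance, `step_modified("Bus read")` gives `("Shared", "Write back")`: the line stays readable locally, at the cost of one write back.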

    TODO gem5 concrete example.

    +
    +
    +
    +

    32.1.2. MESI protocol

    + +
    +

    Splits the Shared of MSI protocol into a new Exclusive state:

    +
    +
    +
      +
    • +

      MESI Exclusive: clean but only present in one cache

      +
    • +
    • +

      MESI Shared: clean but may be present in more than one cache

      +
    • +
    +
    +
    +

    TODO advantage: I think the advantages over MSI are:

    +
    +
    +
      +
    • +

      when we move from Exclusive to Shared, no DRAM write back is needed, because we know that the cache is clean

      +
    • +
    • +

      when we move from Exclusive to Modified, no invalidate message is required, reducing bus traffic

      +
    • +
    +
    +
    +

    Exclusive is entered from Invalid after a "Local read", but only if the reply came from DRAM! If the reply came from another cache, we go directly to shared instead.
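A minimal sketch of the two MESI points above: which state a "Local read" from Invalid ends in, and which bus message a "Local write" needs from each clean state. The function names and the `"DRAM"` / `"cache"` labels are assumptions of this example.

```python
def state_after_local_read_from_invalid(reply_source):
    """Return the MESI state entered from Invalid after a "Local read".

    `reply_source` is "DRAM" if memory answered the bus read, or "cache"
    if another cache did.
    """
    if reply_source == "DRAM":
        return "Exclusive"  # nobody else has the line
    if reply_source == "cache":
        return "Shared"     # at least one other cache has it
    raise ValueError(f"unknown reply source: {reply_source!r}")


def message_for_local_write(state):
    """Return the bus message needed to move a clean line to Modified."""
    if state == "Exclusive":
        return None          # sole holder: no bus traffic needed
    if state == "Shared":
        return "Invalidate"  # other copies must be dropped first
    raise ValueError(f"not a clean state: {state!r}")
```

The second function shows the bus traffic saving over MSI: a write to an Exclusive line is silent, while the same write to a Shared line costs an "Invalidate".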

    +
    +
    +
    +

    32.1.3. MOSI protocol

    + +
    +

    TODO compare to MSI and understand advantages. From Wikipedia it seems that MOSI can get data from the Owned cache while MSI cannot get data from Shared caches and must go to memory, but why not? Why do we need that Owned state? Is it because there may be multiple Shared caches, and all of them replying at the same time would lead to problems?

    +
    +
    + +
    +
    +
    +
    +

    33. About this repo

    +
    +
    +

    33.1. Supported hosts

    The host requirements depend a lot on which examples you want to run.

    @@ -30764,9 +31435,9 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32.2. Common build issues

    +

    33.2. Common build issues

    -

    32.2.1. You must put some 'source' URIs in your sources.list

    +

    33.2.1. You must put some 'source' URIs in your sources.list

    If ./build --download-dependencies fails with:

    @@ -30780,7 +31451,7 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32.2.2. Build from downloaded source zip files

    +

    33.2.2. Build from downloaded source zip files

    It does not work if you just download the .zip with the sources for this repository from GitHub, because we use Git submodules: you must clone this repo instead.

    @@ -30790,7 +31461,7 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32.3. Run command after boot

    +

    33.3. Run command after boot

    If you just want to run a command after boot ends without thinking much about it, just use the --eval-after option, e.g.:

    @@ -30807,7 +31478,7 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32.4. Default command line arguments

    +

    33.4. Default command line arguments

    It gets annoying to retype --arch aarch64 for every single command, or to remember --config setups.

    @@ -30852,12 +31523,12 @@ west build -b qemu_aarch64 samples/hello_world
    -

    32.5. Documentation

    +

    33.5. Documentation

    -

    To learn how to build the documentation see: Section 1.8, “Build the documentation”.

    +

    To learn how to build the documentation see: Section 1.9, “Build the documentation”.

    -

    32.5.1. Documentation verification

    +

    33.5.1. Documentation verification

    When running build-doc, we do the following checks:

    @@ -30878,7 +31549,7 @@ west build -b qemu_aarch64 samples/hello_world

    The script prints what you have to fix and exits with an error status if there are any errors.

    - + @@ -30901,7 +31572,7 @@ west build -b qemu_aarch64 samples/hello_world
    -
    32.5.1.2. asciidoctor/extract-header-ids
    +
    33.5.1.2. asciidoctor/extract-header-ids

    Documentation for asciidoctor/extract-header-ids

    @@ -30946,7 +31617,7 @@ explicitly-given
    - +

    The Asciidoctor extension scripts:

    @@ -30974,7 +31645,7 @@ explicitly-given
    -

    32.6.1. GitHub pages

    +

    33.6.1. GitHub pages

    As mentioned before the TOC, we have to push this README to GitHub pages due to: https://github.com/isaacs/github/issues/1610

    @@ -31024,7 +31695,7 @@ explicitly-given
    -

    32.7. Clean the build

    +

    33.7. Clean the build

    You did something crazy, and nothing seems to work anymore?

    @@ -31088,7 +31759,7 @@ ls "$(./getvar buildroot_build_dir)"
    -

    32.8. ccache

    +

    33.8. ccache

    ccache might save you a lot of re-build when you decide to Clean the build or create a new build variant.

    @@ -31157,7 +31828,7 @@ export CCACHE_MAXSIZE="20G"
    -

    32.9. Rebuild Buildroot while running

    +

    33.9. Rebuild Buildroot while running

    It is not possible to rebuild the root filesystem while running QEMU because QEMU holds the qcow2 file:

    @@ -31168,7 +31839,7 @@ export CCACHE_MAXSIZE="20G"
    -

    32.10. Simultaneous runs

    +

    33.10. Simultaneous runs

    When doing long simulations sweeping across multiple system parameters, it becomes fundamental to do multiple simulations in parallel.

    @@ -31264,7 +31935,7 @@ less "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 termout_file)"
    -

    To run multiple gem5 checkouts, see: Section 32.11.3.1, “gem5 worktree”.

    +

    To run multiple gem5 checkouts, see: Section 33.11.3.1, “gem5 worktree”.

    Implementation note: we create multiple namespaces for two things:

    @@ -31303,7 +31974,7 @@ less "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 termout_file)"
    -

    32.11. Build variants

    +

    33.11. Build variants

    It often happens that you are comparing two versions of the build, a good and a bad one, and trying to figure out why the bad one is bad.

    @@ -31311,7 +31982,7 @@ less "$(./getvar --arch aarch64 --emulator gem5 --run-id 1 termout_file)"

    Our build variants system allows you to keep multiple built versions of all major components, so that you can easily switch between running one or the other.

    -

    32.11.1. Linux kernel build variants

    +

    33.11.1. Linux kernel build variants

    If you want to keep two builds around, one for the latest Linux version, and the other for Linux v4.16:

    @@ -31347,11 +32018,11 @@ git -C "$(./getvar linux_source_dir)" checkout -
    -

    To run both kernels simultaneously, one on each QEMU instance, see: Section 32.10, “Simultaneous runs”.

    +

    To run both kernels simultaneously, one on each QEMU instance, see: Section 33.10, “Simultaneous runs”.

    -

    32.11.2. QEMU build variants

    +

    33.11.2. QEMU build variants

    Analogous to the Linux kernel build variants but with the --qemu-build-id option instead:

    @@ -31367,7 +32038,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
    -

    32.11.3. gem5 build variants

    +

    33.11.3. gem5 build variants

    Analogous to the Linux kernel build variants but with the --gem5-build-id option instead:

    @@ -31398,7 +32069,7 @@ git -C "$(./getvar gem5_source_dir)" checkout some-branch

    Therefore, you must remember to check out the sources that correspond to the build before running, unless you explicitly tell gem5 to use a non-default source tree with gem5 worktree. This becomes inevitable when you want to launch multiple simultaneous runs at different checkouts.

    -
    32.11.3.1. gem5 worktree
    +
    33.11.3.1. gem5 worktree

    --gem5-build-id goes a long way, but if you want to seamlessly switch between two gem5 trees without checking out multiple times, then --gem5-worktree is for you.

    @@ -31451,7 +32122,7 @@ cd -
    -
    32.11.3.2. gem5 private source trees
    +
    33.11.3.2. gem5 private source trees

    Suppose that you are working on a private fork of gem5, but you want to use this repository to develop it as well.

    @@ -31495,7 +32166,7 @@ gem5_internal="$(pwd)/gem5-internal"
    -

    32.11.4. Buildroot build variants

    +

    33.11.4. Buildroot build variants

    Allows you to have multiple versions of the GCC toolchain or root filesystem.

    @@ -31515,9 +32186,9 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12. Directory structure

    +

    33.12. Directory structure

    -

    32.12.1. lkmc directory

    +

    33.12.1. lkmc directory

    lkmc/ contains sources and headers that are shared across kernel modules, userland and baremetal examples.

    @@ -31528,7 +32199,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -

    Another option would have been to name it as includes/lkmc, but that would make paths longer, and we might want to store source code in that directory as well in the future.

    -
    32.12.1.1. Userland objects vs header-only
    +
    33.12.1.1. Userland objects vs header-only

    When factoring out functionality across userland examples, there are two main options:

    @@ -31587,7 +32258,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.2. buildroot_packages directory

    +

    33.12.2. buildroot_packages directory

    @@ -31636,7 +32307,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -

    A custom build script can give you more flexibility: e.g. the package can be made to work with other root filesystems more easily, have better 9P support, and rebuild faster as it evades some Buildroot boilerplate.

    -
    32.12.2.1. kernel_modules buildroot package
    +
    33.12.2.1. kernel_modules buildroot package
    @@ -31683,9 +32354,9 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.3. patches directory

    +

    33.12.3. patches directory

    -
    32.12.3.1. patches/global directory
    +
    33.12.3.1. patches/global directory

    Has the following structure:

    @@ -31702,7 +32373,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -
    32.12.3.2. patches/manual directory
    +
    33.12.3.2. patches/manual directory

    Patches in this directory are never applied automatically: it is up to users to manually apply them before usage following the instructions in this documentation.

    @@ -31712,7 +32383,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.4. rootfs_overlay

    +

    33.12.4. rootfs_overlay

    Source: rootfs_overlay.

    @@ -31759,7 +32430,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -

    This way you can just hack away the scripts and try them out immediately without any further operations.

    -
    32.12.4.1. out_rootfs_overlay_dir
    +
    33.12.4.1. out_rootfs_overlay_dir

    This path can be found with:

    @@ -31793,7 +32464,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.5. lkmc.c

    +

    33.12.5. lkmc.c

    The files:

    @@ -31823,7 +32494,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.6. rand_check.out

    +

    33.12.6. rand_check.out

    Print out several parameters that normally change randomly from boot to boot:

    @@ -31850,7 +32521,7 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.12.7. lkmc_home

    +

    33.12.7. lkmc_home

    lkmc_home refers to the target base directory in which we put all our custom built stuff, such as userland executables and kernel modules.

    @@ -31884,9 +32555,9 @@ git -C "$(./getvar buildroot_source_dir)" checkout -
    -

    32.13. Test this repo

    +

    33.13. Test this repo

    -

    32.13.1. Automated tests

    +

    33.13.1. Automated tests

    Run almost all tests:

    @@ -31942,7 +32613,7 @@ echo $?

    test does not run all possible tests, because there are too many possible variations and that would take forever. The rationale is the same as for ./build all and is explained in ./build --help.

    -
    32.13.1.1. Test arch and emulator selection
    +
    33.13.1.1. Test arch and emulator selection

    You can select multiple archs and emulators of interest, as for any other command, with:

    @@ -31975,7 +32646,7 @@ echo $?
    -
    32.13.1.2. Quit on fail
    +
    33.13.1.2. Quit on fail

    By default, the tests continue running even after the first failure happens, and a summary is shown at the end.

    @@ -31989,7 +32660,7 @@ echo $?
    -
    32.13.1.3. Test userland in full system
    +
    33.13.1.3. Test userland in full system

    TODO: we really need a mechanism to generate the test list automatically, e.g. based on path_properties. Currently there are many tests missing, and we have to add everything manually, which is very annoying.

    @@ -32018,7 +32689,7 @@ echo $?
    -
    32.13.1.4. GDB tests
    +
    33.13.1.4. GDB tests

    We have some pexpect automated tests for GDB for both userland and baremetal programs!

    @@ -32091,7 +32762,7 @@ echo $?
    -
    32.13.1.5. Magic failure string
    +
    33.13.1.5. Magic failure string

    We do not know of any way to set the emulator exit status in QEMU arm full system.

    @@ -32194,9 +32865,9 @@ echo $?
    -

    32.13.2. Non-automated tests

    +

    33.13.2. Non-automated tests

    -
    32.13.2.1. Test GDB Linux kernel
    +
    33.13.2.1. Test GDB Linux kernel

    For the Linux kernel, do the following manual tests for now.

    @@ -32234,7 +32905,7 @@ echo $?
    -
    32.13.2.2. Test the Internet
    +
    33.13.2.2. Test the Internet

    You should also test that the Internet works:

    @@ -32245,7 +32916,7 @@ echo $?
    -
    32.13.2.3. CLI script tests
    +
    33.13.2.3. CLI script tests

    build-userland and test-executables have a wide variety of target selection modes, and it was hard to keep them all working without some tests:

    @@ -32263,7 +32934,7 @@ echo $?
    -

    32.14. Bisection

    +

    33.14. Bisection

    When updating the Linux kernel, QEMU and gem5, things sometimes break.

    @@ -32319,7 +32990,7 @@ git submodule update
    -

    32.15. path_properties

    +

    33.15. path_properties

    In order to build and run each userland and baremetal example properly, we need per-file metadata such as compiler flags and required number of cores.

    @@ -32362,7 +33033,7 @@ git submodule update
    -

    32.16. Update a forked submodule

    +

    33.16. Update a forked submodule

    This is a template update procedure for submodules for which we have some patches on top of mainline.

    @@ -32391,9 +33062,9 @@ git commit -m "linux: update to ${next_mainline_revision}"
    -

    32.17. Release

    +

    33.17. Release

    -

    32.17.1. Release procedure

    +

    33.17.1. Release procedure

    Ensure that the Automated tests are passing on a clean build:

    @@ -32404,7 +33075,7 @@ git commit -m "linux: update to ${next_mainline_revision}"
    -

    The ./build-test command builds a superset of what will be downloaded which also tests other things we would like to be working on the release. For the minimal build to generate the files to be uploaded, see: Section 32.17.2, “release-zip”

    +

    The ./build-test command builds a superset of what will be downloaded which also tests other things we would like to be working on the release. For the minimal build to generate the files to be uploaded, see: Section 33.17.2, “release-zip”

    The clean build is necessary as it generates clean images since it is not possible to remove Buildroot packages

    @@ -32474,7 +33145,7 @@ git push --follow-tags
    -

    32.17.2. release-zip

    +

    33.17.2. release-zip

    Create a zip containing all files required for Prebuilt setup:

    @@ -32499,7 +33170,7 @@ git push --follow-tags
    -

    32.17.3. release-upload

    +

    33.17.3. release-upload

    After:

    @@ -32547,9 +33218,9 @@ git push --follow-tags
    -

    32.18. Design rationale

    +

    33.18. Design rationale

    -

    32.18.1. Design goals

    +

    33.18.1. Design goals

    This project was created to help me understand, modify and test low level system components by using system simulators.

    @@ -32625,7 +33296,7 @@ git push --follow-tags
    -

    32.18.2. Setup trade-offs

    +

    33.18.2. Setup trade-offs

    The trade-offs between the different setups are basically a balance between:

    @@ -32650,13 +33321,13 @@ git push --follow-tags

    compatibility: how likely it is that all the components will work well together: emulator, compiler, kernel, standard library, …

  • -

    guest software availability: how wide is your choice of easily installed guest software packages? See also: Section 32.18.4, “Linux distro choice”

    +

    guest software availability: how wide is your choice of easily installed guest software packages? See also: Section 33.18.4, “Linux distro choice”

  • -

    32.18.3. Resource tradeoff guidelines

    +

    33.18.3. Resource tradeoff guidelines

    Choosing which features go into our default builds means making tradeoffs, here are our guidelines:

    @@ -32701,12 +33372,12 @@ git push --follow-tags
    -

    32.18.4. Linux distro choice

    +

    33.18.4. Linux distro choice

    -

    We haven’t found the ultimate distro yet, here is a summary table of trade-offs that we care about: Table 7, “Comparison of Linux distros for usage in this repository”.

    +

    We haven’t found the ultimate distro yet, here is a summary table of trade-offs that we care about: Table 8, “Comparison of Linux distros for usage in this repository”.

    - +@@ -32804,9 +33475,9 @@ git push --follow-tags
    -

    32.19. Soft topics

    +

    33.19. Soft topics

    -

    32.19.1. Fairy tale

    +

    33.19.1. Fairy tale

    @@ -32843,7 +33514,7 @@ git push --follow-tags
    -

    32.19.2. Should you waste your life with systems programming?

    +

    33.19.2. Should you waste your life with systems programming?

    Being the hardcore person who fully understands an important complex system such as a computer, it does have a nice ring to it doesn’t it?

    @@ -32928,7 +33599,7 @@ git push --follow-tags
    -

    32.20. Bibliography

    +

    33.20. Bibliography

    Runnable stuff:

    @@ -33008,5 +33679,14 @@ git push --follow-tags
    + \ No newline at end of file
    Table 7. Comparison of Linux distros for usage in this repository
    Table 8. Comparison of Linux distros for usage in this repository