diff --git a/bst-vs-heap.tmp.png b/bst-vs-heap.tmp.png
new file mode 100644
index 0000000..7ff52da
Binary files /dev/null and b/bst-vs-heap.tmp.png differ
diff --git a/bst_vs_heap_vs_hashmap.tmp.png b/bst_vs_heap_vs_hashmap.tmp.png
new file mode 100644
index 0000000..11b3d5c
Binary files /dev/null and b/bst_vs_heap_vs_hashmap.tmp.png differ
diff --git a/bst_vs_heap_vs_hashmap_gem5.tmp.png b/bst_vs_heap_vs_hashmap_gem5.tmp.png
new file mode 100644
index 0000000..2cc8ab6
Binary files /dev/null and b/bst_vs_heap_vs_hashmap_gem5.tmp.png differ
diff --git a/index.html b/index.html
index 226ab1b..23eec9a 100644
--- a/index.html
+++ b/index.html
@@ -1585,7 +1585,12 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 27. Benchmark this repo

    @@ -6017,7 +6032,7 @@ vim rootfs_overlay/etc/init.d/S99.gitignore

    6.4. Init environment

    @@ -8344,7 +8359,7 @@ xeyes
    -image
    +x11
    Figure 1. X11 Buildroot graphical user interface screenshot
    @@ -9052,7 +9067,7 @@ CONFIG_IKCONFIG_PROC=y
    -

    Just for fun https://stackoverflow.com/questions/14958192/how-to-get-the-config-from-a-linux-kernel-image/14958263#14958263:

    +

    Just for fun https://stackoverflow.com/questions/14958192/how-to-get-the-config-from-a-linux-kernel-image/14958263#14958263:
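For concreteness, the two approaches from that answer look roughly like this; the bzImage path and the presence of CONFIG_IKCONFIG=y / CONFIG_IKCONFIG_PROC=y are assumptions:

```shell
# Inside the guest, if the kernel was built with CONFIG_IKCONFIG_PROC=y,
# the running kernel exposes its own .config:
zcat /proc/config.gz | head -n 5

# From the host, the kernel source tree ships a helper that scans an
# image for the config embedded by CONFIG_IKCONFIG=y:
./scripts/extract-ikconfig arch/x86/boot/bzImage | head -n 5
```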

    @@ -9267,7 +9282,7 @@ CONFIG_IKCONFIG_PROC=y

    arm and aarch64 configs present in the official ARM gem5 Linux kernel fork: gem5 arm Linux kernel patches. Some of the configs present there are added by the patches.

  • -

    Jason’s magic x86_64 config: http://web.archive.org/web/20171229121642/http://www.lowepower.com/jason/files/config which is referenced at: http://web.archive.org/web/20171229121525/http://www.lowepower.com/jason/setting-up-gem5-full-system.html. QEMU boots with that by removing # CONFIG_VIRTIO_PCI is not set.

    +

    Jason’s magic x86_64 config: http://web.archive.org/web/20171229121642/http://www.lowepower.com/jason/files/config which is referenced at: http://web.archive.org/web/20171229121525/http://www.lowepower.com/jason/setting-up-gem5-full-system.html. QEMU boots with that by removing # CONFIG_VIRTIO_PCI is not set.

  • @@ -14105,7 +14120,7 @@ tty63::respawn:-/bin/sh
  • -

    /dev/ttyN for the other graphic TTYs. Note that there are only 63 available ones, from /dev/tty1 to /dev/tty63 (/dev/tty0 is the current one): https://superuser.com/questions/449781/why-is-there-so-many-linux-dev-tty. I think this is determined by:

    +

    /dev/ttyN for the other graphic TTYs. Note that there are only 63 available ones, from /dev/tty1 to /dev/tty63 (/dev/tty0 is the current one): https://superuser.com/questions/449781/why-is-there-so-many-linux-dev-tty. I think this is determined by:

    #define MAX_NR_CONSOLES 63
    @@ -14630,28 +14645,10 @@ ps

    We are trying to maintain a description of each at: https://unix.stackexchange.com/questions/5518/what-is-the-difference-between-the-following-kernel-makefile-terms-vmlinux-vml/482978#482978

    -

    QEMU does not seem able to boot ELF files like vmlinux, only objdump code: https://superuser.com/questions/1376944/can-qemu-boot-linux-from-vmlinux-instead-of-bzimage

    +

    QEMU does not seem able to boot ELF files like vmlinux: https://superuser.com/questions/1376944/can-qemu-boot-linux-from-vmlinux-instead-of-bzimage

    -

    Converting arch/* images to vmlinux is possible in x86 with extract-vmlinux. But for arm it fails with:

    -
    -
    -
    -
    run-detectors: unable to find an interpreter for
    -
    -
    -
    -

    as mentioned at:

    -
    -
    @@ -16290,7 +16287,7 @@ less "$(./getvar --arch x86_64 run_dir)/trace-lines.txt"

    TODO get working.

    -

    QEMU replays support checkpointing, and this allows for a simplistic "reverse debugging" implementation proposed at https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg00478.html on the unmerged https://github.com/ispras/qemu/tree/rr-180725:

    +

    QEMU replays support checkpointing, and this allows for a simplistic "reverse debugging" implementation proposed at https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg00478.html on the unmerged https://github.com/ispras/qemu/tree/rr-180725:

    @@ -16341,7 +16338,7 @@ reverse-continue

    17.8.6. gem5 tracing

    -

    gem5 provides also provides a tracing mechanism documented at: http://www.gem5.org/Trace_Based_Debugging:

    +

    gem5 also provides a tracing mechanism documented at: http://www.gem5.org/Trace_Based_Debugging:

    @@ -16432,7 +16429,7 @@ less "$(./getvar gem5_source_dir)/src/cpu/exetrace.cc"

    full instructions, as the first line. Only shown if the ExecMacro flag is given.

  • -

    micro ops that constitute the instruction, the lines that follow. Yes, aarch64 also has microops: https://superuser.com/questions/934752/do-arm-processors-like-cortex-a9-use-microcode/934755#934755. Only shown if the ExecMicro flag is given.

    +

    micro ops that constitute the instruction, the lines that follow. Yes, aarch64 also has microops: https://superuser.com/questions/934752/do-arm-processors-like-cortex-a9-use-microcode/934755#934755. Only shown if the ExecMicro flag is given.

  • @@ -16577,7 +16574,7 @@ root
    -

    First build Dhrystone into the root filesystem:

    +

    We will do that for various input parameters on full system by taking a checkpoint after a fast atomic CPU boot finishes, and then restoring in a more detailed mode and running the benchmark:

    -
    ./build-buildroot --config 'BR2_PACKAGE_DHRYSTONE=y'
    +
    ./build-buildroot --config 'BR2_PACKAGE_DHRYSTONE=y'
    +# Boot fast, take checkpoint, and exit.
    +./run --arch aarch64 --emulator gem5 --eval-after './gem5.sh'
    +
    +# Restore the checkpoint after boot, and benchmark with input 1000.
    +./run \
    +  --arch aarch64 \
    +  --emulator gem5 \
    +  --eval-after './gem5.sh' \
    +  --gem5-readfile 'm5 resetstats;dhrystone 1000;m5 dumpstats' \
    +  --gem5-restore 1 \
    +  -- \
    +  --cpu-type=HPI \
    +  --restore-with-cpu=HPI \
    +  --caches \
    +  --l2cache \
    +  --l1d_size=64kB \
    +  --l1i_size=64kB \
    +  --l2_size=256kB \
    +;
    +# Get the value for number of cycles.
    +# head because there are two lines: our dumpstats and the
    +# automatic dumpstats at the end which we don't care about.
    +./gem5-stat --arch aarch64 | head -n 1
    +
    +# Now for input 10000.
    +./run \
    +  --arch aarch64 \
    +  --emulator gem5 \
    +  --eval-after './gem5.sh' \
    +  --gem5-readfile 'm5 resetstats;dhrystone 10000;m5 dumpstats' \
    +  --gem5-restore 1 \
    +  -- \
    +  --cpu-type=HPI \
    +  --restore-with-cpu=HPI \
    +  --caches \
    +  --l2cache \
    +  --l1d_size=64kB \
    +  --l1i_size=64kB \
    +  --l2_size=256kB \
    +;
    +./gem5-stat --arch aarch64 | head -n 1
    -

    Then, a flexible setup is demonstrated at:

    +

    If you ever need a shell to quickly inspect the system state after boot, you can just use:

    +
    +
    +
    +
    ./run \
    +  --arch aarch64 \
    +  --emulator gem5 \
    +  --eval-after './gem5.sh' \
    +  --gem5-readfile 'sh' \
    +  --gem5-restore 1 \
    +
    +
    +
    +

    This procedure is further automated and DRYed up at:

    @@ -16669,14 +16720,14 @@ cat out/gem5-bench-dhrystone.txt

    Source: gem5-bench-dhrystone

    -

    Sample output:

    +

    Output at 2438410c25e200d9766c8c65773ee7469b599e4a + 1:

    n cycles
    -1000 12898577
    -10000 23441629
    -100000 128428617
    +1000 13665219
    +10000 20559002
    +100000 85977065
    @@ -16686,9 +16737,6 @@ cat out/gem5-bench-dhrystone.txt

    The gem5-stat commands output the approximate number of CPU cycles it took Dhrystone to run.
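As a rough sanity check on the sample output above, the marginal cost per Dhrystone iteration can be estimated from the n=1000 and n=100000 runs; the subtraction cancels the fixed boot and checkpoint-restore overhead:

```shell
# Marginal cycles per Dhrystone iteration between the n=1000 and
# n=100000 runs, using the sample cycle counts shown above.
echo $(( (85977065 - 13665219) / (100000 - 1000) ))
# → 730
```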

    -

    Another interesting example can be found at: gem5-bench-cache.

    -
    -

    A more naive and simpler to understand approach would be a direct:

    @@ -16697,7 +16745,7 @@ cat out/gem5-bench-dhrystone.txt
    -

    but the problem is that this method does not allow to easily run a different script without running the boot again, see: gem5 checkpoint restore and run a different script.

    +

    but the problem is that this method does not easily allow running a different script without running the boot again. The ./gem5.sh script works around that by using m5 readfile as explained further at: gem5 checkpoint restore and run a different script.

    Now you can play a fun little game with your friends:

    @@ -16868,7 +16916,7 @@ getconf _NPROCESSORS_CONF
    -

    Cache sizes can in theory be checked with the methods described at: https://superuser.com/questions/55776/finding-l2-cache-size-in-linux:

    +

    Cache sizes can in theory be checked with the methods described at: https://superuser.com/questions/55776/finding-l2-cache-size-in-linux:
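Concretely, the methods from that answer look like the following; which of them return data depends on the kernel configuration and architecture:

```shell
# sysconf values exposed by glibc:
getconf -a | grep -i cache

# sysfs, one index* directory per cache level/type:
cat /sys/devices/system/cpu/cpu0/cache/index*/size 2>/dev/null

# util-linux summary:
lscpu | grep -i cache
```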

    @@ -17788,24 +17836,24 @@ m5 checkpoint
    # Boot, checkpoint and exit.
    -printf 'echo "setup run";m5 exit' > "$(./getvar gem5_readfile)"
    +printf 'echo "setup run";m5 exit' > "$(./getvar gem5_readfile_file)"
     ./run --emulator gem5 --eval 'm5 checkpoint;m5 readfile > a.sh;sh a.sh'
     
     # Restore and run the first benchmark.
    -printf 'echo "first benchmark";m5 exit' > "$(./getvar gem5_readfile)"
    +printf 'echo "first benchmark";m5 exit' > "$(./getvar gem5_readfile_file)"
     ./run --emulator gem5 --gem5-restore 1
     
     # Restore and run the second benchmark.
    -printf 'echo "second benchmark";m5 exit' > "$(./getvar gem5_readfile)"
    +printf 'echo "second benchmark";m5 exit' > "$(./getvar gem5_readfile_file)"
     ./run --emulator gem5 --gem5-restore 1
     
     # If something weird happened, create an interactive shell to examine the system.
    -printf 'sh' > "$(./getvar gem5_readfile)"
    +printf 'sh' > "$(./getvar gem5_readfile_file)"
     ./run --emulator gem5 --gem5-restore 1
    -

    Since this is such a common setup, we provide some helpers for it as described at gem5 run benchmark:

    +

    Since this is such a common setup, we provide the following helpers for this operation:

      @@ -17818,6 +17866,9 @@ printf 'sh' > "$(./getvar gem5_readfile)"
    +

    Their usage is exemplified at gem5 run benchmark.

    +
    +

    Other loophole possibilities include:

    @@ -18094,7 +18145,7 @@ m5 writefile myfileguest myfilehost
    -
    date > "$(./getvar gem5_readfile)"
    +
    date > "$(./getvar gem5_readfile_file)"
    @@ -18147,7 +18198,7 @@ m5 writefile myfileguest myfilehost
    printf '#!/bin/sh
     echo asdf
    -' > "$(./getvar gem5_readfile)"
    +' > "$(./getvar gem5_readfile_file)"
    @@ -18386,7 +18437,7 @@ m5_fail(ints[1], ints[0]);

    The patches also add defconfigs that are known to work well with gem5.

    In order to use those patches and their associated configs, we recommend using Linux kernel build variants as:

    @@ -19121,7 +19172,7 @@ make menuconfig

    Once you’ve built a package in to the image, there is no easy way to remove it.

    Also mentioned at: https://stackoverflow.com/questions/47320800/how-to-clean-only-target-in-buildroot

    @@ -19172,7 +19223,7 @@ make menuconfig

    TODO benchmark: would gem5 suffer a considerable disk read performance hit due to decompressing SquashFS?

  • -

    libguestfs: https://serverfault.com/questions/246835/convert-directory-to-qemu-kvm-virtual-disk-image/916697#916697, in particular vfs-minimum-size

    +

    libguestfs: https://serverfault.com/questions/246835/convert-directory-to-qemu-kvm-virtual-disk-image/916697#916697, in particular vfs-minimum-size

  • use methods described at: gem5 checkpoint restore and run a different script instead of putting builds on the root filesystem

    @@ -24108,7 +24159,7 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.

    23.8.2. ARM official bibliography

    -

    The official manuals were stored in http://infocenter.arm.com but as of 2017 they started to slowly move to https://developer.arm.com.

    +

    The official manuals were stored in http://infocenter.arm.com but as of 2017 they started to slowly move to https://developer.arm.com.

    Each revision of a document has a "ARM DDI" unique document identifier.

    @@ -25266,10 +25317,10 @@ IN:
    -

    To wake up CPU 1 on QEMU, we must use the Power State Coordination Interface (PSCI) which is documented at: https://developer.arm.com/docs/den0022/latest/arm-power-state-coordination-interface-platform-design-document.

    +

    To wake up CPU 1 on QEMU, we must use the Power State Coordination Interface (PSCI) which is documented at: https://developer.arm.com/docs/den0022/latest/arm-power-state-coordination-interface-platform-design-document.

    -

    This interface uses HVC calls, and the calling convention is documented at "SMC CALLING CONVENTION" https://developer.arm.com/docs/den0028/latest.

    +

    This interface uses HVC calls, and the calling convention is documented at "SMC CALLING CONVENTION" https://developer.arm.com/docs/den0028/latest.

    If we boot the Linux kernel on QEMU and dump the auto-generated device tree, we observe that it contains the address of the PSCI CPU_ON call:
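One way to make such a dump is with QEMU's dumpdtb machine property, which writes the generated DTB to a file and exits; the machine, CPU, and file names below are just example choices, and dtc is assumed to be installed:

```shell
# Ask QEMU to write the auto-generated DTB and exit:
qemu-system-aarch64 -machine virt,dumpdtb=virt.dtb -cpu cortex-a57 -nographic

# Decompile and inspect the psci node, which names the cpu_on function ID:
dtc -I dtb -O dts virt.dtb | grep -B 2 -A 8 psci
```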

    @@ -26038,11 +26089,48 @@ cd -
  • -

    27.1. Travis

    +

    27.1. Continuous integration

    +
    +

    We have explored a few continuous integration solutions.

    +
    +
    +

    We haven’t set up any of them yet.

    +
    +
    +

    27.1.1. Travis

    We tried to automate it on Travis with .travis.yml but it hits the current 50 minute job timeout: https://travis-ci.org/cirosantilli/linux-kernel-module-cheat/builds/296454523 And I bet it would hit a disk maxout either way if it went on.

    +
    +

    27.1.2. CircleCI

    +
    +

    This setup successfully built gem5 on every commit: .circleci/config.yml

    +
    +
    +

    Enabling it is however blocked on: https://github.com/cirosantilli/linux-kernel-module-cheat/issues/79 so we disabled the builds on the web UI.

    +
    +
    +

    If that ever gets done, we will also need to:

    +
    +
    +
    +
    +
    +

    A build took about 1 hour of a core, and the free tier allows for 1000 minutes per month: https://circleci.com/pricing/ so about 17 hours. The cheapest non-free plan seems to be 50 dollars per month, which gets us unlimited build minutes and 2 containers, so we could scale things to run in under 24 hours.

    +
    +
    +

    There is no result reporting web UI however, but neither does GitLab CI have one: https://gitlab.com/gitlab-org/gitlab-ce/issues/17081

    +
    +
    +

    27.2. Benchmark this repo benchmarks

    @@ -26437,7 +26525,7 @@ tail -n+1 ../linux-kernel-module-cheat-regression/*/gem5-bench-build-*.txt
    -

    This is specially true for gem5, which runs much slower than QEMU, and cannot use multiple host cores to speed up the simulation: https://github.com/cirosantilli-work/gem5-issues/issues/15, so the only way to parallelize is to run multiple instances in parallel.

    +

    This is especially true for gem5, which runs much slower than QEMU and cannot use multiple host cores to speed up the simulation: https://github.com/cirosantilli-work/gem5-issues/issues/15, so the only way to parallelize is to run multiple instances in parallel.

    This also has a good synergy with Build variants.

    @@ -27917,7 +28005,7 @@ echo $?
  • the exit() baremetal function when status != 1.

    -

    Unfortunately the only way we found to set this up was with on_exit: https://github.com/cirosantilli/linux-kernel-module-cheat/issues/59.

    +

    Unfortunately the only way we found to set this up was with on_exit: https://github.com/cirosantilli/linux-kernel-module-cheat/issues/59.

    Trying to patch _exit directly fails since at that point some de-initialization has already happened which prevents the print.