From 838ca98a6cfb8a35d32fbb071ea6bd6813cc780e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?=
 =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?=
Date: Thu, 12 Dec 2019 00:00:00 +0000
Subject: [PATCH] ed5fa984c6226f81cb1a07f980d319ee9ee88e00

---
 index.html | 283 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 255 insertions(+), 28 deletions(-)

diff --git a/index.html b/index.html
index 9f05d46..a180df3 100644
--- a/index.html
+++ b/index.html
@@ -870,6 +870,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 15.7.2. Kernel oops
  • 15.7.3. dump_stack
  • 15.7.4. WARN_ON
  • +
  • 15.7.5. not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
  • 15.8. Pseudo filesystems @@ -1174,10 +1175,14 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 19.17. gem5 CPU types
  • 19.18. gem5 ARM platforms
  • @@ -1314,7 +1319,9 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 21.6. Interpreted languages
  • 21.7. Algorithms @@ -1717,8 +1724,9 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 27.8.3. ARM multicore
  • 27.8.4. ARM timer
  • @@ -11594,6 +11602,53 @@ insmod warn_on.ko

    Can also be activated with the panic_on_warn boot parameter.

    +
    +

    15.7.5. not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

    +
    +

    Let’s learn how to diagnose problems with the root filesystem not being found. TODO add a sample panic error message for each error type:

    +
    + +
    +

    This is the diagnosis procedure:

    +
    +
    +
      +
    • +

      does the filesystem appear on the list of filesystems? If not, then likely you are missing either:

      +
      +
        +
      • +

        the driver for that hardware type, e.g. hard drive / SSD type. Here, Linux does not know how to communicate with the hardware to get bytes from it at all. In simulation, the most important one that is often missing is virtio, which needs:

        +
        +
        +
        CONFIG_VIRTIO_PCI=y
        +CONFIG_VIRTIO_BLK=y
        +
        +
        +
      • +
      • +

        the driver for that filesystem type. Here, Linux can read bytes from the hardware, but cannot interpret them as a tree of files because it does not recognize the file system format. For example, to boot from SquashFS we would need:

        +
        +
        +
        CONFIG_SQUASHFS=y
        +
        +
        +
      • +
      +
      +
    • +
    • +

      if your filesystem of interest appears in the list, then you just need to set the root command line parameter to point to it, e.g. root=/dev/sda

      +
    • +
    +
    +
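The two driver checks from the procedure above can be approximated with a small shell helper. This is only a sketch: the function name `missing_builtin_options` is made up for illustration, and it assumes a plain-text kernel `.config` file. It reports `=m` options as missing too, on the assumption that no initramfs is in use, since a module cannot be loaded from a root filesystem that has not been mounted yet.

```shell
#!/bin/sh
# Hypothetical helper: print every given kernel config option that is
# not built in (set to "y") in the given kernel .config file.
missing_builtin_options() {
  config_file="$1"
  shift
  for opt in "$@"; do
    # -x matches the whole line, so "CONFIG_FOO=m" and
    # "# CONFIG_FOO is not set" both count as missing.
    if ! grep -qx -- "${opt}=y" "$config_file"; then
      printf '%s\n' "$opt"
    fi
  done
}
```

For example, `missing_builtin_options path/to/.config CONFIG_VIRTIO_PCI CONFIG_VIRTIO_BLK CONFIG_SQUASHFS` prints nothing if all three options are built in, where `path/to/.config` stands for wherever your kernel build keeps its configuration.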

    15.8. Pseudo filesystems

    @@ -19674,11 +19729,20 @@ clock=500
    -
    ./gem5-regression --arch aarch64 -- --length quick
    +
    ./build-gem5 --arch aarch64
    +./gem5-regression --arch aarch64 -- --length quick --length long
    -

    TODO skip the build by default with --skip-build since we already manage it with ./build-gem5. But we can’t do this because it is the build step that downloads the test binaries. We need to find a way to either download the binaries without building, or to pass the exact same scons build options through test/main.py.

    +

    After the first run has downloaded the test binaries for you, you can speed up the process a little bit by skipping a useless scons call:

    +
    +
    +
    +
    ./gem5-regression --arch aarch64 -- --length quick --length long --skip-build
    +
    +
    +
    +

    Note however that a run without --skip-build is required at least once per branch to download the test binaries, because the test interface only downloads them during the build step.
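One way to avoid remembering whether the first full run already happened on a branch is to track it with a stamp file. The sketch below is not part of LKMC: the helper names and the stamp directory are hypothetical, and it assumes that the first run on each branch must happen without --skip-build so that the test binaries get downloaded.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LKMC.

# Print "--skip-build" only if a stamp file says that a full run already
# completed on this branch. Print nothing otherwise.
skip_build_flag() {
  stamp_dir="$1"
  branch="$2"
  # Branch names may contain "/", which cannot appear inside a file name.
  stamp_file="${stamp_dir}/$(printf '%s' "$branch" | tr '/' '_').stamp"
  if [ -f "$stamp_file" ]; then
    printf '%s\n' '--skip-build'
  fi
}

# Record that a full run (without --skip-build) completed on this branch.
mark_binaries_downloaded() {
  stamp_dir="$1"
  branch="$2"
  mkdir -p "$stamp_dir"
  : > "${stamp_dir}/$(printf '%s' "$branch" | tr '/' '_').stamp"
}
```

A possible usage, with `.stamps` as an arbitrary directory name: run `./gem5-regression --arch aarch64 -- --length quick --length long $(skip_build_flag .stamps "$(git rev-parse --abbrev-ref HEAD)")`, and call `mark_binaries_downloaded .stamps "$(git rev-parse --abbrev-ref HEAD)"` after it succeeds. The command substitution is left unquoted on purpose so that an empty result adds no argument.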

    @@ -19977,10 +20041,31 @@ Indirect leak of 1346 byte(s) in 2 object(s) allocated from:

    In fs.py and se.py, those are selectable with the --cpu-type option.

    -

    TODO are there any public performance correlations between those models and real cores? The information to make accurate models isn’t generally public for non-free CPUs, so either you must either rely vendor provided models or on experiments/reverse engineering.

    +

    The information to make highly accurate models isn’t generally public for non-free CPUs, so you must rely either on vendor-provided models or on experiments/reverse engineering.

    +
    +
    +

    There is no simple answer for "what is the best CPU": in theory you have to understand each model and decide which one is closer to your target system.

    +
    +
    +

    Whenever possible, stick to:

    +
    +
    + +
    +
    +

    Both of those can be checked with git log and git blame.

    -

    19.17.1. gem5 BaseSimpleCPU

    +

    19.17.1. List gem5 CPU types

    +
    +
    19.17.1.1. gem5 BaseSimpleCPU

    Simple abstract CPU without a pipeline.

    @@ -20016,8 +20101,8 @@ Indirect leak of 1346 byte(s) in 2 object(s) allocated from:

    KVM CPUs are an alternative way of fast forwarding boot when they work.

    -
    -

    19.17.2. gem5 MinorCPU

    +
    +
    19.17.1.2. gem5 MinorCPU

    Generic in-order core that does not model any specific CPU.

    @@ -20073,11 +20158,20 @@ Indirect leak of 1346 byte(s) in 2 object(s) allocated from:

    Implemented by Pierre-Yves Péneau from LIRMM, which is a research lab in Montpellier, France, in 2017.

    +
  • +

    O3_ARM_v7a: implemented by Ronald Dreslinski from the University of Michigan in 2012

    +
    +

    Not sure why it has v7a in the name, since I believe the CPUs are just the microarchitectural implementation of any ISA, and the v8 hello world did run.

    +
    +
    +

    The CLI option is named slightly differently: --cpu-type O3_ARM_v7a_3.

    +
    +
  • -
    -

    19.17.3. gem5 DeriveO3CPU

    +
    +
    19.17.1.3. gem5 DeriveO3CPU

    Generic out-of-order core. "O3" Stands for "Out Of Order"!

    @@ -20102,8 +20196,9 @@ Indirect leak of 1346 byte(s) in 2 object(s) allocated from:
    +
    -

    19.17.4. gem5 ARM RSK

    +

    19.17.2. gem5 ARM RSK

    @@ -21475,7 +21570,7 @@ build/ARM/config/the_isa.hh

    the first build takes a while, but it is well worth it

  • -

    the selection of software packages is relatively limited if compared to Debian, e.g. no Java or Python package in guest out of the box.

    +

    the selection of software packages is relatively limited if compared to Debian.

    In theory, any software can be packaged, and the Buildroot side is easy.

    @@ -23193,15 +23288,98 @@ There are no non-locking atomic types or atomic primitives in POSIX:

    21.6. Interpreted languages

    -

    Maybe some day someone will use this setup to study the performance of interpreters:

    +

    Maybe some day someone will use this setup to study the performance of interpreters.

    -

    21.6.1. Node.js

    +

    21.6.1. Python

    -

    Parent section: Interpreted languages.

    +

    Build and install the interpreter on the target:

    +
    +
    +
    +
    ./build-buildroot --config 'BR2_PACKAGE_PYTHON3=y'
    +
    -

    Install the interpreter with:

    +

    Usage from guest in full system:

    +
    +
    +
    +
    ./run
    +
    +
    +
    +

    and then from there get an interactive shell with:

    +
    +
    +
    +
    python3
    +
    +
    +
    +

    or run an example with:

    +
    +
    +
    +
    python3 lkmc/python/hello.py
    +
    +
    +
    +

    User mode simulation interactive usage:

    +
    +
    +
    +
    ./run --userland "$(./getvar buildroot_target_dir)/usr/bin/python3"
    +
    +
    +
    +

    Non-interactive usage:

    +
    +
    +
    +
    ./run --userland "$(./getvar buildroot_target_dir)/usr/bin/python3" --userland-args rootfs_overlay/lkmc/python/hello.py
    +
    +
    +
    +

    LKMC 50ac89b779363774325c81157ec8b9a6bdb50a2f gem5 390a74f59934b85d91489f8a563450d8321b602d aarch64:

    +
    +
    +
    +
    ./run \
    +  --arch aarch64 \
    +  --emulator gem5 \
    +  --userland "$(./getvar \
    +  --arch aarch64 buildroot_target_dir)/usr/bin/python3" \
    +  --userland-args rootfs_overlay/lkmc/python/hello.py \
    +;
    +
    +
    +
    +

    fails with:

    +
    +
    +
    +
    fatal: syscall unused#278 (#278) unimplemented.
    +
    +
    +
    +

    which corresponds to the glorious getrandom syscall: https://github.com/torvalds/linux/blob/v4.17/include/uapi/asm-generic/unistd.h#L707
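The mapping from a syscall number to its name can also be looked up locally instead of through the GitHub link. This is a minimal sketch with a made-up function name, `syscall_name`, and it assumes the header uses `#define __NR_<name> <number>` lines, as include/uapi/asm-generic/unistd.h does:

```shell
#!/bin/sh
# Hypothetical helper: print the name of the syscall that has the given
# number, based on "#define __NR_<name> <number>" lines in a header file.
# Prints nothing if no such line exists.
syscall_name() {
  header="$1"
  number="$2"
  sed -n "s/^#define[[:space:]]\{1,\}__NR_\([A-Za-z0-9_]*\)[[:space:]]\{1,\}${number}[[:space:]]*\$/\1/p" "$header"
}
```

From the root of a Linux kernel checkout, this would be called as `syscall_name include/uapi/asm-generic/unistd.h 278`.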

    +
    +
    +

    Examples:

    +
    +
    + +
    +
    +
    +

    21.6.2. Node.js

    +
    +

    Build and install the interpreter with:

    @@ -23209,7 +23387,10 @@ There are no non-locking atomic types or atomic primitives in POSIX: -

    TODO: broken as of 3c3deb14dc8d6511680595dc42cb627d5781746d + 1:

    +

    Everything is then the same as the Python interpreter setup, except that the executable name is now node!

    +
    +
    +

    TODO: build broken as of LKMC 3c3deb14dc8d6511680595dc42cb627d5781746d + 1:

    @@ -23222,6 +23403,9 @@ There are no non-locking atomic types or atomic primitives in POSIX:

    21.7. Algorithms

    @@ -28793,12 +28992,15 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.

    24.7.1.1. ARM Large System Extensions (LSE)
    +

    Parent section: ARM multicore.

    +
    +

    ARMv8 architecture reference manual db "ARMv8.1-LSE, ARMv8.1 Large System Extensions"

    @@ -30279,7 +30481,7 @@ IN: main

    and power consumption is key in ARM applications.

    -

    SEV is not the only thing that can wake up a WFE, it is only an explicit software way to do it. Notably, global monitor operations on memory accesses of regions marked by LDREX and STREX instructions can also wake up a WFE sleeping core. This is done to allow spinlocks opens to automatically wake up WFE sleeping cores at free time without the need for a explicit SEV.

    +

    SEV is not the only thing that can wake up a WFE; it is only an explicit software way to do it. Notably, global monitor operations on memory accesses of regions marked by LDAXR and STLXR instructions can also wake up a WFE sleeping core. This is done so that releasing a spinlock automatically wakes up WFE sleeping cores when the lock is freed, without the need for an explicit SEV.

    WFE and SEV are usable from userland, and are part of an efficient spinlock implementation.

    @@ -30318,7 +30520,7 @@ IN: main
    -

    The recommended ARMv8 spinlock implementation is shown at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s03s02.html where WAIT_FOR_UPDATE is as explained in that section a macro that expands to WFE.

    +

    The recommended ARMv8 spinlock implementation is shown at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s03s02.html where WAIT_FOR_UPDATE is, as explained in that section, a macro that expands to WFE. TODO: SEV is used explicitly in those examples via SIGNAL_UPDATE; where is the example that shows how SEV can be eliminated due to implicit monitor signals?

    In QEMU 3.0.0, SEV is a NOP, and WFE might be, but I’m not sure, see: https://github.com/qemu/qemu/blob/v3.0.0/target/arm/translate-a64.c#L1423

    @@ -30372,9 +30574,31 @@ IN: main
  • +
    +

    The best article to understand spinlocks is: https://eli.thegreenplace.net/2018/basics-of-futexes/

    +
    -
    27.8.3.2. ARM PSCI
    +
    27.8.3.2. ARM LDAXR and STLXR instructions
    +
    +

    Can be used to implement atomic variables, see also:

    +
    + +
    +

    The ARMv7 analogues are LDREX and STREX.

    +
    +
    +
    +
    27.8.3.3. ARM PSCI

    In QEMU, CPU 1 starts in a halted state. This can be observed from GDB, where:

    @@ -30424,7 +30648,7 @@ IN: main
    -
    27.8.3.3. ARM DMB instruction
    +
    27.8.3.4. ARM DMB instruction

    TODO: create and study a minimal example in gem5 where the DMB instruction leads to fewer cycles: https://stackoverflow.com/questions/15491751/real-life-use-cases-of-barriers-dsb-dmb-isb-in-arm

    @@ -30974,7 +31198,7 @@ ISB

    28. Android

    -

    Remember: Android AOSP is a huge undocumented piece of bloatware. It’s integration into this repo will likely never be super good.

    +

    Remember: Android AOSP is a huge undocumented piece of bloatware. Its integration into this repo will likely never be super good. See also: https://cirosantilli.com#android

    Verbose setup description: https://stackoverflow.com/questions/1809774/how-to-compile-the-android-aosp-kernel-and-test-it-with-the-android-emulator/48310014#48310014

    @@ -34826,6 +35050,9 @@ git push --follow-tags
  • https://github.com/MichielDerhaeg/build-linux untested. Manually builds musl and BusyBox, no Buildroot. Seems to use host packaged toolchain and tested on x86_64 only. Might contain a minimized kernel config.

  • +
  • +

    https://eli.thegreenplace.net and the accompanying code: https://github.com/eliben/code-for-blog

    +