diff --git a/index.html b/index.html index abc34b7..baa3d59 100644 --- a/index.html +++ b/index.html @@ -685,6 +685,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 10.6.1. gem5 syscall emulation exit status
  • 10.6.2. gem5 syscall emulation mode program stdin
  • 10.6.3. User mode vs full system benchmark
  • +
  • 10.6.4. gem5 syscall emulation mode syscall tracing
  • 10.7. QEMU user mode quirks @@ -1172,17 +1173,20 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 18.15.1. gem5 debug build
  • 18.15.2. gem5 clang build
  • 18.15.3. gem5 sanitation build
  • +
  • 18.15.4. gem5 Ruby build
  • + + +
  • 18.16. gem5 ARM platforms
  • +
  • 18.17. gem5 internals +
  • 19. Buildroot
  • 21. Userland assembly @@ -3674,6 +3679,9 @@ echo "$(./getvar --arch aarch64 --emulator gem5 image)"
    +

    see also: Section 18.16, “gem5 ARM platforms”.

    +
    +

    This generates yet new separate images with new magic constants:

    @@ -3716,7 +3724,7 @@ echo "$(./getvar --arch aarch64 --baremetal userland/c/hello.c --emulator gem5 -
    -
    asciidotor README.adoc
    +
    asciidoctor README.adoc
     xdg-open README.html
    @@ -7597,6 +7605,53 @@ time \ +
    +

    10.6.4. gem5 syscall emulation mode syscall tracing

    +
    +

    Since gem5 has to implement syscalls itself in syscall emulation mode, it can see exactly which syscalls are being made, and we can log them for debugging with gem5 tracing, e.g.:

    +
    +
    +
    +
    ./run \
    +  --emulator gem5 \
    +  --static userland/arch/x86_64/freestanding/linux/hello.S \
    +  --userland  \
    +  --trace-stdout \
    +  --trace ExecAll,SyscallBase,SyscallVerbose \
    +;
    +
    +
    +
    +

    the trace as of f2eeceb1cde13a5ff740727526bf916b356cee38 + 1 contains:

    +
    +
    +
    +
          0: system.cpu A0 T0 : @asm_main_after_prologue    : mov   rdi, 0x1
    +      0: system.cpu A0 T0 : @asm_main_after_prologue.0  :   MOV_R_I : limm   rax, 0x1 : IntAlu :  D=0x0000000000000001  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   1000: system.cpu A0 T0 : @asm_main_after_prologue+7    : mov rdi, 0x1
    +   1000: system.cpu A0 T0 : @asm_main_after_prologue+7.0  :   MOV_R_I : limm   rdi, 0x1 : IntAlu :  D=0x0000000000000001  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   2000: system.cpu A0 T0 : @asm_main_after_prologue+14    : lea        rsi, DS:[rip + 0x19]
    +   2000: system.cpu A0 T0 : @asm_main_after_prologue+14.0  :   LEA_R_P : rdip   t7, %ctrl153,  : IntAlu :  D=0x000000000040008d  flags=(IsInteger|IsMicroop|IsDelayedCommit|IsFirstMicroop)
    +   2500: system.cpu A0 T0 : @asm_main_after_prologue+14.1  :   LEA_R_P : lea   rsi, DS:[t7 + 0x19] : IntAlu :  D=0x00000000004000a6  flags=(IsInteger|IsMicroop|IsLastMicroop)
    +   3500: system.cpu A0 T0 : @asm_main_after_prologue+21    : mov        rdi, 0x6
    +   3500: system.cpu A0 T0 : @asm_main_after_prologue+21.0  :   MOV_R_I : limm   rdx, 0x6 : IntAlu :  D=0x0000000000000006  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   4000: system.cpu: T0 : syscall write called w/arguments 1, 4194470, 6, 0, 0, 0
    +hello
    +   4000: system.cpu: T0 : syscall write returns 6
    +   4000: system.cpu A0 T0 : @asm_main_after_prologue+28    :   syscall    eax           : IntAlu :   flags=(IsInteger|IsSerializeAfter|IsNonSpeculative|IsSyscall)
    +   5000: system.cpu A0 T0 : @asm_main_after_prologue+30    : mov        rdi, 0x3c
    +   5000: system.cpu A0 T0 : @asm_main_after_prologue+30.0  :   MOV_R_I : limm   rax, 0x3c : IntAlu :  D=0x000000000000003c  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   6000: system.cpu A0 T0 : @asm_main_after_prologue+37    : mov        rdi, 0
    +   6000: system.cpu A0 T0 : @asm_main_after_prologue+37.0  :   MOV_R_I : limm   rdi, 0  : IntAlu :  D=0x0000000000000000  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   6500: system.cpu: T0 : syscall exit called w/arguments 0, 4194470, 6, 0, 0, 0
    +   6500: system.cpu: T0 : syscall exit returns 0
    +   6500: system.cpu A0 T0 : @asm_main_after_prologue+44    :   syscall    eax           : IntAlu :   flags=(IsInteger|IsSerializeAfter|IsNonSpeculative|IsSyscall)
    +
    +
    +
    +

    so we see that two syscall lines were added for each syscall, showing the syscall arguments and return value, just like a mini strace!
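Since the two extra lines per syscall have a regular shape, they are easy to post-process. The following is a minimal sketch, assuming the exact line format shown in the trace above; the function name `parse_syscalls` and the dictionary layout are hypothetical and not part of gem5 or of this repository:

```python
import re

# Patterns derived only from the SyscallVerbose lines shown in the trace above.
CALL_RE = re.compile(
    r"^\s*(\d+): (\S+): T(\d+) : syscall (\w+) called w/arguments (.*)$")
RET_RE = re.compile(
    r"^\s*(\d+): (\S+): T(\d+) : syscall (\w+) returns (-?\d+)\s*$")


def parse_syscalls(lines):
    """Pair each 'called' line with the following 'returns' line."""
    events = []
    pending = None
    for line in lines:
        m = CALL_RE.match(line)
        if m:
            args = [int(a) for a in re.split(r",\s*", m.group(5).strip())]
            pending = {"tick": int(m.group(1)), "name": m.group(4),
                       "args": args, "ret": None}
            events.append(pending)
            continue
        m = RET_RE.match(line)
        if m and pending is not None and pending["name"] == m.group(4):
            pending["ret"] = int(m.group(5))
            pending = None
        # Any other line (instruction trace, program stdout) is ignored.
    return events


# Sample input copied from the trace above.
sample = """\
   4000: system.cpu: T0 : syscall write called w/arguments 1, 4194470, 6, 0, 0, 0
hello
   4000: system.cpu: T0 : syscall write returns 6
   6500: system.cpu: T0 : syscall exit called w/arguments 0, 4194470, 6, 0, 0, 0
   6500: system.cpu: T0 : syscall exit returns 0
""".splitlines()

events = parse_syscalls(sample)
for event in events:
    print(event["tick"], event["name"], event["args"], "->", event["ret"])
```

The ExecAll lines are skipped because they have `system.cpu A0 T0` rather than `system.cpu: T0` after the tick.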

    +
    +

    10.7. QEMU user mode quirks

    @@ -16865,6 +16920,9 @@ less "$(./getvar gem5_source_dir)/src/cpu/exetrace.cc"
  • Registers

  • +
  • +

    SyscallBase, SyscallVerbose

    +
  • @@ -17269,7 +17327,7 @@ root
    +
    +

    18.15.4. gem5 Ruby build

    +
    +

    Ruby is a system that includes the SLICC domain specific language to describe memory systems: http://gem5.org/Ruby

    +
    +
    +

    It seems to have usage outside of gem5, but the naming overload with the Ruby programming language, which also has domain specific languages as a concept, makes it impossible to google anything about it!

    +
    +
    +

    Ruby is activated at compile time with the PROTOCOL flag, which specifies the desired memory system type.

    +
    +
    +

    For example, to use a two level MESI cache coherence protocol, we can do:

    +
    +
    +
    +
    ./build-gem5 --arch aarch64 --gem5-build-id ruby -- PROTOCOL=MESI_Two_Level
    +
    +
    +
    +

    and during build we see a humongous line of type:

    +
    +
    +
    +
    [   SLICC] src/mem/protocol/MESI_Two_Level.slicc -> ARM/mem/protocol/AccessPermission.cc, ARM/mem/protocol/AccessPermission.hh, ...
    +
    +
    +
    +

    which shows that dozens of C++ files are being generated from Ruby SLICC.

    +
    +
    +

    TODO observe it doing something during a run.

    +
    +
    +

    The relevant source files live in the source tree under:

    +
    +
    +
    +
    src/mem/protocol/MESI_Two_Level*
    +
    +
    +
    +

    We already pass the SLICC_HTML flag by default to the build, which generates an HTML summary of each memory protocol under:

    +
    +
    +
    +
    xdg-open "$(./getvar --arch aarch64 --gem5-build-id ruby gem5_build_build_dir)/ARM/mem/protocol/html/index.html"
    +
    +
    +
    +

    A minimized ruby config which was not merged upstream can be found for study at: https://gem5-review.googlesource.com/c/public/gem5/+/13599/1

    +
    +
    + +
    +

    18.16. gem5 ARM platforms

    +
    +

    The gem5 platform is selectable with the --machine option, which is named after the analogous QEMU -machine option, and which sets the --machine-type.

    +
    +
    +

    Each platform represents a different system with different devices, memory and interrupt setup.

    +
    +
    +

    TODO: describe the main characteristics of each platform, as of gem5 5e83d703522a71ec4f3eb61a01acd8c53f6f3860:

    +
    +
    + +
    +
    +
    +

    18.17. gem5 internals

    +
    +

    18.17.1. gem5 Python C++ interaction

    +
    +

    The interaction uses the Python C extension interface https://docs.python.org/2/extending/extending.html through the pybind11 helper library: https://github.com/pybind/pybind11

    +
    +
    +

    The C++ executable both:

    +
    +
    +
      +
    • +

      starts running the Python executable

      +
    • +
    • +

      provides Python classes written in C++ for that Python code to use

      +
    • +
    +
    +
    +

    An example of this can be found at:

    +
    + +
    +

    Then the gem5 magic SimObject class adds some crazy stuff on top of it… it is a mess. In particular, it auto generates params/ headers. TODO: why is this mess needed at all? pybind11 seems to handle constructor arguments just fine:

    +
    + +
    +

    Let’s study BadDevice for example:

    +
    +
    +

    src/dev/BadDevice.py defines devicename:

    +
    +
    +
    +
    class BadDevice(BasicPioDevice):
    +    type = 'BadDevice'
    +    cxx_header = "dev/baddev.hh"
    +    devicename = Param.String("Name of device to error on")
    +
    +
    +
    +

    The object is created in Python for example from src/dev/alpha/Tsunami.py as:

    +
    +
    +
    +
        fb = BadDevice(pio_addr=0x801fc0003d0, devicename='FrameBuffer')
    +
    +
    +
    +

    Since BadDevice has no __init__ method, and neither does BasicPioDevice, it all just falls through to the SimObject.__init__ constructor.

    +
    +
    +

    This constructor will loop through the inheritance chain and give the Python parameters to the C++ BadDeviceParams class as follows.

    +
    +
    +

    The auto-generated build/ARM/params/BadDevice.hh file defines BadDeviceParams in C++:

    +
    +
    +
    +
    #ifndef __PARAMS__BadDevice__
    +#define __PARAMS__BadDevice__
    +
    +class BadDevice;
    +
    +#include <cstddef>
    +#include <string>
    +
    +#include "params/BasicPioDevice.hh"
    +
    +struct BadDeviceParams
    +    : public BasicPioDeviceParams
    +{
    +    BadDevice * create();
    +    std::string devicename;
    +};
    +
    +#endif // __PARAMS__BadDevice__
    +
    +
    +
    +

    and ./python/_m5/param_BadDevice.cc exposes the params class to Python from C++ with pybind11:

    +
    +
    +
    +
    namespace py = pybind11;
    +
    +static void
    +module_init(py::module &m_internal)
    +{
    +    py::module m = m_internal.def_submodule("param_BadDevice");
    +    py::class_<BadDeviceParams, BasicPioDeviceParams, std::unique_ptr<BadDeviceParams, py::nodelete>>(m, "BadDeviceParams")
    +        .def(py::init<>())
    +        .def("create", &BadDeviceParams::create)
    +        .def_readwrite("devicename", &BadDeviceParams::devicename)
    +        ;
    +
    +    py::class_<BadDevice, BasicPioDevice, std::unique_ptr<BadDevice, py::nodelete>>(m, "BadDevice")
    +        ;
    +
    +}
    +
    +static EmbeddedPyBind embed_obj("BadDevice", module_init, "BasicPioDevice");
    +
    +
    +
    +

    src/dev/baddev.hh then takes the parameters in the constructor:

    +
    +
    +
    +
    class BadDevice : public BasicPioDevice
    +{
    +  private:
    +    std::string devname;
    +
    +  public:
    +    typedef BadDeviceParams Params;
    +
    +  protected:
    +    const Params *
    +    params() const
    +    {
    +        return dynamic_cast<const Params *>(_params);
    +    }
    +
    +  public:
    +     /**
    +      * Constructor for the Baddev Class.
    +      * @param p object parameters
    +      * @param a base address of the write
    +      */
    +    BadDevice(Params *p);
    +
    +
    +
    +

    src/dev/baddev.cc then uses the parameter:

    +
    +
    +
    +
    BadDevice::BadDevice(Params *p)
    +    : BasicPioDevice(p, 0x10), devname(p->devicename)
    +{
    +}
    +
    +
    +
    +

    Tested on gem5 08c79a194d1a3430801c04f37d13216cc9ec1da3.
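The fall-through to the base constructor and the walk over the inheritance chain can be mimicked in plain Python. This is only a toy model under loud assumptions: `ToyParam`, `ToySimObject`, `ToyBasicPioDevice` and `ToyBadDevice` are hypothetical names made up for illustration, and the real gem5 SimObject machinery is much more involved:

```python
class ToyParam:
    """Hypothetical stand-in for a gem5 Param.* declaration."""

    def __init__(self, desc):
        self.desc = desc


class ToySimObject:
    def __init__(self, **kwargs):
        # Walk the inheritance chain, base classes first, and collect
        # every parameter declared as a class attribute.
        declared = {}
        for cls in reversed(type(self).__mro__):
            for name, value in vars(cls).items():
                if isinstance(value, ToyParam):
                    declared[name] = value
        unknown = sorted(set(kwargs) - set(declared))
        if unknown:
            raise TypeError("unknown parameters: " + ", ".join(unknown))
        # In gem5 these values would end up in the C++ *Params struct;
        # here we just keep them in a dict.
        self.params = dict(kwargs)


class ToyBasicPioDevice(ToySimObject):
    # No __init__ here: construction falls through to ToySimObject.
    pio_addr = ToyParam("Device address")


class ToyBadDevice(ToyBasicPioDevice):
    # No __init__ here either.
    devicename = ToyParam("Name of device to error on")


fb = ToyBadDevice(pio_addr=0x801fc0003d0, devicename="FrameBuffer")
print(fb.params)
```

Neither subclass defines `__init__`, so both keyword arguments reach the base class constructor, which finds `pio_addr` and `devicename` by inspecting the class hierarchy.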

    +
    +
    @@ -19916,59 +20224,6 @@ qemu-system-aarch64 -M virt -cpu cortex-a57 -nographic -smp 1 -kernel output/ima -
    -

    19.1.1. gem5 Ruby build

    -
    -

    Ruby is a system that includes the SLICC domain specific language to describe memory systems: http://gem5.org/Ruby

    -
    -
    -

    It seems to have usage outside of gem5, but the naming overload with the Ruby programming language, which also has domain specific languages as a concept, makes it impossible to google anything about it!

    -
    -
    -

    Ruby is activated at compile time with the PROTOCOL flag, which specifies the desired memory system time.

    -
    -
    -

    For example, to use a two level MESI cache coherence protocol, we can do:

    -
    -
    -
    -
    ./build-gem5 --arch aarch64 --gem5-build-id ruby -- PROTOCOL=MESI_Two_Level
    -
    -
    -
    -

    and during build we see a humongous line of type:

    -
    -
    -
    -
    [   SLICC] src/mem/protocol/MESI_Two_Level.slicc -> ARM/mem/protocol/AccessPermission.cc, ARM/mem/protocol/AccessPermission.hh, ...
    -
    -
    -
    -

    which shows that dozens of C++ files are being generated from Ruby SLICC.

    -
    -
    -

    TODO observe it doing something during a run.

    -
    -
    -

    The relevant source files live in the source tree under:

    -
    -
    -
    -
    src/mem/protocol/MESI_Two_Level*
    -
    -
    -
    -

    We already pass the SLICC_HTML flag by default to the build, which generates an HTML summary of each memory protocol under:

    -
    -
    -
    -
    xdg-open "$(./getvar --arch aarch64 --gem5-build-id ruby gem5_build_build_dir)/ARM/mem/protocol/html/index.html"
    -
    -
    -
    -

    A minimized ruby config which was not merged upstream can be found for study at: https://gem5-review.googlesource.com/c/public/gem5/+/13599/1

    -
    -

    19.2. Custom Buildroot configs

    @@ -20610,6 +20865,9 @@ git -C "$(./getvar qemu_source_dir)" checkout -
    +

    malloc returns NULL, and mmap goes a bit further and segfaults on the first assignment array[0] = 1.

    +
    +

    Bibliography: https://stackoverflow.com/questions/2798330/maximum-memory-which-malloc-can-allocate

    @@ -20902,6 +21160,16 @@ git -C "$(./getvar qemu_source_dir)" checkout - +
    +

    20.5. Userland content bibliography

    +
    + +
    +
    @@ -25306,7 +25574,25 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.

    -

    Notice how Sn is very different between v7 and v8! In v7 it goes across Dn, and in v8 inside each Dn.

    +

    Notice how Sn is very different between v7 ARM VFP registers and v8! In v7 it goes across Dn, and in v8 inside each Dn:

    +
    +
    +
    +
    128                         64                  32      16  8   0
    ++---------------------------+-------------------+-------+---+---+
    +|                           Vn                                  |
    ++---------------------------------------------------------------+
    +|                           Qn                                  |
    ++---------------------------+-----------------------------------+
    +                            |                   Dn              |
    +                            +-----------------------------------+
    +                                                |       Sn      |
    +                                                +---------------+
    +                                                        |   Hn  |
    +                                                        +-------+
    +                                                            |Bn |
    +                                                            +---+
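The aliasing in the diagram can also be written as arithmetic. The sketch below is an illustration only: the helper names `v8_view` and `v7_s_location` are hypothetical. It models an ARMv8 Vn register as a 128-bit integer whose narrower views are its low bits, and the ARMv7 VFP layout where two consecutive Sn registers share one Dn:

```python
# Width in bits of each ARMv8 view of a Vn register, as in the diagram.
WIDTH_BITS = {"Q": 128, "D": 64, "S": 32, "H": 16, "B": 8}


def v8_view(v_value, prefix):
    """ARMv8: Qn, Dn, Sn, Hn and Bn are the low bits of the same Vn."""
    return v_value & ((1 << WIDTH_BITS[prefix]) - 1)


def v7_s_location(n):
    """ARMv7 VFP: Sn lives in D(n // 2), low half if n is even, high if odd.

    Returns (D register index, bit offset inside that D register).
    Only meaningful for S0..S31.
    """
    return n // 2, 32 * (n % 2)


# A recognizable 128-bit pattern: byte i holds the value i.
v0 = int.from_bytes(bytes(range(16)), "little")
for prefix in "QDSHB":
    print(prefix + "0", hex(v8_view(v0, prefix)))
for n in range(4):
    print("S%d" % n, "-> D%d at bit offset %d" % v7_s_location(n))
```

So in v8, S1 is the low 32 bits of V1 and has nothing to do with D0, while in v7 S1 is the high half of D0.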
    +
    23.6.3.1. ARMv8 aarch64 add vector instruction
    @@ -25852,18 +26138,37 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.

    It is documented at: https://developer.arm.com/docs/100863/latest/introduction

    -

    For example, the following code makes QEMU exit:

    +

    For example, all of the following programs make QEMU exit:

    -
    ./run --arch arm --baremetal baremetal/arch/arm/semihost_exit.S
    +
    ./run --arch arm --baremetal baremetal/arch/arm/semihost_exit.S
    +./run --arch arm --baremetal baremetal/arch/arm/no_bootloader/semihost_exit.S
    +./run --arch aarch64 --baremetal baremetal/arch/aarch64/semihost_exit.S
    +./run --arch aarch64 --baremetal baremetal/arch/aarch64/no_bootloader/semihost_exit.S
    -

    Source: baremetal/arch/arm/no_bootloader/semihost_exit.S

    +

    Sources:

    +
    +
    +
    -

    That program program contains the code:

    +

    That arm program contains the code:

    @@ -26843,12 +27148,20 @@ IN: main

    26.8.3. ARM multicore

    +
    +

    Examples:

    +
    -
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.S --cpus 2
    -./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.S --cpus 2 --emulator gem5
    -./run --arch arm --baremetal baremetal/arch/aarch64/multicore.S --cpus 2
    -./run --arch arm --baremetal baremetal/arch/aarch64/multicore.S --cpus 2 --emulator gem5
    +
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/no_bootloader/multicore_asm.S --cpus 2
    +./run --arch aarch64 --baremetal baremetal/arch/aarch64/no_bootloader/multicore_asm.S --cpus 2 --emulator gem5
    +./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.c --cpus 2
    +./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.c --cpus 2 --emulator gem5
    +./run --arch arm --baremetal baremetal/arch/arm/no_bootloader/multicore_asm.S --cpus 2
    +./run --arch arm --baremetal baremetal/arch/arm/no_bootloader/multicore_asm.S --cpus 2 --emulator gem5
    +# TODO not working, hangs.
    +# ./run --arch arm --baremetal baremetal/arch/arm/multicore.c --cpus 2
    +./run --arch arm --baremetal baremetal/arch/arm/multicore.c --cpus 2 --emulator gem5
    -
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.S --cpus 1
    +
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.c --cpus 1
    @@ -26886,7 +27205,7 @@ IN: main
    -
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.S --cpus 1 --emulator gem5
    +
    ./run --arch aarch64 --baremetal baremetal/arch/aarch64/multicore.c --cpus 1 --emulator gem5