diff --git a/index.html b/index.html index 88c6faa..33348d9 100644 --- a/index.html +++ b/index.html @@ -1166,11 +1166,12 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 18.12. gem5 Python scripts without rebuild
  • 18.13. gem5 fs_bigLITTLE
  • 18.14. gem5 unit tests
  • -
  • 18.15. gem5 build options +
  • 18.15. gem5 simulate() limit reached
  • +
  • 18.16. gem5 build options
  • @@ -13917,7 +13918,15 @@ sendkey shift-pgdown
    15.18.2.1. Ctrl Alt Del
    -

    Run /sbin/reboot on guest:

    +

    If you run in QEMU graphic mode:

    +
    +
    +
    +
    ./run --graphic
    +
    +
    +
    +

    and then from the graphic window you enter the keys:

    @@ -13925,7 +13934,15 @@ sendkey shift-pgdown
    -

    Enabled from our rootfs_overlay/etc/inittab:

    +

    then this runs the following command on the guest:

    +
    +
    +
    +
    /sbin/reboot
    +
    +
    +
    +

    This is enabled from our rootfs_overlay/etc/inittab:

    @@ -13933,60 +13950,10 @@ sendkey shift-pgdown
    -

    Linux tries to reboot, and QEMU shutdowns due to the -no-reboot option which we set by default for, see: Section 15.7.1.3, “Exit emulator on panic”.

    +

    This leads Linux to try to reboot, and QEMU shuts down due to the -no-reboot option, which we set by default, see: Section 15.7.1.3, “Exit emulator on panic”.

    -

    Under the hood, behaviour is controlled by the reboot syscall:

    -
    -
    -
    -
    man 2 reboot
    -
    -
    -
    -

    reboot calls can set either of the these behaviours for Ctrl-Alt-Del:

    -
    -
    - -
    -
    -

    Minimal example:

    +

    Here is a minimal example of Ctrl Alt Del:

    @@ -14038,6 +14005,121 @@ to decide what to do with it.
    +

    Under the hood, behaviour is controlled by the reboot syscall:

    +
    +
    +
    +
    man 2 reboot
    +
    +
    +
    +

    reboot system calls can set either of these behaviours for Ctrl-Alt-Del:

    +
    +
    + +
    +
    +

    When BusyBox init receives the signal, it prints the following lines:

    +
    +
    +
    +
    The system is going down NOW!
    +Sent SIGTERM to all processes
    +Sent SIGKILL to all processes
    +Requesting system reboot
    +
    +
    +
    +

    In busybox-1.29.2’s init, at init/init.c, we see how the kill signals are sent:

    +
    +
    +
    +
    static void run_shutdown_and_kill_processes(void)
    +{
    +	/* Run everything to be run at "shutdown".  This is done _prior_
    +	 * to killing everything, in case people wish to use scripts to
    +	 * shut things down gracefully... */
    +	run_actions(SHUTDOWN);
    +
    +	message(L_CONSOLE | L_LOG, "The system is going down NOW!");
    +
    +	/* Send signals to every process _except_ pid 1 */
    +	kill(-1, SIGTERM);
    +	message(L_CONSOLE, "Sent SIG%s to all processes", "TERM");
    +	sync();
    +	sleep(1);
    +
    +	kill(-1, SIGKILL);
    +	message(L_CONSOLE, "Sent SIG%s to all processes", "KILL");
    +	sync();
    +	/*sleep(1); - callers take care about making a pause */
    +}
    +
    +
    +
    +

    and run_shutdown_and_kill_processes is called from:

    +
    +
    +
    +
    /* The SIGPWR/SIGUSR[12]/SIGTERM handler */
    +static void halt_reboot_pwoff(int sig) NORETURN;
    +static void halt_reboot_pwoff(int sig)
    +
    +
    +
    +

    which also prints the final line:

    +
    +
    +
    +
    	message(L_CONSOLE, "Requesting system %s", m);
    +
    +
    +
    +

    which is set as the signal handler via TODO.
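The SIGTERM, pause, SIGKILL sequence quoted from init/init.c can be tried out safely on a single child process. The following is only a sketch of the pattern, not BusyBox code: it targets one hypothetical child rather than every process as kill(-1, ...) does, and the helper name term_then_kill is made up for this illustration. It assumes a POSIX host.

```python
import signal
import subprocess
import sys
import time


def term_then_kill(proc, grace_seconds=1.0):
    """Send SIGTERM, pause, then send SIGKILL if the process is still alive."""
    proc.send_signal(signal.SIGTERM)
    time.sleep(grace_seconds)
    if proc.poll() is None:
        # Still running after the grace period: SIGKILL cannot be caught or ignored.
        proc.kill()
    return proc.wait()


# Hypothetical victim: a child Python process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
status = term_then_kill(child, grace_seconds=0.5)
# On POSIX, a negative return code is the number of the signal that ended the child.
print(status == -signal.SIGTERM)
```

Unlike the BusyBox function, this sketch skips the SIGKILL when the child has already exited, since signalling a reaped child would serve no purpose.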

    +
    +

    Bibliography:

    @@ -16646,6 +16728,14 @@ less "$(./getvar --arch aarch64 run_dir)/trace-lines.txt"
  • T0: thread number. TODO: hyperthread? How to play with it?

    +
    +

    config.ini has --param 'system.multi_thread = True' --param 'system.cpu[0].numThreads = 2', but in ARM multicore the first one alone does not produce T1, and with the second one simulation blows up with:

    +
    +
    +
    +
    fatal: fatal condition interrupts.size() != numThreads occurred: CPU system.cpu has 1 interrupt controllers, but is expecting one per thread (2)
    +
    +
  • @start_kernel: we are in the start_kernel function. Awesome feature! Implemented with libelf https://sourceforge.net/projects/elftoolchain/ copy pasted in-tree ext/libelf. To get raw addresses, remove the ExecSymbol, which is enabled by Exec. This can be done with Exec,-ExecSymbol.

    @@ -17242,6 +17332,9 @@ ps Haux | grep qemu | wc +
  • 18.2.2.1.4. gem5 ARM full system with more than 8 cores
    @@ -19289,12 +19382,126 @@ clock=500
    -

    18.15. gem5 build options

    +

    18.15. gem5 simulate() limit reached

    +
    +

    This error happens when the following instruction limits are reached:

    +
    +
    +
    +
    system.cpu[0].max_insts_all_threads
    +system.cpu[0].max_insts_any_thread
    +
    +
    +
    +

    If the parameter is not set, it defaults to 0, which is magic and means the huge maximum value of uint64_t: 0xFFFFFFFFFFFFFFFF, which in practice would require a very long simulation if at least one CPU were live.
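To get a feel for "a very long simulation", here is a rough back-of-the-envelope estimate. The rate of one billion simulated instructions per host second is an assumption chosen purely for illustration, and is likely far more than gem5 achieves in practice, so the real figure would be even larger.

```python
UINT64_MAX = 0xFFFFFFFFFFFFFFFF

# Assumption for illustration only: the simulator retires one billion
# instructions per second of host time.
ASSUMED_INSTS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

seconds = UINT64_MAX // ASSUMED_INSTS_PER_SECOND
years = seconds // SECONDS_PER_YEAR
print(f"about {years} years of host time")
```

Even under that optimistic assumption the limit would take centuries to reach by executing instructions.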

    +
    +
    +

    So this usually means all CPUs are in a sleep state, and no events are scheduled in the future, which usually indicates a bug in either gem5 or guest code, leading gem5 to blow up.

    +
    +
    +

    Still, fs.py at gem5 08c79a194d1a3430801c04f37d13216cc9ec1da3 does not exit with a non-zero status when this happens, so we parse the message out of the output, just as we do for m5 fail.
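A minimal sketch of that kind of parsing is shown below. It is not the code this repository actually uses: the function name exit_status_from_log is hypothetical, and the only thing taken from gem5 is the message text as quoted in this section.

```python
import re

# Message text copied from the gem5 output quoted in this section.
LIMIT_RE = re.compile(r"^Exiting @ tick (\d+) because simulate\(\) limit reached$")


def exit_status_from_log(lines):
    """Return 1 if any line reports the simulate() limit, else 0."""
    for line in lines:
        if LIMIT_RE.match(line.strip()):
            return 1
    return 0


deadlock_log = [
    "info: Entering event queue @ 0.  Starting simulation...",
    "Exiting @ tick 18446744073709551615 because simulate() limit reached",
]
normal_log = [
    "info: Entering event queue @ 0.  Starting simulation...",
    "Exiting @ tick 3000 because all threads reached the max instruction count",
]
print(exit_status_from_log(deadlock_log), exit_status_from_log(normal_log))
```

A wrapper script could pass the returned value to sys.exit() so that callers see a failure even though gem5 itself exited with status 0.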

    +
    +
    +

    A trivial and very direct way to see the message would be:

    +
    +
    +
    +
    ./run \
    +  --emulator gem5 \
    +  --static \
    +  --userland userland/arch/x86_64/freestanding/linux/hello.S \
    +  --trace-insts-stdout \
    +  -- \
    +  --param 'system.cpu[0].max_insts_all_threads = 3' \
    +;
    +
    +
    +
    +

    which as of lkmc 402059ed22432bb351d42eb10900e5a8e06aa623 runs only the first three instructions and quits!

    +
    +
    +
    +
    info: Entering event queue @ 0.  Starting simulation...
    +      0: system.cpu A0 T0 : @asm_main_after_prologue    : mov   rdi, 0x1
    +      0: system.cpu A0 T0 : @asm_main_after_prologue.0  :   MOV_R_I : limm   rax, 0x1 : IntAlu :  D=0x0000000000000001  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   1000: system.cpu A0 T0 : @asm_main_after_prologue+7    : mov rdi, 0x1
    +   1000: system.cpu A0 T0 : @asm_main_after_prologue+7.0  :   MOV_R_I : limm   rdi, 0x1 : IntAlu :  D=0x0000000000000001  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
    +   2000: system.cpu A0 T0 : @asm_main_after_prologue+14    : lea        rsi, DS:[rip + 0x19]
    +   2000: system.cpu A0 T0 : @asm_main_after_prologue+14.0  :   LEA_R_P : rdip   t7, %ctrl153,  : IntAlu :  D=0x000000000040008d  flags=(IsInteger|IsMicroop|IsDelayedCommit|IsFirstMicroop)
    +   2500: system.cpu A0 T0 : @asm_main_after_prologue+14.1  :   LEA_R_P : lea   rsi, DS:[t7 + 0x19] : IntAlu :  D=0x00000000004000a6  flags=(IsInteger|IsMicroop|IsLastMicroop)
    +Exiting @ tick 3000 because all threads reached the max instruction count
    +
    +
    +
    +
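The trace above interleaves each x86 instruction with its microops, so seven lines correspond to three instructions. The sketch below counts only the instruction lines, assuming (from the output above, not from gem5 documentation) that fields are separated by " : " and that microop lines carry a ".<index>" suffix on the symbol field. The trace lines are shortened copies of the ones above, and count_macro_instructions is a hypothetical helper name.

```python
import re

# Trace lines shortened from the output above.
TRACE = """\
      0: system.cpu A0 T0 : @asm_main_after_prologue    : mov   rdi, 0x1
      0: system.cpu A0 T0 : @asm_main_after_prologue.0  :   MOV_R_I : limm   rax, 0x1 : IntAlu
   1000: system.cpu A0 T0 : @asm_main_after_prologue+7    : mov rdi, 0x1
   1000: system.cpu A0 T0 : @asm_main_after_prologue+7.0  :   MOV_R_I : limm   rdi, 0x1 : IntAlu
   2000: system.cpu A0 T0 : @asm_main_after_prologue+14    : lea        rsi, DS:[rip + 0x19]
   2000: system.cpu A0 T0 : @asm_main_after_prologue+14.0  :   LEA_R_P : rdip   t7, %ctrl153,  : IntAlu
   2500: system.cpu A0 T0 : @asm_main_after_prologue+14.1  :   LEA_R_P : lea   rsi, DS:[t7 + 0x19] : IntAlu
"""

# Assumed convention: a symbol ending in ".<digits>" marks a microop line.
MICROOP_SUFFIX = re.compile(r"\.\d+$")


def count_macro_instructions(trace_text):
    """Count trace lines whose symbol field has no microop suffix."""
    count = 0
    for line in trace_text.splitlines():
        fields = line.split(" : ")
        if len(fields) < 3:
            continue  # not a trace line, e.g. "info:" or "Exiting @" lines
        symbol = fields[1].strip()
        if not MICROOP_SUFFIX.search(symbol):
            count += 1
    return count


print(count_macro_instructions(TRACE))
```

The count matches the max_insts_all_threads = 3 limit passed on the command line.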

    The message also shows on User mode simulation deadlocks, for example in userland/posix/pthread_deadlock.c:

    +
    +
    +
    +
    ./run \
    +  --emulator gem5 \
    +  --static \
    +  --userland userland/posix/pthread_deadlock.c \
    +  --userland-args 1 \
    +;
    +
    +
    +
    +

    ends in:

    +
    +
    +
    +
    Exiting @ tick 18446744073709551615 because simulate() limit reached
    +
    +
    +
    +

    where 18446744073709551615 is 0xFFFFFFFFFFFFFFFF in decimal.
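The decimal and hexadecimal forms can be checked against each other directly. This is just a quick sanity check in the Python interpreter:

```python
# Tick value copied from the "simulate() limit reached" line above.
tick = int("18446744073709551615")

print(tick == 0xFFFFFFFFFFFFFFFF)  # same value written in hexadecimal
print(tick == 2**64 - 1)           # largest value an unsigned 64-bit integer can hold
print(hex(tick))
```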

    +
    +
    +

    And there is a Baremetal example at baremetal/arch/aarch64/no_bootloader/wfe_loop.S that dies on WFE:

    +
    +
    +
    +
    ./run \
    +  --arch aarch64 \
    +  --baremetal baremetal/arch/aarch64/no_bootloader/wfe_loop.S \
    +  --emulator gem5 \
    +  --trace-insts-stdout \
    +;
    +
    +
    +
    +

    which gives:

    +
    +
    +
    +
    info: Entering event queue @ 0.  Starting simulation...
    +      0: system.cpu A0 T0 : @lkmc_start    :   wfe                      : IntAlu :  D=0x0000000000000000  flags=(IsSerializeAfter|IsNonSpeculative|IsQuiesce|IsUnverifiable)
    +   1000: system.cpu A0 T0 : @lkmc_start+4    :   b   <lkmc_start>         : IntAlu :   flags=(IsControl|IsDirectControl|IsUncondControl)
    +   1500: system.cpu A0 T0 : @lkmc_start    :   wfe                      : IntAlu :  D=0x0000000000000000  flags=(IsSerializeAfter|IsNonSpeculative|IsQuiesce|IsUnverifiable)
    +Exiting @ tick 18446744073709551615 because simulate() limit reached
    +
    +
    +
    +

    Other examples of the message:

    +
    +
    + +
    +
    +
    +

    18.16. gem5 build options

    In order to use different build options, you might also want to use gem5 build variants to keep the build outputs separate from one another.

    -

    18.15.1. gem5 debug build

    +

    18.16.1. gem5 debug build

    The gem5.debug executable has optimizations turned off unlike the default gem5.opt, and provides a much better debug experience:

    @@ -19324,7 +19531,7 @@ clock=500
    -

    18.15.2. gem5 clang build

    +

    18.16.2. gem5 clang build

    TODO test properly, benchmark vs GCC.

    @@ -19337,7 +19544,7 @@ clock=500
    -

    18.15.3. gem5 sanitation build

    +

    18.16.3. gem5 sanitation build

    If gem5 appears to have a C++ undefined behaviour bug, which is often very difficult to track down, you can try to build it with the following extra SCons options:

    @@ -20206,6 +20413,9 @@ git -C "$(./getvar qemu_source_dir)" checkout -

    userland/posix/pthread_count.c

  • +

    userland/posix/pthread_deadlock.c

    +
  • +
  • userland/posix/pthread_self.c

  • @@ -24077,13 +24287,13 @@ ldmia sp!, reglist