diff --git a/README.adoc b/README.adoc index d2632bb..ef997c9 100644 --- a/README.adoc +++ b/README.adoc @@ -1150,6 +1150,8 @@ Maybe it is because they are being copied around at specific locations instead o See also: https://stackoverflow.com/questions/2589845/what-are-the-first-operations-that-the-linux-kernel-executes-on-boot +<> with `--debug-flags=Exec` does show the right symbols however! So in the worst case, we can just read their source. Amazing. + ==== GDB step debug early boot by address One possibility is to run: @@ -3353,9 +3355,9 @@ You can still send key presses to QEMU however even without the mouse capture, j === Tracing -QEMU has a mechanism to log all instructions executed to a file. +QEMU can log several different events. -To do it for the Linux kernel boot we have a helper: +The most interesting are events which show instructions that QEMU ran, for which we have a helper: .... ./trace-boot -a x86_64 @@ -3367,6 +3369,20 @@ You can then inspect the instructions with: less ./out/x86_64/qemu/trace.txt .... +Enable other specific trace events: + +.... +./run -T trace1,trace2 +./qemu-trace2txt -a "$arch" +less ./out/x86_64/qemu/trace.txt +.... + +Get the list of available trace events: + +.... +./run -T help +.... + This functionality relies on the following setup: * `./configure --enable-trace-backends=simple`. This logs in a binary format to the trace file. @@ -3395,8 +3411,7 @@ in which the boot appears to hang for a considerable time. We can further use Binutils' `addr2line` to get the line that corresponds to each address: .... -./trace-boot -a x86_64 -./trace2line -a x86_64 +./trace-boot -a x86_64 && ./trace2line -a x86_64 less ./out/x86_64/qemu/trace-lines.txt .... @@ -3435,43 +3450,66 @@ Alternatively, https://github.com/mozilla/rr[`mozilla/rr`] claims it is able to gem5 also has a tracing mechanism, as documented at: http://www.gem5.org/Trace_Based_Debugging -Try it out with: - -.... -./run -a aarch64 -E 'm5 exit' -G '--debug-flags=Exec' -g -.... - -The trace file is located at: - .... +./run -a aarch64 -E 'm5 exit' -g -T Exec less out/aarch64/gem5/m5out/trace.txt .... -but be warned, it is humongous, at 16Gb. - -It made the runtime about 4x slower on the <>, with or without `.gz` compression. - -The list of available debug flags can be found with: +List all available debug flags: .... ./run -a aarch64 -G --debug-help -g .... -but for meaningful descriptions you need to look at the source code: +but to understand most of them you have to look at the source code: .... less gem5/gem5/src/cpu/SConscript +less gem5/gem5/src/cpu/exetrace.cc .... -The default `Exec` format reads symbol names from the Linux kernel image and show them, which is pretty awesome if you ask me. +As can be seen on the `Sconstruct`, `Exec` is just an alias that enables a set of flags. -TODO can we get just the executed addresses out of gem5? The following gets us closer, but not quite: +Be warned, the trace is humongous, at 16Gb. + +We can make the trace smaller by naming the trace file as `trace.txt.gz`, which enables GZIP compression, but that is not currently exposed on our scripts, since you usually just need something human readable to work on. + +Enabling tracing made the runtime about 4x slower on the <>, with or without `.gz` compression. + +The output format is of type: .... -./run -a aarch64 -E 'm5 exit' -G '--debug-flags=ExecEnable,ExecKernel,ExecUse' -g +25007000: system.cpu T0 : @start_kernel : stp +25007000: system.cpu T0 : @start_kernel.0 : addxi_uop ureg0, sp, #-112 : IntAlu : D=0xffffff8008913f90 +25007500: system.cpu T0 : @start_kernel.1 : strxi_uop x29, [ureg0] : MemWrite : D=0x0000000000000000 A=0xffffff8008913f90 +25008000: system.cpu T0 : @start_kernel.2 : strxi_uop x30, [ureg0, #8] : MemWrite : D=0x0000000000000000 A=0xffffff8008913f98 +25008500: system.cpu T0 : @start_kernel.3 : addxi_uop sp, ureg0, #0 : IntAlu : D=0xffffff8008913f90 .... -We could of course just pipe it to stdout and `awk` it up. +There are two types of lines: + +* full instructions, as the first line. Only shown if the `ExecMacro` flag is given. +* micro ops that constitute the instruction, the lines that follow. Yes, `aarch64` also has microops: link:https://superuser.com/questions/934752/do-arm-processors-like-cortex-a9-use-microcode/934755#934755[]. Only shown if the `ExecMicro` flag is given. + +Breakdown: + +* `25007500`: time count in some unit. Note how the microops execute at further timestamps. +* `system.cpu`: distinguishes between CPUs when there are more than one +* `T0`: thread number. TODO: link:https://superuser.com/questions/133082/hyper-threading-and-dual-core-whats-the-difference/995858#995858[hyperthread]? How to play with it? +* `@start_kernel`: we are in the `start_kernel` function. Awesome feature! Implemented with libelf https://sourceforge.net/projects/elftoolchain/ copy pasted in-tree `ext/libelf`. To get raw addresses, remove the `ExecSymbol`, which is enabled by `Exec`. +* `.1` as in `@start_kernel.1`: index of the microop +* `stp`: instruction disassembly. Seems to use `.isa` files dispersed per arch, which is an in house format: http://gem5.org/ISA_description_system +* `strxi_uop x29, [ureg0]`: microop disassembly. +* `MemWrite : D=0x0000000000000000 A=0xffffff8008913f90`: TODO. Further description of the microops. + +Trace the source lines just like <> with: + +.... +./trace-boot -a aarch64 -g && ./trace2line -a aarch64 -g +less ./out/aarch64/gem5/trace-lines.txt +.... + +TODO: 7452d399290c9c1fc6366cdad129ef442f323564 `./trace2line` this is too slow and takes hours. QEMU's processing of 170k events takes 7 seconds. gem5's processing is analogous, but there are 140M events, so it should take 7000 seconds ~ 2 hours which seems consistent with what I observe, so maybe there is no way to speed this up... The workaround is to just use gem5's `ExecSymbol` to get function granularity, and then GDB individually if line detail is needed? === QEMU GUI is unresponsive @@ -4287,6 +4325,24 @@ Pass options to the `gem5` executable itself: ./run -G '-h' -g .... +=== gem5 exit after a number of instructions + +Quit the simulation after `1024` instructions: + +.... +./run -g -- -I 1024 +.... + +Can be nicely checked with <>. + +Cycles instead of instructions: + +.... +./run -g -- -m 1024 +.... + +Otherwise the simulation runs forever by default. + === Run multiple gem5 instances at once gem5 just assigns new ports if some ports are occupied, so we can do: diff --git a/bench-boot b/bench-boot index 7f20b45..43b245f 100755 --- a/bench-boot +++ b/bench-boot @@ -12,7 +12,7 @@ qemu_insts() ( arch="$1" ./qemu-trace2txt -a "$arch" set_common_vars "$arch" - printf "instructions $(wc -l "${qemu_trace_txt_file}" | cut -d' ' -f1)\n" >> "$common_bench_boot" + printf "instructions $(wc -l "${common_trace_txt_file}" | cut -d' ' -f1)\n" >> "$common_bench_boot" ) rm -f "${common_bench_boot}" diff --git a/common b/common index c9dc064..3892346 100644 --- a/common +++ b/common @@ -36,11 +36,12 @@ set_common_vars() { gem5_out_dir="${out_arch_dir}/gem5" m5out_dir="${gem5_out_dir}/m5out" qemu_out_dir="${out_arch_dir}/qemu" - qemu_trace_txt_file="${qemu_out_dir}/trace.txt" if "$gem5"; then common_out_run_dir="$gem5_out_dir" + common_trace_txt_file="${common_out_run_dir}/m5out/trace.txt" else common_out_run_dir="$qemu_out_dir" + common_trace_txt_file="${common_out_run_dir}/trace.txt" fi } root_dir="$(pwd)" diff --git a/qemu-trace2txt b/qemu-trace2txt index bd7c148..19196f5 100755 --- a/qemu-trace2txt +++ b/qemu-trace2txt @@ -13,4 +13,4 @@ set_common_vars "$arch" false ./qemu/scripts/simpletrace.py \ "${build_dir}/host-qemu-custom/trace-events-all" \ "${qemu_out_dir}/trace.bin" \ -> "${qemu_trace_txt_file}" +> "${common_trace_txt_file}" diff --git a/run b/run index 4a6ba65..fde1a32 100755 --- a/run +++ b/run @@ -28,9 +28,10 @@ nographic=true root= tmux=false tmux_args= +trace_enabled=false # A dummy value that is already turned on by default and does not produce large output, # just to prevent QEMU from emitting a warning that '' is not valid. -trace_enable=pr_manager_run +trace_type=pr_manager_run vnc= while getopts a:c:DdE:e:F:f:G:ghIiKkm:T:U:uVx OPT; do case "$OPT" in @@ -93,7 +94,8 @@ while getopts a:c:DdE:e:F:f:G:ghIiKkm:T:U:uVx OPT; do memory="$OPTARG" ;; T) - trace_enable="$OPTARG" + trace_enabled=true + trace_type="$OPTARG" ;; U) tmux_args="$OPTARG" @@ -155,6 +157,10 @@ if "$gem5"; then else gem5_arch=ARM fi + if "$trace_enabled"; then + gem5opts="${gem5opts} --debug-flags='${trace_type}' \\ +" + fi gem5_common="\ M5_PATH='${gem5_build_dir}/system' \\ ${debug_vm} \ @@ -203,7 +209,7 @@ ${debug_vm} \ -monitor telnet::45454,server,nowait \\ -netdev user,hostfwd=tcp::45455-:45455,hostfwd=tcp::45456-:22,id=net0 \\ -smp '${cpus}' \\ --trace 'enable=${trace_enable},file=${qemu_out_dir}/trace.bin' \\ +-trace 'enable=${trace_type},file=${qemu_out_dir}/trace.bin' \\ -virtfs 'local,path=${p9_dir},mount_tag=host_scratch,security_model=mapped,id=host_scratch' \\ -virtfs 'local,path=${buildroot_out_dir}/build,mount_tag=host_out,security_model=mapped,id=host_out' \\ ${vnc}" diff --git a/run-usage.adoc b/run-usage.adoc index 692f069..bba829c 100644 --- a/run-usage.adoc +++ b/run-usage.adoc @@ -40,11 +40,12 @@ The default is the minimum amount that boots all archs without extra options added. Anything lower will lead some arch to fail to boot. Any -|`-T` | |Enable extra QEMU trace events. - `./configure --enable-trace-backends=simple` seems to enable - some by default, e.g. `pr_manager_run`, and I don't know how to - get rid of them. But those events are rare and should not impact - performance, so we just ignore them for now. +|`-T` |`TRACE_TYPES` |Set trace events to be enabled. + If not given, gem5 tracing is completely disabled, while QEMU tracing + is enabled but uses default traces that are very rare and don't affect + performance. `./configure --enable-trace-backends=simple` seems to enable + some traces by default, e.g. `pr_manager_run`, and I don't know how to + get rid of them. |`-U` | |Pass extra parameters to the program running on the `-u` tmux split. |`-u` | |Create a tmUx split the window. You must already be inside of a `tmux` session to use this option. diff --git a/trace-boot b/trace-boot index 2662677..ee94389 100755 --- a/trace-boot +++ b/trace-boot @@ -2,22 +2,30 @@ set -eu . common set -- ${cli_trace_boot:-} "$@" -while getopts a: OPT; do +while getopts a:g OPT; do case "$OPT" in a) arch="$OPTARG" ;; + g) + gem5=true + ;; esac done -set_common_vars "$arch" false -time ./run -a "$arch" -e 'init=/poweroff.out' -T exec_tb -time ./qemu-trace2txt -a "$arch" - -# Instruction count. -# We could put this on a separate script, but it just adds more arch boilerplate to a new script. -# So let's just leave it here for now since it did not add a significant processing time. -echo "instructions $(wc -l "${qemu_trace_txt_file}" | cut -d' ' -f1)" -entry_addr=$("${root_dir}/runtc" readelf -h "${build_dir}/linux-custom/vmlinux" | grep 'Entry point address' | sed -E 's/.*: *//') -echo "entry_address ${entry_addr}" -sed "/${entry_addr}/q" "${qemu_trace_txt_file}" >"${qemu_out_dir}/trace-boot.txt" -echo "instructions_firmware $(wc -l "${qemu_out_dir}/trace-boot.txt" | cut -d' ' -f1)" +shift "$(($OPTIND - 1))" +set_common_vars "$arch" "$gem5" +if "$gem5"; then + # Exactly what Exec enabled, but without ExecSymbol and ExecMicro. + time ./run -a aarch64 -E 'm5 exit' -g -T 'ExecEnable,ExecTicks,ExecOpClass,ExecThread,ExecEffAddr,ExecResult,ExecMacro,ExecFaulting,ExecUser,ExecKernel' "$@" +else + time ./run -a "$arch" -e 'init=/poweroff.out' -T exec_tb "$@" + time ./qemu-trace2txt -a "$arch" + # Instruction count. + # We could put this on a separate script, but it just adds more arch boilerplate to a new script. + # So let's just leave it here for now since it did not add a significant processing time. + echo "instructions $(wc -l "${common_trace_txt_file}" | cut -d' ' -f1)" + entry_addr=$("${root_dir}/runtc" readelf -h "${build_dir}/linux-custom/vmlinux" | grep 'Entry point address' | sed -E 's/.*: *//') + echo "entry_address ${entry_addr}" + sed "/${entry_addr}/q" "${common_trace_txt_file}" >"${qemu_out_dir}/trace-boot.txt" + echo "instructions_firmware $(wc -l "${qemu_out_dir}/trace-boot.txt" | cut -d' ' -f1)" +fi diff --git a/trace2line b/trace2line index 37104dd..84fe21e 100755 --- a/trace2line +++ b/trace2line @@ -2,17 +2,26 @@ set -eu . common set -- ${cli_trace2line:-} "$@" -while getopts a: OPT; do +while getopts a:g OPT; do case "$OPT" in a) arch="$OPTARG" ;; + g) + gem5=true + ;; esac done -set_common_vars "$arch" false +set_common_vars "$arch" "$gem5" kernel_dir="${build_dir}/linux-custom" -sed -E 's/.*pc=//' "${qemu_out_dir}/trace.txt" | \ +( + if "$gem5"; then + sed -r 's/^.* (0x[^. ]*)[. ].*/\1/' "$common_trace_txt_file" + else + sed -r 's/.*pc=//' "$common_trace_txt_file" + fi +) | \ xargs "${host_dir}/bin/${arch}-linux-addr2line" -e "${kernel_dir}/vmlinux" -fp | \ sed -E "s|at ${kernel_dir}/(\./\|)||" | \ uniq -c \ -> "${qemu_out_dir}/trace-lines.txt" +> "${common_out_run_dir}/trace-lines.txt"