trace2line: gem5 support

As noted however, it is potentially too slow to be useful.

run: unify gem5 and qemu tracing under -T

readme: overhaul tracing documentation from what I've learnt from trace2line
2026-01-26 11:41:35 +01:00 · 2018-04-25 08:41:41 +01:00
parent 14965a40d2
commit ab21ef58de
8 changed files with 131 additions and 50 deletions
--- a/README.adoc
+++ b/README.adoc
@@ -1150,6 +1150,8 @@ Maybe it is because they are being copied around at specific locations instead o

 See also: https://stackoverflow.com/questions/2589845/what-are-the-first-operations-that-the-linux-kernel-executes-on-boot

+<<gem5-tracing>> with `--debug-flags=Exec` does show the right symbols however! So in the worst case, we can just read their source. Amazing.
+
 ==== GDB step debug early boot by address

 One possibility is to run:
@@ -3353,9 +3355,9 @@ You can still send key presses to QEMU however even without the mouse capture, j

 === Tracing

-QEMU has a mechanism to log all instructions executed to a file.
+QEMU can log several different events.

-To do it for the Linux kernel boot we have a helper:
+The most interesting events are those that show the instructions that QEMU ran, for which we have a helper:

 ....
 ./trace-boot -a x86_64
@@ -3367,6 +3369,20 @@ You can then inspect the instructions with:
 less ./out/x86_64/qemu/trace.txt
 ....

+Enable other specific trace events:
+
+....
+./run -T trace1,trace2
+./qemu-trace2txt -a x86_64
+less ./out/x86_64/qemu/trace.txt
+....
+
+Get the list of available trace events:
+
+....
+./run -T help
+....
+
 This functionality relies on the following setup:

 * `./configure --enable-trace-backends=simple`. This logs in a binary format to the trace file.
@@ -3395,8 +3411,7 @@ in which the boot appears to hang for a considerable time.
 We can further use Binutils' `addr2line` to get the line that corresponds to each address:

 ....
-./trace-boot -a x86_64
-./trace2line -a x86_64
+./trace-boot -a x86_64 && ./trace2line -a x86_64
 less ./out/x86_64/qemu/trace-lines.txt
 ....

@@ -3435,43 +3450,66 @@ Alternatively, https://github.com/mozilla/rr[`mozilla/rr`] claims it is able to

 gem5 also has a tracing mechanism, as documented at: http://www.gem5.org/Trace_Based_Debugging

-Try it out with:
-
-....
-./run -a aarch64 -E 'm5 exit' -G '--debug-flags=Exec' -g
-....
-
-The trace file is located at:
-
 ....
+./run -a aarch64 -E 'm5 exit' -g -T Exec
 less out/aarch64/gem5/m5out/trace.txt
 ....

-but be warned, it is humongous, at 16Gb.
-
-It made the runtime about 4x slower on the <<p51>>, with or without `.gz` compression.
-
-The list of available debug flags can be found with:
+List all available debug flags:

 ....
 ./run -a aarch64 -G --debug-help -g
 ....

-but for meaningful descriptions you need to look at the source code:
+but to understand most of them you have to look at the source code:

 ....
 less gem5/gem5/src/cpu/SConscript
+less gem5/gem5/src/cpu/exetrace.cc
 ....

-The default `Exec` format reads symbol names from the Linux kernel image and show them, which is pretty awesome if you ask me.
+As can be seen in the `SConscript`, `Exec` is just an alias that enables a set of flags.

-TODO can we get just the executed addresses out of gem5? The following gets us closer, but not quite:
+Be warned, the trace is humongous, at 16GB.
+
+We can make the trace smaller by naming the trace file `trace.txt.gz`, which enables GZIP compression, but that is not currently exposed by our scripts, since you usually just need something human readable to work on.
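
If a trace was written as `trace.txt.gz`, it can still be read line by line without decompressing it to disk first. A minimal sketch using only Python's standard `gzip` module; the helper names `open_trace` and `count_lines` are hypothetical and not part of this repository:

```python
import gzip


def open_trace(path):
    """Open a trace as text, transparently handling a .gz suffix."""
    if str(path).endswith(".gz"):
        # "rt" makes gzip.open return a text stream instead of bytes.
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def count_lines(path):
    """Count trace lines while streaming, so a huge trace never sits in memory."""
    with open_trace(path) as f:
        return sum(1 for _ in f)
```

Streaming matters here because the uncompressed trace is far too large to load at once.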
+
+Enabling tracing made the runtime about 4x slower on the <<p51>>, with or without `.gz` compression.
+
+The output format is of type:

 ....
-./run -a aarch64 -E 'm5 exit' -G '--debug-flags=ExecEnable,ExecKernel,ExecUse' -g
+25007000: system.cpu T0 : @start_kernel    : stp
+25007000: system.cpu T0 : @start_kernel.0  :   addxi_uop   ureg0, sp, #-112 : IntAlu :  D=0xffffff8008913f90
+25007500: system.cpu T0 : @start_kernel.1  :   strxi_uop   x29, [ureg0] : MemWrite :  D=0x0000000000000000 A=0xffffff8008913f90
+25008000: system.cpu T0 : @start_kernel.2  :   strxi_uop   x30, [ureg0, #8] : MemWrite :  D=0x0000000000000000 A=0xffffff8008913f98
+25008500: system.cpu T0 : @start_kernel.3  :   addxi_uop   sp, ureg0, #0 : IntAlu :  D=0xffffff8008913f90
 ....

-We could of course just pipe it to stdout and `awk` it up.
+There are two types of lines:
+
+* full instructions, such as the first line. Only shown if the `ExecMacro` flag is given.
+* microops that constitute the instruction, such as the lines that follow. Yes, `aarch64` also has microops: link:https://superuser.com/questions/934752/do-arm-processors-like-cortex-a9-use-microcode/934755#934755[]. Only shown if the `ExecMicro` flag is given.
+
+Breakdown:
+
+* `25007500`: time count in some unit. Note how successive microops execute at later timestamps.
+* `system.cpu`: distinguishes between CPUs when there is more than one
+* `T0`: thread number. TODO: link:https://superuser.com/questions/133082/hyper-threading-and-dual-core-whats-the-difference/995858#995858[hyperthread]? How to play with it?
+* `@start_kernel`: we are in the `start_kernel` function. Awesome feature! Implemented with libelf https://sourceforge.net/projects/elftoolchain/ which is copy pasted in-tree at `ext/libelf`. To get raw addresses, remove the `ExecSymbol` flag, which is enabled by `Exec`.
+* `.1` as in `@start_kernel.1`: index of the microop
+* `stp`: instruction disassembly. Seems to use `.isa` files dispersed per arch, which are in an in-house format: http://gem5.org/ISA_description_system
+* `strxi_uop   x29, [ureg0]`: microop disassembly.
+* `MemWrite :  D=0x0000000000000000 A=0xffffff8008913f90`: TODO. Further description of the microops.
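
The field breakdown above can be turned into a small parser. This is only a sketch inferred from the five sample lines shown in this diff, not from gem5's source: the regular expression, the function name `parse_exec_line` and the dictionary keys are all assumptions, and symbols that themselves end in `.<digits>` would be misread as a microop index.

```python
import re

# Assumed layout, inferred from the sample trace lines only:
#   <tick>: <cpu> <thread> : @<symbol>[.<uop>] : <disasm> [: <op class> : <values>]
TRACE_RE = re.compile(
    r"^\s*(?P<tick>\d+): (?P<cpu>\S+) (?P<thread>T\d+) : "
    r"@(?P<symbol>\S+?)(?:\.(?P<uop>\d+))?\s+:\s*(?P<rest>.*)$"
)


def parse_exec_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    m = TRACE_RE.match(line)
    if m is None:
        return None
    parts = [p.strip() for p in m.group("rest").split(" : ")]
    uop = m.group("uop")
    return {
        "tick": int(m.group("tick")),
        "cpu": m.group("cpu"),
        "thread": m.group("thread"),
        "symbol": m.group("symbol"),
        # None for a full (macro) instruction line, an int for a microop line.
        "uop": int(uop) if uop is not None else None,
        # Collapse the column padding inside the disassembly.
        "disasm": " ".join(parts[0].split()),
        "op_class": parts[1] if len(parts) > 1 else None,
        # Key/value pairs such as D=... and A=..., kept as hex strings.
        "values": dict(re.findall(r"(\w+)=(0x[0-9a-fA-F]+)", parts[2]))
        if len(parts) > 2
        else {},
    }


if __name__ == "__main__":
    sample = (
        "25007500: system.cpu T0 : @start_kernel.1  :   strxi_uop   x29, [ureg0]"
        " : MemWrite :  D=0x0000000000000000 A=0xffffff8008913f90"
    )
    print(parse_exec_line(sample))
```

Returning `None` for unmatched lines keeps the parser usable on a trace that mixes `Exec` output with other debug flags.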
+
+Trace the source lines just like <<trace-source-lines,for QEMU>> with:
+
+....
+./trace-boot -a aarch64 -g && ./trace2line -a aarch64 -g
+less ./out/aarch64/gem5/trace-lines.txt
+....
+
+TODO: at 7452d399290c9c1fc6366cdad129ef442f323564, `./trace2line` is too slow on gem5 traces and takes hours. QEMU's processing of 170k events takes 7 seconds. gem5's processing is analogous, but there are 140M events, roughly 800 times more, so it should take about 6000 seconds, somewhat under 2 hours, which seems consistent with what I observe, so maybe there is no way to speed this up... The workaround is to just use gem5's `ExecSymbol` to get function granularity, and then use GDB individually if line detail is needed?
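
The runtime estimate in this TODO can be reproduced in a few lines. The sketch assumes processing time scales linearly with the number of events, which is only the working assumption of the note itself; the function name `estimate_seconds` is hypothetical.

```python
def estimate_seconds(ref_events, ref_seconds, target_events):
    """Scale a measured runtime linearly with the number of trace events."""
    return ref_seconds * target_events / ref_events


# Figures quoted in the TODO: 170k QEMU events in 7 s, 140M gem5 events.
seconds = estimate_seconds(170_000, 7, 140_000_000)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} h")  # 5765 s, about 1.6 h
```

Linear scaling gives a bit under two hours, the same order of magnitude as the figure quoted in the note.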

 === QEMU GUI is unresponsive

@@ -4287,6 +4325,24 @@ Pass options to the `gem5` executable itself:
 ./run -G '-h' -g
 ....

+=== gem5 exit after a number of instructions
+
+Quit the simulation after `1024` instructions:
+
+....
+./run -g -- -I 1024
+....
+
+Can be nicely checked with <<gem5-tracing>>.
+
+Simulated ticks instead of instructions:
+
+....
+./run -g -- -m 1024
+....
+
+Otherwise the simulation runs forever by default.
+
 === Run multiple gem5 instances at once

 gem5 just assigns new ports if some ports are occupied, so we can do: