diff --git a/README.adoc b/README.adoc
index f65a275..a149a8e 100644
--- a/README.adoc
+++ b/README.adoc
@@ -9849,10 +9849,12 @@ just appears to output both cores intertwined without any clear differentiation.
 gem5 provides also provides a tracing mechanism documented at: http://www.gem5.org/Trace_Based_Debugging[]:
 
 ....
-./run --arch aarch64 --eval 'm5 exit' --emulator gem5 --trace Exec
+./run --arch aarch64 --eval 'm5 exit' --emulator gem5 --trace ExecAll
 less "$(./getvar --arch aarch64 run_dir)/trace.txt"
 ....
 
+Keep in mind however that the disassembly is very broken in several places as of 2019q2, so you can't always trust it.
+
 Output the trace to stdout instead of a file:
 
 ....
@@ -9891,6 +9893,12 @@ less "$(./getvar gem5_source_dir)/src/cpu/SConscript"
 less "$(./getvar gem5_source_dir)/src/cpu/exetrace.cc"
 ....
 
+The most important trace flags to know about are:
+
+* <>
+* `Faults`: CPU exceptions / interrupts, see an example at: <>
+* <>
+
 The traces are generated from `DPRINTF(` calls scattered throughout the code.
 
 As can be seen on the `Sconstruct`, `Exec` is just an alias that enables a set of flags.
@@ -9901,6 +9909,20 @@ We can make the trace smaller by naming the trace file as `trace.txt.gz`, which
 
 Enabling tracing made the runtime about 4x slower on the <>, with or without `.gz` compression.
 
+Trace the source lines just like <> with:
+
+....
+./trace-boot --arch aarch64 --emulator gem5
+./trace2line --arch aarch64 --emulator gem5
+less "$(./getvar --arch aarch64 run_dir)/trace-lines.txt"
+....
+
+TODO: 7452d399290c9c1fc6366cdad129ef442f323564 `./trace2line` this is too slow and takes hours. QEMU's processing of 170k events takes 7 seconds. gem5's processing is analogous, but there are 140M events, so it should take 7000 seconds ~ 2 hours which seems consistent with what I observe, so maybe there is no way to speed this up... The workaround is to just use gem5's `ExecSymbol` to get function granularity, and then GDB individually if line detail is needed?
+
+===== gem5 ExecAll trace format
+
+This debug flag traces all instructions.
+
 The output format is of type:
 
 ....
@@ -9919,11 +9941,11 @@ There are two types of lines:
 Breakdown:
 
 * `25007500`: time count in some unit. Note how the microops execute at further timestamps.
-* `system.cpu`: distinguishes between CPUs when there are more than one
+* `system.cpu`: distinguishes between CPUs when there are more than one. For example, running xref:arm-multicore[xrefstyle=full] with two cores produces `system.cpu0` and `system.cpu1`
 * `T0`: thread number. TODO: https://superuser.com/questions/133082/hyper-threading-and-dual-core-whats-the-difference/995858#995858[hyperthread]? How to play with it?
 * `@start_kernel`: we are in the `start_kernel` function. Awesome feature! Implemented with libelf https://sourceforge.net/projects/elftoolchain/ copy pasted in-tree `ext/libelf`. To get raw addresses, remove the `ExecSymbol`, which is enabled by `Exec`. This can be done with `Exec,-ExecSymbol`.
 * `.1` as in `@start_kernel.1`: index of the microop
-* `stp`: instruction disassembly. Seems to use `.isa` files dispersed per arch, which is an in house format: http://gem5.org/ISA_description_system
+* `stp`: instruction disassembly. Note however that the disassembly of many instructions are very broken as of 2019q2, and you can't just trust them blindly.
 * `strxi_uop x29, [ureg0]`: microop disassembly.
 * `MemWrite : D=0x0000000000000000 A=0xffffff8008913f90`: a memory write microop:
 ** `D` stands for data, and represents the value that was written to memory or to a register
@@ -9931,15 +9953,46 @@ Breakdown:
 
 The best way to verify all of this is to write some <>
 
-Trace the source lines just like <> with:
+===== gem5 Registers trace format
+
+This flag shows a more detailed register usage than <>.
+
+For example, if we run in LKMC 0323e81bff1d55b978a4b36b9701570b59b981eb:
 
 ....
-./trace-boot --arch aarch64 --emulator gem5
-./trace2line --arch aarch64 --emulator gem5
-less "$(./getvar --arch aarch64 run_dir)/trace-lines.txt"
+./run --arch aarch64 --baremetal userland/arch/aarch64/add.S --emulator gem5 --trace ExecAll,Registers --trace-stdout
 ....
 
-TODO: 7452d399290c9c1fc6366cdad129ef442f323564 `./trace2line` this is too slow and takes hours. QEMU's processing of 170k events takes 7 seconds. gem5's processing is analogous, but there are 140M events, so it should take 7000 seconds ~ 2 hours which seems consistent with what I observe, so maybe there is no way to speed this up... The workaround is to just use gem5's `ExecSymbol` to get function granularity, and then GDB individually if line detail is needed?
+then the stdout contains:
+
+....
+ 31000: system.cpu A0 T0 : @main_after_prologue : movz x0, #1, #0 : IntAlu : D=0x0000000000000001 flags=(IsInteger)
+ 31500: system.cpu.[tid:0]: Setting int reg 34 (34) to 0.
+ 31500: system.cpu.[tid:0]: Reading int reg 0 (0) as 0x1.
+ 31500: system.cpu.[tid:0]: Setting int reg 1 (1) to 0x3.
+ 31500: system.cpu A0 T0 : @main_after_prologue+4 : add x1, x0, #2 : IntAlu : D=0x0000000000000003 flags=(IsInteger)
+ 32000: system.cpu.[tid:0]: Setting int reg 34 (34) to 0.
+ 32000: system.cpu.[tid:0]: Reading int reg 1 (1) as 0x3.
+ 32000: system.cpu.[tid:0]: Reading int reg 31 (34) as 0.
+ 32000: system.cpu.[tid:0]: Setting int reg 0 (0) to 0x3.
+....
+
+which corresponds to the two following instructions:
+
+....
+mov x0, 1
+add x1, x0, 2
+....
+
+TODO that format is either buggy or is very difficult to understand:
+
+* what is `34`? Presumably some flags register?
+* what do the numbers in parenthesis mean at `31 (34)`? Presumably some flags register?
+* why is the first instruction setting `reg 1` and the second one `reg 0`, given that the first sets `x0` and the second `x1`?
+
+===== gem5 TARMAC traces
+
+https://stackoverflow.com/questions/54882466/how-to-use-the-tarmac-tracer-with-gem5
 
 === QEMU GUI is unresponsive
 
@@ -10210,6 +10263,10 @@ fails with:
 Exiting @ tick 18446744073709551615 because simulate() limit reached
 ....
 
+====== gem5 se.py user mode with 2 or more pthreads fails with because simulate() limit reached
+
+See bug report at: https://github.com/cirosantilli/linux-kernel-module-cheat/issues/81
+
 ====== gem5 ARM full system with more than 8 cores
 
 https://stackoverflow.com/questions/50248067/how-to-run-a-gem5-arm-aarch64-full-system-simulation-with-fs-py-with-more-than-8
@@ -14054,7 +14111,7 @@ ARMv8-only, likely because in ARMv8 you can't have conditional suffixes for ever
 Bitwise Bit Clear: clear some bits.
 
 ....
-dest = `left & ~right`
+dest = left & ~right
 ....
 
 Example: link:userland/arch/arm/bic.S[]
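
Reviewer note on the last hunk: the `dest = left & ~right` formula can be checked with a minimal Python sketch. The function name `bic` and the `width` parameter are assumptions made for illustration and are not taken from `userland/arch/arm/bic.S`. The mask is needed only because Python integers are unbounded, so `~right` would otherwise be negative.

```python
def bic(left: int, right: int, width: int = 32) -> int:
    """Bit clear: keep the bits of `left` that are NOT set in `right`.

    `width` is a hypothetical parameter modelling the register size;
    32 is assumed here to match a 32-bit ARM register.
    """
    mask = (1 << width) - 1      # e.g. 0xFFFFFFFF for width=32
    return left & ~right & mask  # dest = left & ~right, truncated to `width` bits


if __name__ == "__main__":
    # Clear the low nibble of 0xFF.
    print(hex(bic(0xFF, 0x0F)))
```

Every bit that is 1 in `right` is forced to 0 in the result, and all other bits of `left` pass through unchanged.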