33. Android
+
@@ -2318,7 +2326,7 @@ pre{ white-space:pre }
If you don’t know which one to go for, start with QEMU Buildroot setup getting started.
@@ -2415,7 +2423,7 @@ pre{ white-space:pre }
Reserve 12 GB of disk space and run:
@@ -2432,7 +2440,7 @@ cd linux-kernel-module-cheat
You don’t need to clone recursively even though we have .git submodules: download-dependencies fetches just the submodules that you need for this build to save time.
The initial build will take a while (30 minutes to 2 hours) to clone and build, see Benchmark builds for more details.
@@ -2515,7 +2523,7 @@ hello2 cleanup
-
While hacking QEMU, you will likely want to GDB step its source. That is trivial since QEMU is just another userland program like any other, but our setup has a shortcut to make it even more convenient, see: Section 18.8, “Debug the emulator”.
+
While hacking QEMU, you will likely want to GDB step its source. That is trivial since QEMU is just another userland program like any other, but our setup has a shortcut to make it even more convenient, see: Section 22.8, “Debug the emulator”.
@@ -3247,7 +3255,7 @@ j = 0
The downside of gem5 is that it is much slower than QEMU because of the greater simulation detail.
@@ -3327,7 +3335,7 @@ j = 0
Good next steps are:
@@ -3356,7 +3364,7 @@ j = 0
This repository has been tested inside clean Docker containers.
-
This is a good option if you are on a Linux host, but the native setup failed due to your weird host distribution, and you have better things to do with your life than to debug it. See also: Section 34.1, “Supported hosts”.
+
This is a good option if you are on a Linux host, but the native setup failed due to your weird host distribution, and you have better things to do with your life than to debug it. See also: Section 37.1, “Supported hosts”.
@@ -4318,7 +4326,7 @@ error: simulation error detected by parsing logs
Note that ./build-baremetal requires the --emulator gem5 option, and generates separate executable images for both, as can be seen from:
@@ -4351,7 +4359,7 @@ echo "$(./getvar --arch aarch64 --emulator gem5 image)"
This generates yet new separate images with new magic constants:
@@ -4366,10 +4374,10 @@ echo "$(./getvar --arch aarch64 --baremetal userland/c/hello.c --emulator gem5 -
But just stick to newer and better VExpress_GEM5_V1 unless you have a good reason to use RealViewPBX.
The following subjects are particularly important:
@@ -4434,7 +4442,7 @@ xdg-open README.html
@@ -5220,6 +5228,52 @@ echo 'file kernel/module.c +p' > /sys/kernel/debug/dynamic_debug/control
+
+
In the specific case of gem5 aarch64 at least:
+
+
+
+-
+
gem5 relocates the kernel in memory to a fixed location, see e.g. https://gem5.atlassian.net/browse/GEM5-787
+
+-
+
--param 'system.workload.early_kernel_symbols=True' should in theory duplicate the symbols to the correct physical location, but it was broken at one point: https://gem5.atlassian.net/browse/GEM5-785
+
+-
+
gem5 executes directly from vmlinux, so there is no decompression code involved, so you actually immediately start running the "true" first instruction from head.S as described at: https://stackoverflow.com/questions/18266063/does-linux-kernel-have-main-function/33422401#33422401
+
+-
+
once the MMU gets turned on at kernel symbol __primary_switched, the virtual address matches the ELF symbols, and you start seeing correct symbols without the need for early_kernel_symbols. This can be observed clearly with function_trace = True: https://stackoverflow.com/questions/64049487/how-to-trace-executed-guest-function-symbol-names-with-their-timestamp-in-gem5/64049488#64049488 which produces:
+
+
+
0: _kernel_flags_le_lo32 (12500)
+12500: __crc_tcp_add_backlog (1000)
+13500: __crc_crypto_alg_tested (6500)
+20000: __crc_tcp_add_backlog (10000)
+30000: __crc_crypto_alg_tested (500)
+30500: __crc_scsi_is_host_device (5000)
+35500: __crc_crypto_alg_tested (1500)
+37000: __crc_scsi_is_host_device (4000)
+41000: __crc_crypto_alg_tested (3000)
+44000: __crc_tcp_add_backlog (263500)
+307500: __crc_crypto_alg_tested (975500)
+1283000: __crc_tcp_add_backlog (77191500)
+78474500: __crc_crypto_alg_tested (1000)
+78475500: __crc_scsi_is_host_device (19500)
+78495000: __crc_crypto_alg_tested (500)
+78495500: __crc_scsi_is_host_device (13500)
+78509000: __primary_switched (14000)
+78523000: memset (21118000)
+99641000: __primary_switched (2500)
+99643500: start_kernel (11000)
+
+
+
+
so we see that __primary_switched is the first non-trash symbol (non-__crc_* and non-_kernel_flags*, which are just informative symbols, not actual executable code)
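The filtering described above can be sketched in C. This is only an illustration, not part of gem5 or of this repository: the function names and the SYMBOL_MAX limit are made up here, and the line format is assumed to be exactly the `<tick>: <symbol> (<duration>)` layout shown in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limit for this sketch: longest symbol name we accept. */
#define SYMBOL_MAX 128

/* Returns 1 for the informative, non-executable symbols named in the text:
 * anything starting with __crc_ or _kernel_flags. */
int is_informative_symbol(const char *name)
{
    return strncmp(name, "__crc_", 6) == 0 ||
           strncmp(name, "_kernel_flags", 13) == 0;
}

/* Parse one "<tick>: <symbol> (<duration>)" line.
 * symbol must point to at least SYMBOL_MAX bytes.
 * Returns 1 on success, 0 if the line does not have that shape. */
int parse_trace_line(const char *line, unsigned long long *tick,
                     char *symbol, unsigned long long *duration)
{
    return sscanf(line, "%llu: %127s (%llu)", tick, symbol, duration) == 3;
}

/* Returns the index of the first line whose symbol looks like real code and
 * copies its name into symbol, or returns -1 if there is no such line. */
int first_code_symbol(const char *const *lines, size_t n, char *symbol)
{
    unsigned long long tick, duration;
    size_t i;

    assert(lines != NULL && symbol != NULL);
    for (i = 0; i < n; i++) {
        if (parse_trace_line(lines[i], &tick, symbol, &duration) &&
            !is_informative_symbol(symbol))
            return (int)i;
    }
    symbol[0] = '\0';
    return -1;
}
```

Feeding it the lines of the trace above should report __primary_switched as the first symbol that is actual code.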
+
+
+
+
@@ -5241,9 +5295,6 @@ echo 'file kernel/module.c +p' > /sys/kernel/debug/dynamic_debug/control
The gem5 ExecAll trace format however does show the right symbols! This could be because gem5 uses vmlinux to boot, while QEMU uses the compressed version, and as mentioned in the Stack Overflow answer, the entry point is actually a tiny decompressor routine.
-
I also tried to hack run-gdb with:
@@ -5263,7 +5314,7 @@ echo 'file kernel/module.c +p' > /sys/kernel/debug/dynamic_debug/control
and now I do have the symbols from arch/arm/boot/compressed/vmlinux, but the breakpoints still don’t work.
-
v4.19 also added a CONFIG_HAVE_KERNEL_UNCOMPRESSED=y option for having the kernel uncompressed which could make following the startup easier, but it is only available on s390. aarch64 however is already uncompressed by default, so might be the easiest one. See also: Section 15.20.1, “vmlinux vs bzImage vs zImage vs Image”.
+
v4.19 also added a CONFIG_HAVE_KERNEL_UNCOMPRESSED=y option for having the kernel uncompressed which could make following the startup easier, but it is only available on s390. aarch64 however is already uncompressed by default, so might be the easiest one. See also: Section 16.20.1, “vmlinux vs bzImage vs zImage vs Image”.
You then need the associated KERNEL_UNCOMPRESSED to enable it if available:
@@ -5275,6 +5326,92 @@ echo 'file kernel/module.c +p' > /sys/kernel/debug/dynamic_debug/control
depends on HAVE_KERNEL_UNCOMPRESSED
+
+
+
+
+
+
+-
+
the bootloader goes into WFE
+
+-
+
the kernel, running on CPU0, writes the entry point for the secondary CPU (the address of secondary_holding_pen) to the address given to the kernel in the cpu-release-addr property of the DTB
+
+-
+
the kernel wakes up the bootloader with a SEV, and the bootloader boots to the address the kernel told it
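The three steps above can be modelled in plain userland C. This is a toy, single-threaded sketch and not kernel or bootloader code: struct spin_table, the event_pending flag that stands in for the WFE / SEV event, and all function names are assumptions made for illustration. It does keep one real detail of the protocol, which is that the release address is stored little-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the spin-table state shared by kernel and bootloader. */
struct spin_table {
    uint8_t release_addr[8]; /* the cpu-release-addr slot, little-endian */
    int event_pending;       /* stands in for the event signalled by SEV */
};

/* Store a 64-bit value as little-endian bytes regardless of host endianness. */
void store_le64(uint8_t *dst, uint64_t value)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(value >> (8 * i));
}

/* Load a 64-bit little-endian value regardless of host endianness. */
uint64_t load_le64(const uint8_t *src)
{
    uint64_t value = 0;

    for (int i = 0; i < 8; i++)
        value |= (uint64_t)src[i] << (8 * i);
    return value;
}

/* Kernel side: publish the entry point, then send the event. */
void kernel_release_secondary(struct spin_table *t, uint64_t entry_point)
{
    assert(entry_point != 0); /* 0 means "not released yet" in this model */
    store_le64(t->release_addr, entry_point);
    t->event_pending = 1;     /* models sev() */
}

/* Bootloader side: one wake-up attempt of the secondary CPU.
 * Returns the address to jump to, or 0 if the CPU stays parked. */
uint64_t bootloader_poll(struct spin_table *t)
{
    if (!t->event_pending)
        return 0;             /* still waiting in WFE */
    t->event_pending = 0;
    return load_le64(t->release_addr); /* 0 here would be a spurious wake-up */
}
```

A real secondary CPU loops on WFE and re-reads the slot after every wake-up; here each call to bootloader_poll models one iteration of that loop.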
+
+
+
+
+
+
Here’s the code that writes the address and does SEV:
+
+
+
+
static int smp_spin_table_cpu_prepare(unsigned int cpu)
+{
+ __le64 __iomem *release_addr;
+
+ if (!cpu_release_addr[cpu])
+ return -ENODEV;
+
+ /*
+ * The cpu-release-addr may or may not be inside the linear mapping.
+ * As ioremap_cache will either give us a new mapping or reuse the
+ * existing linear mapping, we can use it to cover both cases. In
+ * either case the memory will be MT_NORMAL.
+ */
+ release_addr = ioremap_cache(cpu_release_addr[cpu],
+ sizeof(*release_addr));
+ if (!release_addr)
+ return -ENOMEM;
+
+ /*
+ * We write the release address as LE regardless of the native
+ * endianess of the kernel. Therefore, any boot-loaders that
+ * read this address need to convert this address to the
+ * boot-loader's endianess before jumping. This is mandated by
+ * the boot protocol.
+ */
+ writeq_relaxed(__pa_symbol(secondary_holding_pen), release_addr);
+ __flush_dcache_area((__force void *)release_addr,
+ sizeof(*release_addr));
+
+ /*
+ * Send an event to wake up the secondary CPU.
+ */
+ sev();
+
+
+
+
and here’s the code that reads the value from the DTB:
+
+
+
+
static int smp_spin_table_cpu_init(unsigned int cpu)
+{
+ struct device_node *dn;
+ int ret;
+
+ dn = of_get_cpu_node(cpu, NULL);
+ if (!dn)
+ return -ENODEV;
+
+ /*
+ * Determine the address from which the CPU is polling.
+ */
+ ret = of_property_read_u64(dn, "cpu-release-addr",
+ &cpu_release_addr[cpu]);
+
+
+
@@ -5680,7 +5817,7 @@ Breakpoint 3 at 0xffffffff811615e3: fdget_pos. (9 locations)
We can set and get which cores the Linux kernel allows a program to run on with sched_getaffinity and sched_setaffinity:
@@ -5728,7 +5865,7 @@ sched_getcpu = 0
It also automatically chooses between init= and rcinit= for you, see: Section 6.3, “Path to init”
@@ -7337,7 +7474,7 @@ cat f
which can be good for automated tests, as it ensures that you are using a pristine unmodified system image every time.
One downside of this method is that it has to put the entire filesystem into memory, and could lead to a panic:
@@ -7535,7 +7672,7 @@ cat f
To do this failed test, we automatically pass a dummy disk image as of gem5 7fa4c946386e7207ad5859e8ade0bbfc14000d91 since the scripts don’t handle a missing --disk-image well, much like is currently done for Baremetal.
@@ -8060,7 +8197,7 @@ qw er
@@ -8113,7 +8250,7 @@ qw er
The gem5 tests require building statically with build id static, see also: Section 10.7, “gem5 syscall emulation mode”. TODO automate this better.
@@ -8154,7 +8291,7 @@ qw er
@@ -9872,7 +10009,7 @@ xeyes
-
We disable networking by default because it starts an userland process, and we want to keep the number of userland processes to a minimum to make the system more understandable as explained at: Section 34.20.3, “Resource tradeoff guidelines”
+
We disable networking by default because it starts a userland process, and we want to keep the number of userland processes to a minimum to make the system more understandable as explained at: Section 37.20.3, “Resource tradeoff guidelines”
To enable networking on Buildroot, simply run:
@@ -10212,27 +10349,87 @@ mount -t 9p -o trans=virtio,version=9p2000.L host0 /mnt/my9p
-
TODO seems possible! Lets do it:
-
-
-
From the source, there is just one exported tag named gem5, so we could try on the guest:
+
Enable it by passing the --vio-9p option on the fs.py gem5 command line:
+
+
+
+
./run --arch aarch64 --emulator gem5 -- --vio-9p
+
+
+
mkdir -p /mnt/9p/gem5
-mount -t 9p -o trans=virtio,version=9p2000.L gem5 /mnt/9p/data
+mount -t 9p -o trans=virtio,version=9p2000.L,aname=/path/to/linux-kernel-module-cheat/out/run/gem5/aarch64/0/m5out/9p/share gem5 /mnt/9p/gem5
+echo asdf > /mnt/9p/gem5/qwer
+
+
Yes, you have to pass the full path to the directory on the host. Yes, this is horrible.
+
+
+
The shared directory is:
+
+
+
+
out/run/gem5/aarch64/0/m5out/9p/share
+
+
+
+
so we can observe the file the guest wrote from the host with:
+
+
+
+
cat out/run/gem5/aarch64/0/m5out/9p/share/qwer
+
+
+
+
+
+
echo zxvc > out/run/gem5/aarch64/0/m5out/9p/share/qwer
+
+
+
+
is now visible from the guest:
+
+
+
+
Checkpoint restore with an open mount will likely fail because gem5 uses an ugly external executable, diod, to implement the 9P server. The protocol is not very complex, and QEMU implements it in-tree, which is what gem5 should do as well at some point.
+
+
+
Checkpointing without --vio-9p and restoring with --vio-9p did not work either: the mount fails.
+
+
+
However, this did work, on guest:
+
+
+
+
umount /mnt/9p/gem5
+m5 checkpoint
+
+
+
+
then restore with the detailed CPU of interest e.g.
+
+
+
+
./run --arch aarch64 --emulator gem5 -- --vio-9p --cpu-type DerivO3CPU --caches
+
+
+
+
Tested on gem5 b2847f43c91e27f43bd4ac08abd528efcf00f2fd, LKMC 52a5fdd7c1d6eadc5900fc76e128995d4849aada.
+
@@ -10310,12 +10507,42 @@ mount -t nfs 10.0.2.2:/tmp /mnt/nfs
+
+
+
+
-
+
-
+
@@ -10360,7 +10587,7 @@ CONFIG_IKCONFIG_PROC=y
The following options can all be used together, sorted by decreasing config setting power precedence:
@@ -10406,11 +10633,11 @@ cp "$(./getvar linux_build_dir)/defconfig" data/myconfig
-
+
Get the build config in guest:
@@ -10468,14 +10695,14 @@ CONFIG_IKCONFIG_PROC=y
-
+
By default, build-linux generates a .config that is a mixture of:
-
-
a base config extracted from Buildroot’s minimal per machine .config, which has the minimal options needed to boot as explained at: Section 15.1.3.1, “About Buildroot’s kernel configs”.
+a base config extracted from Buildroot’s minimal per machine .config, which has the minimal options needed to boot as explained at: Section 16.1.3.1, “About Buildroot’s kernel configs”.
-
small overlays put on top of that
@@ -10516,18 +10743,18 @@ CONFIG_IKCONFIG_PROC=y
-
+
@@ -10538,7 +10765,7 @@ CONFIG_IKCONFIG_PROC=y
arm, on the other hand, uses buildroot/configs/qemu_arm_vexpress_defconfig, which contains BR2_LINUX_KERNEL_DEFCONFIG="vexpress", and therefore just does a make vexpress_defconfig, and gets its config from the Linux kernel tree itself.
-
+
To boot defconfig from disk on Linux and see a shell, all we need is these missing virtio options:
@@ -10629,12 +10856,12 @@ CONFIG_IKCONFIG_PROC=y
-
+
linux_config/min contains minimal tweaks required to boot gem5 or for using our slightly different QEMU command line options than Buildroot on all archs.
Having the same config working for both QEMU and gem5 (oh, the hours of bisection) means that you can deal with functional matters in QEMU, which runs much faster, and switch to gem5 only for performance issues.
@@ -10661,14 +10888,14 @@ CONFIG_IKCONFIG_PROC=y
-
+
Other configs which we had previously tested at 4e0d9af81fcce2ce4e777cb82a1990d7c2ca7c1e are:
-
+
-
+
We try to use the latest possible kernel major release version.
@@ -10704,7 +10931,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
During the update, all your kernel modules may break since the kernel API is not stable.
@@ -10721,15 +10948,15 @@ git log | grep -E ' Linux [0-9]+\.' | head
This also makes this repo the perfect setup to develop the Linux kernel.
-
In case something breaks while updating the Linux kernel, you can try to bisect it to understand the root cause, see: Section 34.17, “Bisection”.
+
In case something breaks while updating the Linux kernel, you can try to bisect it to understand the root cause, see: Section 37.17, “Bisection”.
-
+
-
Because the kernel is so central to this repository, almost all tests must be re-run, so basically just follow the full testing procedure described at: Section 34.16, “Test this repo”. The only tests that can be skipped are essentially the Baremetal tests.
+
Because the kernel is so central to this repository, almost all tests must be re-run, so basically just follow the full testing procedure described at: Section 37.16, “Test this repo”. The only tests that can be skipped are essentially the Baremetal tests.
Before committing, don’t forget to update:
@@ -10757,7 +10984,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
The kernel is not forward compatible, however, so downgrading the Linux kernel requires downgrading the userland too to the latest Buildroot branch that supports it.
@@ -10833,7 +11060,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
Bootloaders can pass a string as input to the Linux kernel when it is booting to control its behaviour, much like the execve system call does to userland processes.
@@ -10894,7 +11121,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
Double quotes can be used to escape spaces as in opt="a b", but double quotes themselves cannot be escaped, e.g. opt"a\"b"
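To make the quoting rule concrete, here is a simplified C model of how such a command line splits into arguments. It is a sketch written for this explanation, not the kernel's own parser: split_cmdline is a made-up name, quotes are left inside the returned tokens, and only spaces are treated as separators.

```c
#include <assert.h>
#include <stddef.h>

/* Split cmdline in place on spaces that are outside double quotes.
 * A double quote always toggles the quoted state and a backslash has no
 * special meaning, so a quote cannot be escaped.
 * Stores up to max_args token pointers in argv and returns how many. */
size_t split_cmdline(char *cmdline, char **argv, size_t max_args)
{
    size_t argc = 0;
    char *p = cmdline;

    assert(cmdline != NULL && argv != NULL);
    while (*p != '\0' && argc < max_args) {
        int in_quote = 0;

        while (*p == ' ')          /* skip separators before the token */
            p++;
        if (*p == '\0')
            break;
        argv[argc++] = p;
        for (; *p != '\0'; p++) {  /* find the end of the token */
            if (*p == '"')
                in_quote = !in_quote;
            else if (*p == ' ' && !in_quote)
                break;
        }
        if (*p == ' ')
            *p++ = '\0';           /* terminate the token */
    }
    return argc;
}
```

With this model, opt="a b" stays a single argument, while in opt="a\"b c" the second quote closes the string despite the backslash, so the space that follows splits the argument in two.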
@@ -10903,7 +11130,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
@@ -10946,7 +11173,7 @@ git log | grep -E ' Linux [0-9]+\.' | head
-
+
By default, the Linux kernel mounts the root filesystem as readonly. TODO rationale?
@@ -11018,7 +11245,7 @@ mount
-
+
Disable userland address space randomization. Test it out by running rand_check.out twice:
@@ -11042,7 +11269,7 @@ mount
-
+
printk is the most simple and widely used way of getting information from the kernel, so you should familiarize yourself with its basic configuration.
@@ -11122,10 +11349,10 @@ mount
-
+
The current printk level can be obtained with:
@@ -11312,7 +11539,7 @@ early_param("quiet", quiet_kernel);
-
+
./run --kernel-cli 'ignore_loglevel'
@@ -11331,7 +11558,7 @@ early_param("quiet", quiet_kernel);
-
+
@@ -11429,7 +11656,7 @@ insmod myprintk.ko
Get ready for the noisiest boot ever, I think it overflows the printk buffer and funny things happen.
-
+
When CONFIG_DYNAMIC_DEBUG is set, printk(KERN_DEBUG is not the exact same as pr_debug( since printk(KERN_DEBUG messages are visible with:
@@ -11478,9 +11705,9 @@ insmod myprintk.ko
-
+
-
+
@@ -11553,7 +11780,7 @@ parm: i:my favorite int
-
+
@@ -11580,7 +11807,7 @@ cat /sys/kernel/debug/lkmc_params
-
+
One module can depend on symbols of another module that are exported with EXPORT_SYMBOL:
@@ -11693,7 +11920,7 @@ extra/dep.ko:
TODO: what for, and at which point does Buildroot / BusyBox generate that file?
-
+
Unlike insmod, modprobe deals with kernel module dependencies for us.
@@ -11779,7 +12006,7 @@ buildroot_dep 16384 1 buildroot_dep2
-
+
Module metadata is stored on module files at compile time. Some of the fields can be retrieved through the THIS_MODULE struct module:
@@ -11899,7 +12126,7 @@ vermagic: 4.17.0 SMP mod_unload modversions
-
+
Vermagic is a magic string present in the kernel and on MODULE_INFO of kernel modules. It is used to verify that the kernel module was compiled against a compatible kernel version and relevant configuration:
@@ -11978,7 +12205,7 @@ vermagic: 4.17.0 SMP mod_unload modversions
-
+
init_module and cleanup_module are an older alternative to the module_init and module_exit macros:
@@ -12005,7 +12232,7 @@ cleanup_module
-
+
It is generally hard / impossible to use floating point operations in the kernel. TODO understand details.
@@ -12076,7 +12303,7 @@ cleanup_module
-
+
To test out kernel panics and oops in controlled circumstances, try out the modules:
@@ -12140,7 +12367,7 @@ insmod oops.ko
-
+
On panic, the kernel dies, and so does our terminal.
@@ -12190,7 +12417,7 @@ Kernel Offset: disabled
-
+
The log shows which module each symbol belongs to if any, e.g.:
@@ -12256,25 +12483,25 @@ Kernel Offset: disabled
-
+
Basically just calls panic("BUG!") for most archs.
-
+
For testing purposes, it is very useful to quit the emulator automatically with exit status non zero in case of kernel panic, instead of just hanging forever.
-
+
-
-
panic=-1 command line option which reboots the kernel immediately on panic, see: Section 15.6.1.4, “Reboot on panic”
+panic=-1 command line option which reboots the kernel immediately on panic, see: Section 16.6.1.4, “Reboot on panic”
-
QEMU -no-reboot, which makes QEMU exit when the guest tries to reboot
@@ -12292,7 +12519,7 @@ Kernel Offset: disabled
-
+
gem5 9048ef0ffbf21bedb803b785fb68f83e95c04db8 (January 2019) can detect panics automatically if the option system.panic_on_panic is on.
@@ -12353,7 +12580,7 @@ Kernel Offset: disabled
-
+
Make the kernel reboot after n seconds after panic:
@@ -12383,7 +12610,7 @@ Kernel Offset: disabled
-
+
If CONFIG_KALLSYMS=n, then addresses are shown on traces instead of symbol plus offset.
@@ -12422,7 +12649,7 @@ Kernel Offset: disabled
-
+
On oops, the shell still lives after.
@@ -12539,7 +12766,7 @@ CR2: 0000000000000000
-
+
The dump_stack function produces a stack trace much like panic and oops, but causes no problems and we return to the normal control flow, and can cleanly remove the module afterwards:
@@ -12553,7 +12780,7 @@ CR2: 0000000000000000
-
+
The WARN_ON macro basically just calls dump_stack.
@@ -12574,7 +12801,7 @@ insmod warn_on.ko
-
+
Let’s learn how to diagnose problems with the root filesystem not being found. TODO add a sample panic error message for each error type:
@@ -12725,7 +12952,7 @@ CONFIG_VIRTIO_PCI=y
-
+
Pseudo filesystems are filesystems that don’t represent actual files in a hard disk, but rather allow us to do special operations on filesystem-related system calls.
@@ -12746,7 +12973,7 @@ CONFIG_VIRTIO_PCI=y
-
+
Debugfs is the simplest pseudo filesystem to play around with:
@@ -12810,7 +13037,7 @@ echo $?
-
+
Procfs is just another fops entry point:
@@ -12861,7 +13088,7 @@ echo $?
-
+
Its data is shared with uname(), which is a POSIX C function and has a Linux syscall to back it up.
@@ -12902,7 +13129,7 @@ echo $?
-
+
Sysfs is more restricted than procfs, as it does not take an arbitrary file_operations:
@@ -12982,7 +13209,7 @@ echo $?
-
+
And also destroy it on rmmod:
@@ -13114,9 +13341,9 @@ echo $?
-
+
-
+
File operations are the main method of userland driver communication.
@@ -13169,7 +13396,7 @@ echo $?
-
+
Writing trivial read File operations is repetitive and error prone. The seq_file API makes the process much easier for those trivial cases:
@@ -13230,7 +13457,7 @@ echo $?
-
+
If you have the entire read output upfront, single_open is an even more convenient version of seq_file:
@@ -13273,7 +13500,7 @@ cd
-
+
The poll system call allows a user process to do a non-busy wait on a kernel event.
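The userland side of such a wait can be sketched as follows. This is an illustration only: it uses an ordinary pipe as a stand-in for the event source rather than a kernel module's poll file operation, and the names wait_readable and poll_pipe_demo are made up for the example.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Sleep until fd becomes readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    assert(fd >= 0);
    ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Demonstration on a pipe: nothing to read at first, then one byte arrives.
 * Returns 0 if poll behaved as expected, -1 otherwise. */
int poll_pipe_demo(void)
{
    int fds[2];
    int before, after;

    if (pipe(fds) != 0)
        return -1;
    before = wait_readable(fds[0], 0);    /* empty pipe: times out at once */
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    after = wait_readable(fds[0], 1000);  /* data is pending: returns at once */
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

The process spends the wait asleep inside the kernel instead of spinning, which is the point of poll. A driver takes part in the same mechanism by implementing the poll file operation.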
@@ -13378,7 +13605,7 @@ POLLIN n=10 buf=4294893839
-
+
The ioctl system call is the best way to pass an arbitrary number of parameters to the kernel in a single go:
@@ -13475,7 +13702,7 @@ echo $?
-
+
The mmap system call allows us to share memory between user and kernel space without copying:
@@ -13545,7 +13772,7 @@ echo $?
-
+
Anonymous inodes allow getting multiple file descriptors from a single filesystem entry, which reduces namespace pollution compared to creating multiple device files:
@@ -13593,7 +13820,7 @@ echo $?
-
+
Netlink sockets offer a socket API for kernel / userland communication:
@@ -13658,7 +13885,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Kernel threads are managed exactly like userland threads; they also have a backing task_struct, and are scheduled with the same mechanism:
@@ -13696,7 +13923,7 @@ for i in `seq 16`; do ./netlink.out & done
Bibliography:
@@ -13712,7 +13939,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Let’s launch two threads and see if they actually run in parallel:
@@ -13755,7 +13982,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Count to dmesg every one second from 0 up to n - 1:
@@ -13785,7 +14012,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Count from 0 to 9 every second infinitely many times by scheduling a new work item from a work item:
@@ -13844,7 +14071,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Let’s block the entire kernel! Yay:
@@ -13891,7 +14118,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Wait queues are a way to make a thread sleep until an event happens on the queue:
@@ -13955,7 +14182,7 @@ for i in `seq 16`; do ./netlink.out & done
-
+
Count from 0 to 9 infinitely many times in 1 second intervals using timers:
@@ -13996,9 +14223,9 @@ for i in `seq 16`; do ./netlink.out & done
-
+
-
+
Brute force monitor every shared interrupt that will accept us:
@@ -14104,7 +14331,7 @@ request_irq irq = 1 ret = 0
-
+
The Linux kernel v4.16 mainline also has a dummy-irq module at drivers/misc/dummy-irq.c for monitoring a single IRQ.
@@ -14162,7 +14389,7 @@ request_irq irq = 1 ret = 0
-
+
@@ -14196,12 +14423,12 @@ request_irq irq = 1 ret = 0
-
+
-
+
Convert a string to an integer:
@@ -14237,7 +14464,7 @@ echo $?
-
+
Convert a virtual address to physical:
@@ -14307,7 +14534,7 @@ virt_to_phys(&static_var) = 0x40002308
-
+
@@ -14428,7 +14655,7 @@ pid 110
-
+
The xp QEMU monitor command reads memory at a given physical address.
@@ -14459,7 +14686,7 @@ pid 110
-
+
/dev/mem exposes access to physical addresses, and we use it through the convenient devmem BusyBox utility.
@@ -14535,7 +14762,7 @@ Value at address 0X7C7B800 (0x7ff7dbe01800): 0x12345678
-
+
Dump the physical address of all pages mapped to a given process using /proc/<pid>/maps and /proc/<pid>/pagemap.
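The core of the lookup is decoding one 64-bit pagemap entry. The sketch below is not the repository's implementation: the helper names are made up, and it relies on the documented entry layout, where bit 63 means the page is present in RAM and bits 0 to 54 hold the page frame number.

```c
#include <assert.h>
#include <stdint.h>

#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)
#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1)

/* Byte offset inside /proc/<pid>/pagemap of the entry that describes vaddr:
 * the file holds one 64-bit entry per virtual page. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    assert(page_size != 0);
    return (vaddr / page_size) * sizeof(uint64_t);
}

/* Decode one pagemap entry for the page that contains vaddr.
 * Returns 1 and stores the physical address in *paddr if the page is
 * present in RAM, or returns 0 if it is not (e.g. swapped or unmapped). */
int pagemap_entry_to_paddr(uint64_t entry, uint64_t vaddr,
                           uint64_t page_size, uint64_t *paddr)
{
    /* page_size must be a power of two for the offset mask to be valid. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (!(entry & PAGEMAP_PRESENT))
        return 0;
    *paddr = (entry & PAGEMAP_PFN_MASK) * page_size +
             (vaddr & (page_size - 1));
    return 1;
}
```

To use it, read 8 bytes at pagemap_offset(vaddr, page_size) from /proc/<pid>/pagemap and pass them in as entry. On recent kernels the PFN field reads as zero without sufficient privileges, so run as root.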
@@ -14706,7 +14933,7 @@ pid 63
-
+
@@ -14724,7 +14951,7 @@ pid 63
I hope to have examples of all methods some day, since I’m obsessed with visibility.
-
+
@@ -14791,7 +15018,7 @@ a
-
+
0111ca406bdfa6fd65a2605d353583b4c4051781 was failing with:
@@ -14861,7 +15088,7 @@ make: *** [_all] Error 2
-
+
@@ -14987,7 +15214,7 @@ echo function_graph > current_tracer
-
+
kprobes is an instrumentation mechanism that injects arbitrary code at a given address in a trap instruction, much like GDB. Oh, the good old kernel. :-)
@@ -15052,7 +15279,7 @@ sleep 4 & sleep 4 &
-
+
TODO: didn’t port during refactor after 3b0a343647bed577586989fb702b760bd280844a. Reimplementing should not be hard.
@@ -15244,12 +15471,12 @@ instructions_firmware 20708
-
+
Make it harder to get hacked and easier to notice that you were, at the cost of some (small?) runtime overhead.
-
+
Detects buffer overflows for us:
@@ -15294,12 +15521,12 @@ detected buffer overflow in strlen
-
+
-
+
TODO get a hello world permission control working:
@@ -15390,13 +15617,13 @@ detected buffer overflow in strlen
-
+
@@ -15408,7 +15635,7 @@ detected buffer overflow in strlen
-
+
UIO is a kernel subsystem that allows certain types of driver operations to be done from userland.
@@ -15516,9 +15743,9 @@ detected buffer overflow in strlen
-
+
-
+
@@ -15562,7 +15789,7 @@ detected buffer overflow in strlen
-
+
@@ -15599,7 +15826,7 @@ sendkey shift-pgdown
-
+
@@ -15633,7 +15860,7 @@ sendkey shift-pgdown
Here is a minimal example of Ctrl Alt Del:
@@ -15817,7 +16044,7 @@ static void halt_reboot_pwoff(int sig)
-
+
We cannot test these actual shortcuts on QEMU since the host captures them at a lower level, but from:
@@ -15888,7 +16115,7 @@ static void halt_reboot_pwoff(int sig)
-
+
In order to play with TTYs, do this:
@@ -16127,7 +16354,7 @@ tty63::respawn:-/bin/sh
-
+
@@ -16184,7 +16411,7 @@ tty63::respawn:-/bin/sh
-
+
Take the command described at TTY and try adding the following:
@@ -16220,7 +16447,7 @@ tty63::respawn:-/bin/sh
-
+
@@ -16264,7 +16491,7 @@ tty63::respawn:-/bin/sh
-
+
./build-buildroot --config-fragment buildroot_config/kmscube
@@ -16409,7 +16636,7 @@ failed to initialize legacy DRM
-
+
@@ -16430,7 +16657,7 @@ failed to initialize legacy DRM
-
+
@@ -16456,12 +16683,12 @@ wget \
-
+
-
+
@@ -16508,7 +16735,7 @@ wget \
-
+
POSIX userland stress. Two versions:
@@ -16521,7 +16748,7 @@ wget \
Websites:
@@ -16563,9 +16790,9 @@ ps
-
+
-
+
Between all archs on QEMU and gem5 we touch all of those kernel built output files.
@@ -16581,7 +16808,7 @@ ps
-
+
@@ -16599,9 +16826,9 @@ ps
-
+
-
+
The following kernel modules and Baremetal executables dump and disassemble various registers which cannot be observed from userland (usually "system registers", "control registers"):
@@ -16679,9 +16906,86 @@ ps
-
+
+
+
+
TODO minimal build + boot on QEMU example anywhere???
+
+
+
+
+
+
+
+
+
+
+
+
Zephyr is an RTOS that has POSIX support. I think it works much like our Baremetal setup which uses Newlib and generates individual ELF files that contain both our C program’s code, and the Zephyr libraries.
+
+
+
TODO get a hello world working, and then consider further integration in this repo, e.g. being able to run all C userland content on it.
+
+
+
TODO: Cortex-A CPUs are not currently supported, there are some qemu_cortex_m0 boards, but can’t find a QEMU Cortex-A. There is an x86_64 qemu board, but we don’t currently have an x86 baremetal toolchain. For this reason, we won’t touch this further for now.
+
+
+
However, unlike Newlib, Zephyr must be setting up a simple pre-main runtime to be able to handle threads.
+
+
+
+
+
# https://askubuntu.com/questions/952429/is-there-a-good-ppa-for-cmake-backports
+wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add -
+sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic-rc main'
+sudo apt-get update
+sudo apt-get install cmake
+git clone https://github.com/zephyrproject-rtos/zephyr
+pip3 install --user -U west packaging
+cd zephyr
+git checkout v1.14.1
+west init zephyrproject
+west update
+export ZEPHYR_TOOLCHAIN_VARIANT=xtools
+export XTOOLS_TOOLCHAIN_PATH="$(pwd)/out/crosstool-ng/build/default/install/aarch64/bin/"
+source zephyr-env.sh
+west build -b qemu_aarch64 samples/hello_world
+
+
+
+
The build system of that project is a bit excessive / wonky. You need an edge CMake not present in Ubuntu 18.04, which I don’t want to install right now, and it uses the weird custom west build tool frontend.
+
+
+
+
+
+
+
TODO minimal setup to run it on QEMU? Possible?
+
+
+
+
+
+
+
+
+
TODO: get prototype working and then properly integrate:
@@ -16756,7 +17060,7 @@ ps
-
+
+
+
-
+
QEMU is a system simulator: it simulates a CPU and devices such as interrupt handlers, timers, UART, screen, keyboard, etc.
@@ -16803,7 +17128,7 @@ ps
-
+
@@ -16812,7 +17137,7 @@ ps
-
+
We disable disk persistency for both QEMU and gem5 by default, to prevent the emulator from putting the image in an unknown state.
@@ -16867,7 +17192,7 @@ ps
Disk persistency is useful to re-run shell commands from the history of a previous session with Ctrl-R, but we felt that the loss of determinism was not worth it.
-
+
TODO how to make gem5 disk writes persistent?
@@ -16897,7 +17222,7 @@ index 17498c42b..76b8b351d 100644
-
+
@@ -16906,7 +17231,7 @@ index 17498c42b..76b8b351d 100644
-
+
Snapshots are stored inside the .qcow2 images themselves.
@@ -17053,7 +17378,7 @@ Format specific information:
-
+
@@ -17098,12 +17423,12 @@ Format specific information:
-
+
-
+
PCI driver for our minimal pci_min.c QEMU fork device:
@@ -17173,7 +17498,7 @@ lkmc_pci_min mmio_write addr = 4 val = 0 size = 4
-
+
Small upstream educational PCI device:
@@ -17243,7 +17568,7 @@ lkmc_pci_min mmio_write addr = 4 val = 0 size = 4
-
+
In this section we will try to interact with PCI devices directly from userland without kernel modules.
@@ -17389,7 +17714,7 @@ devmem 0xfeb54000 w 0x12345678
-
+
There are two versions of setpci and lspci:
@@ -17405,7 +17730,7 @@ devmem 0xfeb54000 w 0x12345678
-
+
@@ -17465,7 +17790,7 @@ devmem 0xfeb54000 w 0x12345678
-
+
lspci -k shows something like:
@@ -17519,7 +17844,7 @@ devmem 0xfeb54000 w 0x12345678
-
+
@@ -17561,7 +17886,7 @@ pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
-
+
TODO: broken. Was working before we moved arm from -M versatilepb to -M virt around af210a76711b7fa4554dcc2abd0ddacfc810dfd4. Either make it work on -M virt if that is possible, or document precisely how to make it work with versatilepb, or hopefully vexpress which is newer.
@@ -17604,7 +17929,7 @@ pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
-
+
TODO: broken when arm moved to -M virt, same as GPIO.
@@ -17676,7 +18001,7 @@ echo 255 >brightness
-
+
Minimal platform device example coded into the -M versatilepb SoC of our QEMU fork.
@@ -17754,7 +18079,7 @@ insmod platform_device.ko
-
+
@@ -17764,7 +18089,7 @@ insmod platform_device.ko
-
+
@@ -17884,7 +18209,7 @@ insmod platform_device.ko
-
+
@@ -17901,7 +18226,7 @@ insmod platform_device.ko
-
+
When doing GDB step debug it is possible to send QEMU monitor commands through the GDB monitor command, which saves you the trouble of opening yet another shell.
@@ -17917,7 +18242,7 @@ monitor info qtree
-
+
When you start hacking QEMU or gem5, it is useful to see what is going on inside the emulators themselves.
@@ -17977,10 +18302,10 @@ run
The build outputs are automatically stored in different directories for optimized and debug builds, which prevents debug files from overwriting opt ones. Therefore, --gem5-build-id is not required.
When in QEMU text mode, using --debug-vm makes Ctrl-C not get passed to the QEMU guest anymore: it is instead captured by GDB itself, to allow breaking. So e.g. you won’t be able to easily quit from a guest program like:
@@ -17997,7 +18322,7 @@ run
You can still send key presses to QEMU however even without the mouse capture, just either click on the title bar, or alt tab to give it focus.
-
+
While step debugging any complex program, you always end up feeling the need to step in reverse to reach the last call to some function that was called before the failure point, in order to trace back the problem to the actual bug source.
@@ -18086,7 +18411,7 @@ reverse-next
-
+
Start pdb at the first instruction:
@@ -18120,7 +18445,7 @@ reverse-next
-
+
QEMU can log several different events.
@@ -18211,7 +18536,7 @@ Call Trace:
-
+
QEMU also has a second trace mechanism in addition to -trace, find out the events with:
@@ -18252,7 +18577,7 @@ IN:
-
+
@@ -18305,7 +18630,7 @@ of guest operations.
-
+
We can further use Binutils' addr2line to get the line that corresponds to each address:
@@ -18361,7 +18686,7 @@ less "$(./getvar --arch x86_64 run_dir)/trace-lines.txt"
-
+
QEMU runs, unlike gem5, are not deterministic by default, however it does support a record and replay mechanism that allows you to replay a previous run deterministically.
@@ -18468,7 +18793,7 @@ less "$(./getvar --arch x86_64 run_dir)/trace-lines.txt"
Solved on unmerged c42634d8e3428cfa60672c3ba89cabefc720cde9 from https://github.com/ispras/qemu/tree/rr-180725
-
+
@@ -18507,7 +18832,7 @@ reverse-continue
-
+
TODO: is there any way to distinguish which instruction runs on each core? Doing:
@@ -18522,13 +18847,13 @@ reverse-continue
-
+
@@ -18639,7 +18964,7 @@ less "$(./getvar --arch aarch64 run_dir)/trace-lines.txt"
TODO: 7452d399290c9c1fc6366cdad129ef442f323564 ./trace2line this is too slow and takes hours. QEMU’s processing of 170k events takes 7 seconds. gem5’s processing is analogous, but there are 140M events, so it should take 7000 seconds ~ 2 hours which seems consistent with what I observe, so maybe there is no way to speed this up… The workaround is to just use gem5’s ExecSymbol to get function granularity, and then GDB individually if line detail is needed?
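The estimate above is a linear extrapolation from the QEMU numbers. A minimal sketch of that arithmetic, assuming the per-event cost is the same for both emulators (which is only a guess):

```python
# Linear extrapolation of trace post-processing time, using the figures quoted above.
qemu_events = 170_000        # events in the QEMU trace
qemu_seconds = 7             # time taken to process them
gem5_events = 140_000_000    # events in the gem5 trace

# Assumption: each event costs the same to process in both cases.
seconds_per_event = qemu_seconds / qemu_events
gem5_seconds = gem5_events * seconds_per_event
gem5_hours = gem5_seconds / 3600

print(f"{gem5_seconds:.0f} s ~ {gem5_hours:.1f} h")
```

This lands in the same "hours, not minutes" range as the rough figure above, so the slowness looks inherent to the event count rather than to a bug in the script.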
-
+
gem5 traces are generated from DPRINTF(<trace-id> calls scattered throughout the code, except for ExecAll instruction traces, which use Debug::ExecEnable directly.
@@ -18676,7 +19001,7 @@ extern SimpleFlag ExecEnable;
-
+
This debug flag traces all instructions.
@@ -18714,7 +19039,7 @@ extern SimpleFlag ExecEnable;
25007500: time count in some unit. Note how the microops execute at further timestamps.
-system.cpu: distinguishes between CPUs when there are more than one. For example, running Section 28.10.3, “ARM baremetal multicore” with two cores produces system.cpu0 and system.cpu1
+system.cpu: distinguishes between CPUs when there are more than one. For example, running Section 32.10.3, “ARM baremetal multicore” with two cores produces system.cpu0 and system.cpu1
T0: thread number. TODO: hyperthread? How to play with it?
@@ -18759,7 +19084,7 @@ extern SimpleFlag ExecEnable;
-
+
@@ -18814,13 +19139,13 @@ add x1, x0, 2
-
+
As of gem5 16eeee5356585441a49d05c78abc328ef09f7ace the default tracer is ExeTracer. It is set at:
@@ -18893,7 +19218,7 @@ src/arch/x86/nativetrace.hh:41:class X86NativeTrace : public NativeTrace
-
+
Sometimes in Ubuntu 14.04, after the QEMU SDL GUI starts, it does not get updated after keyboard strokes, and there are artifacts like disappearing text.
@@ -18917,7 +19242,7 @@ root
-
+
-
+
-
@@ -19007,7 +19332,7 @@ root
runs are deterministic by default, unlike QEMU which has a special QEMU record and replay mode, that requires first playing the content once and then replaying
-
-
gem5 ARM at least appears to implement more low level CPU functionality than QEMU, e.g. QEMU only added EL2 in 2018: https://stackoverflow.com/questions/42824706/qemu-system-aarch64-entering-el1-when-emulating-a53-power-up See also: Section 28.10.1, “ARM exception levels”
+gem5 ARM at least appears to implement more low level CPU functionality than QEMU, e.g. QEMU only added EL2 in 2018: https://stackoverflow.com/questions/42824706/qemu-system-aarch64-entering-el1-when-emulating-a53-power-up See also: Section 32.10.1, “ARM exception levels”
-
gem5 offers more advanced logging, even for non micro architectural things which QEMU models in some way, e.g. QEMU trace memory accesses, because QEMU’s binary translation optimizations reduce visibility
@@ -19020,7 +19345,7 @@ root
-
+
OK, this is why we used gem5 in the first place, performance measurements!
@@ -19211,7 +19536,7 @@ cat out/gem5-bench-dhrystone.txt
Now you can play a fun little game with your friends:
@@ -19246,7 +19571,7 @@ cat out/gem5-bench-dhrystone.txt
To find out why your program is slow, a good first step is to have a look at the gem5 m5out/stats.txt file.
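Since stats.txt is plain text with one "name value # description" entry per line, it is easy to load into a script for comparisons between runs. A minimal sketch, assuming that layout; parse_stats is a hypothetical helper, not part of gem5 or of this repository, and the sim_insts value in the sample is made up for illustration:

```python
import re

# One statistic per line: "<name> <value> # <description>".
STAT_LINE = re.compile(r"^(\S+)\s+(\S+)")

def parse_stats(text):
    """Return {stat_name: float_value} for the lines of a stats.txt dump."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "---------- Begin/End ..." separators.
        if not line or line.startswith("-"):
            continue
        match = STAT_LINE.match(line)
        if not match:
            continue
        name, value = match.groups()
        try:
            stats[name] = float(value)
        except ValueError:
            # Not a numeric entry, ignore it in this sketch.
            pass
    return stats

# Illustrative sample; the sim_insts value is invented.
sample = """
---------- Begin Simulation Statistics ----------
sim_ops                                             6                       # Number of ops (including micro ops) simulated
sim_insts                                           5                       # Number of instructions simulated
---------- End Simulation Statistics   ----------
"""
stats = parse_stats(sample)
print(stats["sim_ops"])
```

On a real run you would pass the contents of the stats.txt file under the m5out directory instead of the inline sample.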
-
+
@@ -19282,7 +19607,7 @@ cat out/gem5-bench-dhrystone.txt
-
+
Besides optimizing a program for a given CPU setup, chip developers can also do the inverse, and optimize the chip for a given benchmark!
@@ -19290,7 +19615,7 @@ cat out/gem5-bench-dhrystone.txt
The rabbit hole is likely deep, but let’s scratch a bit of the surface.
-
+
./run --arch arm --cpus 2 --emulator gem5
@@ -19337,7 +19662,7 @@ getconf _NPROCESSORS_CONF
-
+
User mode simulation QEMU v4.0.0 always shows the number of cores of the host, presumably because the thread switching uses host threads directly which would make that harder to implement.
@@ -19374,7 +19699,7 @@ ps Haux | grep qemu | wc
-
+
@@ -19416,7 +19741,7 @@ ps Haux | grep qemu | wc
-
+
@@ -19603,12 +19928,12 @@ instructions 91738770
-
+
-
+
TODO These look promising:
@@ -19649,7 +19974,7 @@ instructions 91738770
we have no caches, each instruction is fetched from memory
-each loop contains 11 instructions as shown at Section 32.2, “C busy loop”
+each loop contains 11 instructions as shown at Section 35.2, “C busy loop”
and supposing that the loop dominated executable pre/post main, which we know is true since as shown in Benchmark emulators on userland executables an empty dynamically linked C program only has about 100k instructions, while our loop runs 1000000 * 11 = 11M.
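The "loop dominates" assumption can be checked with a quick back-of-the-envelope calculation. This is just a sketch of the arithmetic, using the iteration count, the 11 instructions per iteration and the roughly 100k startup instructions mentioned above:

```python
loop_iterations = 1_000_000
instructions_per_iteration = 11
startup_instructions = 100_000  # approximate cost of an empty dynamically linked C program

loop_instructions = loop_iterations * instructions_per_iteration
ratio = loop_instructions / startup_instructions

print(loop_instructions, ratio)
```

The loop accounts for about two orders of magnitude more instructions than the pre/post main code, so ignoring the latter is a reasonable approximation.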
@@ -19677,7 +20002,7 @@ instructions 91738770
-
+
Can be set across emulators with:
@@ -19781,7 +20106,7 @@ get_avphys_pages() * sysconf(_SC_PAGESIZE) = 0x1D178000
-
+
@@ -19839,7 +20164,7 @@ get_avphys_pages() * sysconf(_SC_PAGESIZE) = 0x1D178000
-
+
TODO These look promising:
@@ -19854,7 +20179,7 @@ get_avphys_pages() * sysconf(_SC_PAGESIZE) = 0x1D178000
-
+
As of gem5 872cb227fdc0b4d60acc7840889d567a6936b6e1 defaults to 2GHz for fs.py:
@@ -19939,7 +20264,7 @@ hello
-
+
@@ -19972,9 +20297,9 @@ hello
-
+
-
+
Analogous to QEMU, on the first shell:
@@ -20007,7 +20332,7 @@ hello
-
+
@@ -20042,7 +20367,7 @@ hello
-
+
gem5’s secondary core GDB setup is a hack and spawns one gdbserver for each core in separate ports, e.g. 7000, 7001, etc.
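To make the port scheme concrete, here is a small sketch that prints the GDB command to attach to each core, assuming the ports are simply consecutive starting from 7000 as in the example above. gdb_port is a hypothetical helper for illustration only:

```python
BASE_PORT = 7000  # first gdbserver port, as in the example above

def gdb_port(core_index, base_port=BASE_PORT):
    """Port of the gdbserver for a given core, assuming consecutive ports."""
    return base_port + core_index

for core in range(2):
    print(f"core {core}: target remote localhost:{gdb_port(core)}")
```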
@@ -20063,7 +20388,7 @@ hello
-
+
Analogous to QEMU’s Snapshot, but better since it can be started from inside the guest, so we can easily checkpoint after a specific guest event, e.g. just before init is done.
@@ -20151,7 +20476,7 @@ m5 checkpoint
since boot has already happened, and the parameters are already in the RAM of the snapshot.
-
+
@@ -20205,7 +20530,7 @@ Exiting @ tick 84500 because m5_exit instruction encountered
-
+
@@ -20256,7 +20581,7 @@ prvEvalTick=0
-
+
You want to automate running several tests from a single pristine post-boot state.
@@ -20404,7 +20729,7 @@ expect eof
-
+
gem5 can switch to a different CPU model when restoring a checkpoint.
@@ -20519,7 +20844,7 @@ cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
-
+
Besides switching CPUs after a checkpoint restore, fs.py also has the --fast-forward option to automatically run the script from the start on a less detailed CPU, and switch to a more detailed CPU at a given tick.
@@ -20645,7 +20970,7 @@ FullO3CPU: Ticking main, FullO3CPU.
-
+
The in-tree util/cpt_upgrader.py is a tool to upgrade checkpoints taken from an older version of gem5 to be compatible with the newest version, so you can update gem5 without having to re-run the simulation that generated the checkpoints.
@@ -20681,7 +21006,7 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
-
+
Remember that in the gem5 command line, we can either pass options to the script being run as in:
@@ -20738,7 +21063,7 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
-
+
m5ops are magic instructions which lead gem5 to do magic things, like quitting or dumping stats.
@@ -20778,7 +21103,7 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
-
+
m5 is a guest command line utility that is installed and run on the guest, that serves as a CLI front-end for the m5ops
@@ -20808,7 +21133,7 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
This can be a good m5ops test since it executes very quickly.
-
+
@@ -20817,13 +21142,13 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
-
+
End the simulation with a failure exit event:
@@ -20862,7 +21187,7 @@ version_tags=arm-ccregs arm-contextidr-el2 arm-gem5-gic-ext ...
-
+
Send a guest file to the host. 9P is a more advanced alternative.
@@ -20893,7 +21218,7 @@ m5 writefile myfileguest myfilehost
-
+
Read a host file pointed to by the fs.py --script option to stdout.
@@ -20921,7 +21246,7 @@ m5 writefile myfileguest myfilehost
-
+
Ermm, just another m5 readfile that only takes integers and only from CLI options? Is this software so redundant?
@@ -20947,7 +21272,7 @@ m5 writefile myfileguest myfilehost
-
+
Trivial combination of m5 readfile + execute the script.
@@ -20982,7 +21307,7 @@ m5 execfile
-
+
gem5 allocates some magic instructions on unused instruction encodings for convenient guest instrumentation.
@@ -21075,7 +21400,7 @@ m5 execfile
-
+
@@ -21189,7 +21514,7 @@ m5_fail(ints[1], ints[0]);
-
+
include/gem5/asm/generic/m5ops.h also describes some annotation instructions.
@@ -21200,7 +21525,7 @@ m5_fail(ints[1], ints[0]);
@@ -21288,7 +21613,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
Tested on 649d06d6758cefd080d04dc47fd6a5a26a620874 + 1.
-
+
We have observed that with the kernel patches, boot is 2x faster, falling from 1m40s to 50s.
@@ -21306,7 +21631,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
-
+
When you run gem5, it generates an m5out directory at:
@@ -21322,7 +21647,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
The files in that directory contain some very important information about the run, and you should become familiar with every one of them.
-
+
Contains UART output, either from the Linux kernel or from the baremetal system.
@@ -21331,7 +21656,7 @@ git -C "$(./getvar linux_source_dir)" checkout -
-
+
This file used to be called just m5out/system.dmesg, but the name was changed after the workload refactorings of March 2020.
@@ -21405,7 +21730,7 @@ index f296d89be757..3e79916322c2 100644
-
+
This file contains important statistics about the run:
@@ -21504,7 +21829,7 @@ system.cpu.dtb.inst_hits
and after that the file size went down to 21KB.
-
+
We can make gem5 dump statistics in the HDF5 format by adding the magic h5:// prefix to the file name as in:
@@ -21554,7 +21879,7 @@ system.cpu.dtb.inst_hits
-
+
@@ -21566,7 +21891,7 @@ system.cpu.dtb.inst_hits
-
+
Well, run minimal examples, and reverse engineer them up!
@@ -21624,7 +21949,7 @@ sim_ops 6 # Number of ops (including micro ops) simulated
-
+
@@ -21698,7 +22023,7 @@ Text::end()
-
+
The m5out/config.ini file contains a very good high level description of the system:
@@ -21771,7 +22096,7 @@ clock=500
Modifying the config.ini file manually does nothing since it gets overwritten every time.
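Because config.ini is read-only output in INI syntax, a convenient way to explore it is to load it with a generic INI parser. A minimal sketch using Python's standard configparser; the sample sections are illustrative stand-ins for a real file, and types_by_section is a hypothetical helper:

```python
import configparser

# Illustrative stand-in for a real m5out/config.ini.
sample = """\
[system.realview.flash0]
type=SimpleMemory

[system.cpu]
type=AtomicSimpleCPU
"""

def types_by_section(ini_text):
    """Map each SimObject path (section name) to its type= value."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the original case of the keys
    parser.read_string(ini_text)
    return {section: parser[section].get("type") for section in parser.sections()}

print(types_by_section(sample))
```

This gives a quick overview of which SimObject type sits at each path without having to scroll through the full file.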
-
+
The m5out/config.dot file contains a graphviz .dot file that provides a simplified graphical view of a subset of the gem5 config.ini.
@@ -21852,7 +22177,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
We use the m5term in-tree executable to connect to the terminal instead of a direct telnet.
@@ -21877,7 +22202,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
We have made a crazy setup that allows you to just cd into submodules/gem5, and edit Python scripts directly there.
@@ -21911,7 +22236,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
By default, we use the configs/example/fs.py script.
@@ -21960,7 +22285,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
@@ -21971,7 +22296,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
But can the people from the project be convinced of that?
-
+
These are just very small GTest tests that test a single class in isolation, they don’t run any executables.
@@ -22026,7 +22351,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
This section is about running the gem5 in-tree tests.
@@ -22075,7 +22400,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
-
+
This error happens when the following instruction limits are reached:
@@ -22211,24 +22536,24 @@ Exiting @ tick 18446744073709551615 because simulate() limit reached
-
+
In order to use different build options, you might also want to use gem5 build variants to keep the build outputs separate from one another.
-
+
If you build gem5 with scons build/ARM/gem5.debug, then that is a .debug build.
-
+
./build-gem5 --gem5-build-type fast
@@ -22246,13 +22571,13 @@ Exiting @ tick 18446744073709551615 because simulate() limit reached
-
+
Profiling builds as of 3cea7d9ce49bda49c50e756339ff1287fd55df77 both use: -g -O3 and disable asserts and logging like the gem5 fast build and:
@@ -22280,7 +22605,7 @@ gprof "$(./getvar --arch aarch64 gem5_executable)" > tmp.gprof
-
+
TODO test properly, benchmark vs GCC.
@@ -22293,7 +22618,7 @@ gprof "$(./getvar --arch aarch64 gem5_executable)" > tmp.gprof
-
+
If gem5 appears to have a C++ undefined behaviour bug, which is often very difficult to track down, you can try to build it with the following extra SCons options:
@@ -22367,7 +22692,7 @@ Indirect leak of 1346 byte(s) in 2 object(s) allocated from:
-
+
gem5 has two types of memory system:
@@ -22503,7 +22828,7 @@ cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
Tested in gem5 d7d9bc240615625141cd6feddbadd392457e49eb.
-
+
This is the simplest of all protocols, and therefore the first one you should study to learn how Ruby works.
@@ -22539,7 +22864,7 @@ cat "$(./getvar --arch aarch64 --emulator gem5 trace_txt_file)"
-
+
@@ -22590,7 +22915,7 @@ class SystemXBar(CoherentXBar):
-
+
Python 3 support was mostly added in 2019 Q3 at around a347a1a68b8a6e370334be3a1d2d66675891e0f1 but remained buggy for some time afterwards.
@@ -22608,7 +22933,7 @@ class SystemXBar(CoherentXBar):
-
+
gem5 has a few in tree CPU models for different purposes.
@@ -22691,9 +23016,9 @@ class SystemXBar(CoherentXBar):
From this we see that there are basically only 4 C++ CPU models in gem5: Atomic, Timing, Minor and O3. All others are basically parametrizations of those base types.
-
+
-
+
Simple abstract CPU without a pipeline.
@@ -22714,7 +23039,7 @@ class SystemXBar(CoherentXBar):
-
+
AtomicSimpleCPU: the default one. Memory accesses happen instantaneously. The fastest simulation except for KVM, but not realistic at all.
@@ -22723,7 +23048,7 @@ class SystemXBar(CoherentXBar):
-
+
TimingSimpleCPU: memory accesses are realistic, but the CPU has no pipeline. The simulation is faster than detailed models, but slower than AtomicSimpleCPU.
@@ -22739,7 +23064,7 @@ class SystemXBar(CoherentXBar):
-
+
@@ -22805,7 +23130,7 @@ class SystemXBar(CoherentXBar):
-
+
@@ -22865,7 +23190,7 @@ wbWidth=8
-
+
-
@@ -22884,7 +23209,7 @@ wbWidth=8
-
+
@@ -22918,7 +23243,7 @@ less o3pipeview.tmp.log
-
+
@@ -22938,7 +23263,7 @@ less o3pipeview.tmp.log
-
+
@@ -22948,7 +23273,7 @@ less o3pipeview.tmp.log
-
+
The gem5 platform is selectable with the --machine option, which is named after the analogous QEMU -machine option, and which sets the --machine-type.
@@ -22976,7 +23301,7 @@ less o3pipeview.tmp.log
-
+
@@ -23030,7 +23355,7 @@ cd ..
-
+
Certain ISAs like ARM have bootloaders that are automatically run before the main image to setup basic system state.
@@ -23063,12 +23388,12 @@ cd ..
-
+
-
+
The gem5 memory system is connected in a very flexible way through the port system.
@@ -23079,7 +23404,7 @@ cd ..
A Packet is the basic information unit that gets sent across ports.
-
+
gem5 memory requests can be classified in the following broad categories:
@@ -23289,7 +23614,7 @@ TimingSimpleCPU::finishTranslation(WholeTranslationState *state)
Tested in gem5 b1623cb2087873f64197e503ab8894b5e4d4c7b4.
-
+
@@ -23336,9 +23661,9 @@ TimingSimpleCPU::finishTranslation(WholeTranslationState *state)
-
+
-
+
Packet is what goes through ports: a single packet is sent out to the memory system, gets modified when it hits valid data, and then returns with the reply.
@@ -23421,7 +23746,7 @@ Addr addr;
-
+
@@ -23507,7 +23832,7 @@ MemCmd::commandInfo[] =
-
+
One good way to think about Request vs Packet could be "it is what the instruction definitions see", a bit like ExecContext vs ThreadContext.
@@ -23574,7 +23899,7 @@ Addr _vaddr = MaxAddr;
-
+
In AtomicSimpleCPU, a single packet of each type is kept for the entire CPU, e.g.:
@@ -23635,7 +23960,7 @@ TLB::translateMmuOn(ThreadContext* tc, const RequestPtr &req, Mode mode,
-
+
@@ -23670,7 +23995,7 @@ TimingSimpleCPU::initiateMemRead(Addr addr, unsigned size,
-
+
@@ -23732,7 +24057,7 @@ TimingSimpleCPU::initiateMemRead(Addr addr, unsigned size,
-
+
You can place this SimObject in between two ports to get extra statistics about the packets that are going through.
@@ -23777,9 +24102,32 @@ TimingSimpleCPU::initiateMemRead(Addr addr, unsigned size,
One neat thing about this is that it is agnostic to the memory object type, so you don’t have to recode those statistics for every new type of object that operates on memory packets.
+
+
+
+
SimpleMemory is a highly simplified memory system. It can replace a more complex DRAM model if you use it e.g. as:
+
+
+
+
./run --emulator gem5 -- --mem-type SimpleMemory
+
+
+
+
and it also gets used in certain system-y memories present in ARM systems by default e.g. Flash memory:
+
+
+
+
[system.realview.flash0]
+type=SimpleMemory
+
+
+
+
As of gem5 3ca404da175a66e0b958165ad75eb5f54cb5e772 LKMC 059a7ef9d9c378a6d1d327ae97d90b78183680b2 it did not provide any speedup to the Linux kernel boot according to a quick test.
+
+
-
+
Internals under other sections:
@@ -23800,7 +24148,7 @@ TimingSimpleCPU::initiateMemRead(Addr addr, unsigned size,
-
+
@@ -23862,7 +24210,7 @@ TimingSimpleCPU::initiateMemRead(Addr addr, unsigned size,
-
+
@@ -24047,7 +24395,7 @@ static EmbeddedPyBind embed_obj("BadDevice", module_init, "BasicPioDevice");
-
+
The main is at: src/sim/main.cc. It calls:
@@ -24135,7 +24483,7 @@ exec filecode in scope
Tested at gem5 b4879ae5b0b6644e6836b0881e4da05c64a6550d.
-
+
All SimObjects seem to be automatically added to the m5.objects namespace, and this is done in a very convoluted way, let’s try to understand a bit:
@@ -24300,7 +24648,7 @@ for source in PySource.all:
-
+
gem5 is an event based simulator, and as such the event queue is one of the crucial elements in the system.
@@ -24406,7 +24754,7 @@ b EventFunctionWrapper::process
Then, once we had that, the most perfect thing ever would be to make the full event graph containing which events schedule which events!
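To illustrate what such an event graph would contain, here is a toy discrete-event queue that records which event scheduled which. This is not gem5's EventQueue, just a sketch of the same idea; the event names and the 500 tick latency are made up:

```python
import heapq
import itertools

class EventQueue:
    """Toy discrete-event queue that also records the scheduling graph."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for events at the same tick
        self.now = 0
        self.current = None   # name of the event being processed, None before run()
        self.edges = []       # (scheduler, scheduled, tick) triples

    def schedule(self, tick, name, callback):
        self.edges.append((self.current, name, tick))
        heapq.heappush(self._heap, (tick, next(self._order), name, callback))

    def run(self):
        # Pop events in tick order until none are left.
        while self._heap:
            self.now, _, self.current, callback = heapq.heappop(self._heap)
            callback(self)
        self.current = None

def fetch(queue):
    # Hypothetical latency: the memory reply arrives 500 ticks later.
    queue.schedule(queue.now + 500, "mem_reply", mem_reply)

def mem_reply(queue):
    # The reply triggers the next fetch, until we decide to stop.
    if queue.now < 1500:
        queue.schedule(queue.now, "fetch", fetch)

queue = EventQueue()
queue.schedule(0, "fetch", fetch)
queue.run()
for scheduler, scheduled, tick in queue.edges:
    print(f"{scheduler} -> {scheduled} @ {tick}")
```

The edges list is exactly the "which events schedule which events" graph for this toy run, and could be dumped to a graphviz .dot file for visualization.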
-
+
@@ -24542,7 +24890,7 @@ AtomicSimpleCPU::tick() at atomic.cc:757 0x55555907834c
Tested in gem5 12c917de54145d2d50260035ba7fa614e25317a3.
-
+
Let’s have a closer look at the initial magically scheduled events of the simulation.
@@ -24761,7 +25109,7 @@ simulate() at simulate.cc:104 0x555559476d6f
-
+
Inside AtomicSimpleCPU::tick() we saw previously that the reschedule happens at:
@@ -24801,7 +25149,7 @@ clock=500
-
+
It will be interesting to see how AtomicSimpleCPU makes memory accesses in GDB and to compare that with TimingSimpleCPU.
@@ -24855,7 +25203,7 @@ clock=500
-
+
Happens on EmulationPageTable, and seems to happen atomically without making any extra memory requests.
@@ -24926,7 +25274,7 @@ Exiting @ tick 3500 because exiting with last active thread context
-
+
Now, let’s move on to TimingSimpleCPU, which is just like AtomicSimpleCPU internally, but now the memory requests don’t actually finish immediately: gem5 CPU types!
@@ -25207,7 +25555,7 @@ info: Entering event queue @ 0. Starting simulation...
-
+
Schedules TimingSimpleCPU::fetch through:
@@ -25252,7 +25600,7 @@ ArmLinuxProcess64::initState
-
+
@@ -25383,7 +25731,7 @@ DRAMCtrl::Rank::startup(Tick ref_tick)
-
+
@@ -25416,13 +25764,13 @@ DRAMCtrl::Rank::startup(Tick ref_tick)
-
+
From the timing we know what that one is: the end of time exit event, like for AtomicSimpleCPU.
-
+
Executes TimingSimpleCPU::fetch().
@@ -25530,7 +25878,7 @@ DRAMCtrl::Rank::startup(Tick ref_tick)
-
+
Schedules DRAMCtrl::processNextReqEvent through:
@@ -25667,7 +26015,7 @@ TimingSimpleCPU::fetch
-
+
Schedules BaseXBar::Layer::releaseLayer through:
@@ -25693,13 +26041,13 @@ TimingSimpleCPU::fetch
-
+
Executes DRAMCtrl::processNextReqEvent.
-
+
Schedules DRAMCtrl::Rank::processActivateEvent through:
@@ -25713,7 +26061,7 @@ DRAMCtrl::processNextReqEvent
-
+
Schedules DRAMCtrl::processRespondEvent through:
@@ -25725,7 +26073,7 @@ DRAMCtrl::processNextReqEvent
-
+
Schedules DRAMCtrl::processNextReqEvent through:
@@ -25737,7 +26085,7 @@ DRAMCtrl::processNextReqEvent
-
+
Executes DRAMCtrl::Rank::processActivateEvent.
@@ -25746,7 +26094,7 @@ DRAMCtrl::processNextReqEvent
-
+
Schedules DRAMCtrl::Rank::processPowerEvent through:
@@ -25759,7 +26107,7 @@ DRAMCtrl::Rank::processActivateEvent
-
+
Executes DRAMCtrl::Rank::processPowerEvent.
@@ -25768,25 +26116,25 @@ DRAMCtrl::Rank::processActivateEvent
-
+
Executes BaseXBar::Layer<SrcType, DstType>::releaseLayer.
-
+
Executes DRAMCtrl::processNextReqEvent().
-
+
Executes DRAMCtrl::processRespondEvent().
-
+
Schedules PacketQueue::processSendEvent() through:
@@ -25801,13 +26149,13 @@ DRAMCtrl::processRespondEvent
-
+
Executes PacketQueue::processSendEvent().
-
+
Schedules PacketQueue::processSendEvent through:
@@ -25831,7 +26179,7 @@ PacketQueue::processSendEvent
-
+
Schedules BaseXBar::Layer<SrcType, DstType>::releaseLayer through:
@@ -25851,19 +26199,19 @@ PacketQueue::processSendEvent
-
+
Executes BaseXBar::Layer<SrcType, DstType>::releaseLayer.
-
+
Executes PacketQueue::processSendEvent.
-
+
Schedules TimingSimpleCPU::IcachePort::ITickEvent::process() through:
@@ -25881,7 +26229,7 @@ PacketQueue::processSendEvent
-
+
Executes TimingSimpleCPU::IcachePort::ITickEvent::process().
@@ -25901,7 +26249,7 @@ PacketQueue::processSendEvent
-
+
Schedules DRAMCtrl::processNextReqEvent through:
@@ -25930,7 +26278,7 @@ TimingSimpleCPU::IcachePort::ITickEvent::process
-
+
Schedules BaseXBar::Layer<SrcType, DstType>::releaseLayer through:
@@ -25956,19 +26304,19 @@ TimingSimpleCPU::IcachePort::ITickEvent::process
-
+
Execute DRAMCtrl::processNextReqEvent.
-
+
Schedule DRAMCtrl::processRespondEvent().
-
+
One important thing we want to check now, is how the memory reads are going to make the processor stall in the middle of an instruction.
@@ -26086,7 +26434,7 @@ TimingSimpleCPU::IcachePort::ITickEvent::process
-
+
@@ -26402,7 +26750,7 @@ type=SetAssociative
At 1000, the future event is executed, and so it reads the original packet from the MSHR, and uses that to create a new request [40:7f] which gets forwarded.
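The [40:7f] range is simply the cache line that contains the requested address. A small sketch of that alignment, assuming a 64 byte cache line (which matches the width of [40:7f]); the 0x48 input address is a hypothetical example:

```python
CACHE_LINE_SIZE = 64  # bytes; assumed, consistent with the width of [40:7f]

def cache_line_range(addr, line_size=CACHE_LINE_SIZE):
    """Return the inclusive (start, end) addresses of the line containing addr."""
    start = addr & ~(line_size - 1)  # clear the low bits to align down
    return start, start + line_size - 1

start, end = cache_line_range(0x48)
print(f"[{start:x}:{end:x}]")
```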
-
+
It would be amazing to analyze a simple example with interconnect packets possibly invalidating caches of other CPUs.
@@ -26625,7 +26973,7 @@ type=SetAssociative
-
+
@@ -26950,7 +27298,7 @@ global 147
-
+
@@ -26992,7 +27340,7 @@ non-atomic 19
-
+
@@ -27162,14 +27510,14 @@ non-atomic 19
-
+
@@ -27197,7 +27545,7 @@ non-atomic 19
This section and children are tested at LKMC 144a552cf926ea630ef9eadbb22b79fe2468c456.
-
+
@@ -27436,7 +27784,7 @@ non-atomic 19
-
+
@@ -27480,7 +27828,7 @@ non-atomic 19
-
+
@@ -27521,7 +27869,7 @@ non-atomic 19
-
+
@@ -27572,7 +27920,7 @@ non-atomic 19
-
+
@@ -27659,7 +28007,7 @@ non-atomic 19
-
+
@@ -27726,7 +28074,7 @@ non-atomic 19
-
+
@@ -27915,7 +28263,7 @@ wbActual:0
-
+
This is one of the parts of gem5 that rely on semi-useless code generation inside the .isa sublanguage.
@@ -27958,7 +28306,7 @@ wbActual:0
The file is an include so that compilation can be split up into chunks by the autogenerated includers
@@ -28163,7 +28511,7 @@ namespace ArmISAInst {
Tested in gem5 b1623cb2087873f64197e503ab8894b5e4d4c7b4.
-
+
completeAcc is boring on most simple store memory instructions, e.g. a simple STR:
@@ -28270,7 +28618,7 @@ namespace ArmISAInst {
-
+
Some gem5 instructions break down into multiple microops.
@@ -28331,7 +28679,7 @@ namespace ArmISAInst {
-
+
These classes get used everywhere, and they have a somewhat convoluted relation with one another, so let’s figure out this mess.
@@ -28342,7 +28690,7 @@ namespace ArmISAInst {
This section and all children tested at gem5 b1623cb2087873f64197e503ab8894b5e4d4c7b4.
-
+
As we delve into more details below, we will reach the following conclusion: a ThreadContext represents one thread of a CPU with multiple hardware threads.
@@ -28392,7 +28740,7 @@ typedef SimpleThread MinorThread;
Essentially all methods of the base ThreadContext are pure virtual.
-
+
SimpleThread storage defined on BaseSimpleCPU for simple CPUs like AtomicSimpleCPU:
@@ -28487,7 +28835,7 @@ typedef SimpleThread MinorThread;
-
+
Instantiation happens in the FullO3CPU constructor:
@@ -28588,7 +28936,7 @@ FullO3CPU<Impl>::readArchIntReg(int reg_idx, ThreadID tid)
-
+
Owned one per ThreadContext.
@@ -28634,7 +28982,7 @@ class O3ThreadContext : public ThreadContext
-
+
@@ -28794,7 +29142,7 @@ class O3ThreadContext : public ThreadContext
This makes sense, since each ThreadContext represents one CPU register set, and therefore needs a separate ExecContext which allows instruction implementations to access those registers.
-
+
Let’s have a look at how ExecContext::readIntRegOperand actually matches registers to decoded register IDs, since it is not obvious.
@@ -28833,7 +29181,7 @@ class O3ThreadContext : public ThreadContext
First, we guess that they must be related to the reading of x1 and x2, which are the inputs of the addition.
Let’s also have a look at the decoder code that builds the instruction instance in build/ARM/arch/arm/generated/decoder-ns.cc.inc:
@@ -29067,7 +29415,7 @@ flattenIntIndex(int reg) const
-
+
The Process class is used only for gem5 syscall emulation mode, and it represents a process like a Linux userland process, in addition to any further gem5 specific data needed to represent the process.
@@ -29155,12 +29503,12 @@ readFunc(SyscallDesc *desc, ThreadContext *tc,
-
+
Each instruction is marked with a class, and each class can execute in a given functional unit.
-
+
@@ -29319,7 +29667,7 @@ opClass=IntAlu
-
+
On gem5 3ca404da175a66e0b958165ad75eb5f54cb5e772, after running:
@@ -29417,7 +29765,7 @@ pipelined=false
-
+
gem5 uses a ton of code generation, which makes the project horrendous:
@@ -29462,7 +29810,7 @@ pipelined=false
But it has been widely overused to insanity. It likely also exists partly because when the project started in 2003 C++ compilers weren’t that good, so you couldn’t rely on features like templates that much.
-
+
Generated code at: build/<ISA>/config/the_isa.hh which e.g. for ARM contains:
@@ -29508,9 +29856,9 @@ enum class Arch {
-
+
-
+
@@ -29525,7 +29873,7 @@ enum class Arch {
-
+
gem5 moves a bit slowly, and if your host compiler is very new, the gem5 build might be broken for it, e.g. this was the case for Ubuntu 19.10 with GCC 9 and gem5 62d75e7105fe172eb906d4f80f360ff8591d4178 from Dec 2019.
@@ -29550,7 +29898,7 @@ enum class Arch {
-
+
E.g. src/cpu/decode_cache.hh includes:
@@ -29629,7 +29977,7 @@ build/ARM/config/the_isa.hh
-
+
@@ -29670,7 +30018,7 @@ build/ARM/config/the_isa.hh
-
+
-
+
-
+
Buildroot is a set of Make scripts that download and compile from source compatible versions of:
@@ -29790,7 +30138,7 @@ gensim/models/armv8/isa.ac
Linux kernel
-C standard library: Buildroot supports several implementations, see: Section 21.10, “libc choice”
+C standard library: Buildroot supports several implementations, see: Section 25.10, “libc choice”
BusyBox: provides the shell and basic command line utilities
@@ -29801,7 +30149,7 @@ gensim/models/armv8/isa.ac
It therefore produces a pristine, blob-less, debuggable setup, where all moving parts are configured to work perfectly together.
The downsides of Buildroot are:
@@ -29846,7 +30194,7 @@ gensim/models/armv8/isa.ac
-
+
We provide the following mechanisms:
@@ -29881,10 +30229,10 @@ gensim/models/armv8/isa.ac
The clean is necessary because the source files didn’t change, so make would just check the timestamps and not build anything.
-
+
If you are benchmarking compiled programs instead of hand written assembly, remember that we configure Buildroot to disable optimizations by default with:
@@ -29916,7 +30264,7 @@ gensim/models/armv8/isa.ac
-
-
if you already have a full -O0 build, you can choose to rebuild just your package of interest to save some time as described at: Section 21.2, “Custom Buildroot configs”
+if you already have a full -O0 build, you can choose to rebuild just your package of interest to save some time as described at: Section 25.2, “Custom Buildroot configs”
./build-buildroot \
@@ -29952,7 +30300,7 @@ gensim/models/armv8/isa.ac
-
+
make menuconfig is a convenient way to find Buildroot configurations:
@@ -29978,7 +30326,7 @@ make menuconfig
-
+
At startup, we login automatically as the root user.
@@ -30015,7 +30363,7 @@ make menuconfig
-
+
@@ -30038,7 +30386,7 @@ make menuconfig
-
+
@@ -30102,7 +30450,7 @@ make menuconfig
-
+
First, see if you can’t get away without actually adding a new package, for example:
@@ -30112,7 +30460,7 @@ make menuconfig
if you have a standalone C file with no dependencies besides the C standard library to be compiled with GCC, just add a new file under buildroot_packages/sample_package and you are done
-if you have a dependency on a library, first check if Buildroot doesn’t have a package for it already with ls buildroot/package. If yes, just enable that package as explained at: Section 21.2, “Custom Buildroot configs”
+if you have a dependency on a library, first check if Buildroot doesn’t have a package for it already with ls buildroot/package. If yes, just enable that package as explained at: Section 25.2, “Custom Buildroot configs”
@@ -30120,7 +30468,7 @@ make menuconfig
If none of those methods are flexible enough for you, you can just fork or hack up the sample package at buildroot_packages/sample_package to do what you want.
@@ -30195,7 +30543,7 @@ TODO benchmark: would gem5 suffer a considerable disk read performance hit due t
Bibliography: https://stackoverflow.com/questions/49211241/is-there-a-way-to-automatically-detect-the-minimum-required-br2-target-rootfs-ex
-
+
SquashFS creation with mksquashfs does not take fixed sizes, and I have successfully booted from it, but it is readonly, which is unacceptable.
@@ -30208,7 +30556,7 @@ TODO benchmark: would gem5 suffer a considerable disk read performance hit due t
-
+
Buildroot is not designed for large root filesystem images, and the rebuild becomes very slow when we add a large package to it.
@@ -30246,7 +30594,7 @@ TODO benchmark: would gem5 suffer a considerable disk read performance hit due t
-
+
When asking for help on upstream repositories outside of this repository, you will need to provide the commands that you are running in detail without referencing our scripts.
@@ -30306,7 +30654,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
Then, you will also want to do a Bisection to pinpoint the exact commit to blame, and CC that developer.
For Buildroot problems, you should either provide the config you have:
@@ -30321,7 +30669,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
-
+
Buildroot supports several libc implementations, including:
@@ -30369,7 +30717,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
-
+
This repo doesn’t do much more than setting a bunch of Buildroot configurations and building it.
@@ -30414,7 +30762,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
-
+
Users of this repo will often want to update the compilation toolchain to the latest version to get fresh new features like new ISA instructions.
@@ -30428,7 +30776,7 @@ git -C "$(./getvar qemu_source_dir)" checkout -
In this section we cover the most common cases.
-
+
This is of course the simplest case.
@@ -30546,9 +30894,9 @@ cd ../..
-
+
@@ -30606,7 +30954,7 @@ cd ../..
-
+
@@ -30638,7 +30986,7 @@ cd ../..
-
+
-
Userland assembly content is located at: Section 23, “Userland assembly”. It was split from this section basically because we were hitting the HTML h6 limit, stupid web :-)
+
Userland assembly content is located at: Section 27, “Userland assembly”. It was split from this section basically because we were hitting the HTML h6 limit, stupid web :-)
-
+
@@ -30716,7 +31064,7 @@ cd ../..
-
+
@@ -30855,7 +31203,7 @@ cd ../..
-
+
@@ -30869,7 +31217,7 @@ cd ../..
malloc leads to the infinite joys of Memory leaks.
-
+
TODO: the exact answer is going to be hard.
@@ -30914,7 +31262,7 @@ printf '%x\n' 4198400
-
+
@@ -30980,7 +31328,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
If we start using the pages, the OOM killer would sooner or later step in and kill our process: Linux out-of-memory killer.
-
+
We can observe the OOM in LKMC 1e969e832f66cb5a72d12d57c53fb09e9721d589 which defaults to 256MiB of memory with:
@@ -31006,7 +31354,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
-
+
@@ -31024,7 +31372,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
-
+
-
@@ -31116,9 +31464,9 @@ echo 1 > /proc/sys/vm/overcommit_memory
-
+
-
+
@@ -31130,7 +31478,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
-
+
@@ -31153,7 +31501,7 @@ echo 1 > /proc/sys/vm/overcommit_memory
strace shows that OpenMP makes clone() syscalls in Linux. TODO: does it actually call pthread_ functions, or does it make syscalls directly? Or in other words, can it work on Freestanding programs? A quick grep shows many references to pthreads.
-
+
@@ -31251,7 +31599,7 @@ mkdir -p bin/c
-
+
@@ -31361,21 +31709,27 @@ mkdir -p bin/c
+
+Algorithms contains a benchmark comparison of different C++ containers.
+