mirror of
https://github.com/cirosantilli/linux-kernel-module-cheat.git
synced 2026-01-23 02:05:57 +01:00
Move gem5 doc down since much of it is diffed with qemu
546
README.adoc
@@ -1197,6 +1197,279 @@ Call Trace:
|
||||
in which the boot appears to hang for a considerable time.
|
||||
* Confirm that the kernel enters at `0x1000000`, or where it enters. Once we have this, we can exclude what comes before in the BIOS.
|
||||
|
||||
=== initrd
|
||||
|
||||
The kernel can boot from a CPIO file, which is a directory serialization format much like tar: https://superuser.com/questions/343915/tar-vs-cpio-what-is-the-difference
|
||||
|
||||
The bootloader, which for us is QEMU itself, is then configured to put that CPIO into memory, and tell the kernel that it is there.
|
||||
|
||||
With this setup, you don't even need to give a root filesystem to the kernel: it does everything in memory in a ramfs.
|
||||
|
||||
Try it out with:
|
||||
|
||||
....
|
||||
./run -i
|
||||
....
|
||||
|
||||
Notice how it boots fine, even though `-drive` is not given.
|
||||
|
||||
Also as expected, there is no filesystem persistence, since we are doing everything in memory:
|
||||
|
||||
....
|
||||
date >f
|
||||
poweroff
|
||||
cat f
|
||||
# can't open 'f': No such file or directory
|
||||
....
|
||||
|
||||
This can be good for automated tests, as it ensures that you are using a pristine unmodified system image every time.
|
||||
|
||||
The main ingredients to get this working are:
|
||||
|
||||
* `BR2_TARGET_ROOTFS_CPIO=y`: make Buildroot generate `output/images/rootfs.cpio` in addition to the other images.
|
||||
+
|
||||
It is also possible to compress that image with other options.
|
||||
* `qemu -initrd`: make QEMU put the image into memory and tell the kernel about it.
|
||||
* `CONFIG_BLK_DEV_INITRD=y`: Compile the kernel with initrd support, see also: https://unix.stackexchange.com/questions/67462/linux-kernel-is-not-finding-the-initrd-correctly/424496#424496
|
||||
+
|
||||
Buildroot forces that option when `BR2_TARGET_ROOTFS_CPIO=y` is given.
|
||||
|
||||
https://unix.stackexchange.com/questions/89923/how-does-linux-load-the-initrd-image asks how the mechanism works in more detail.
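
A quick way to confirm that the two config ingredients are in place is to grep for them. This is only a sketch: the helper names `has_option` and `check_initrd_config` are made up for illustration, and the paths to the Buildroot and kernel `.config` files are passed as arguments because their location depends on your build setup.

```shell
# has_option FILE OPTION: succeed if the line OPTION=y appears in FILE.
has_option() {
  grep -q "^$2=y\$" "$1"
}

# check_initrd_config BUILDROOT_CONFIG KERNEL_CONFIG
# Both arguments are paths to .config files (assumed, not fixed by this repo).
check_initrd_config() {
  if ! has_option "$1" BR2_TARGET_ROOTFS_CPIO; then
    echo "missing BR2_TARGET_ROOTFS_CPIO=y"
    return 1
  fi
  if ! has_option "$2" CONFIG_BLK_DEV_INITRD; then
    echo "missing CONFIG_BLK_DEV_INITRD=y"
    return 1
  fi
  echo "initrd options enabled"
}
```

Call it as `check_initrd_config path/to/buildroot/.config path/to/linux/.config`. It does not check the `qemu -initrd` part, which is a run time argument rather than a build option.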
|
||||
|
||||
==== initrd in desktop distros
|
||||
|
||||
Most modern desktop distributions have an initrd in their root disk to do early setup.
|
||||
|
||||
The rationale for this is described at: https://en.wikipedia.org/wiki/Initial_ramdisk
|
||||
|
||||
One obvious use case is having an encrypted root filesystem: you keep the initrd in an unencrypted partition, and then set up decryption from there.
|
||||
|
||||
I think GRUB knows how to read common disk formats, and then loads that initrd into memory with a `/boot/grub/grub.cfg` directive of the form:
|
||||
|
||||
initrd /initrd.img-4.4.0-108-generic
|
||||
|
||||
Related: https://stackoverflow.com/questions/6405083/initrd-and-booting-the-linux-kernel
|
||||
|
||||
==== initramfs
|
||||
|
||||
initramfs is just like <<initrd>>, but you also glue the image directly to the kernel image itself.
|
||||
|
||||
So the only argument that QEMU needs is `-kernel`: no `-drive`, not even `-initrd`! Pretty cool.
|
||||
|
||||
Try it out with:
|
||||
|
||||
....
|
||||
./run -a aarch64
|
||||
....
|
||||
|
||||
since our <<aarch64>> setup uses it by default.
|
||||
|
||||
In the background, it uses `BR2_TARGET_ROOTFS_INITRAMFS`, and this makes the kernel config option `CONFIG_INITRAMFS_SOURCE` point to the CPIO that will be embedded in the kernel image.
|
||||
|
||||
http://nairobi-embedded.org/initramfs_tutorial.html shows a full manual setup.
|
||||
|
||||
=== ftrace
|
||||
|
||||
Trace a single function:
|
||||
|
||||
....
|
||||
cd /sys/kernel/debug/tracing/
|
||||
|
||||
# Stop tracing.
|
||||
echo 0 > tracing_on
|
||||
|
||||
# Clear previous trace.
|
||||
echo '' > trace
|
||||
|
||||
# List the available tracers, and pick one.
|
||||
cat available_tracers
|
||||
echo function > current_tracer
|
||||
|
||||
# List all functions that can be traced
|
||||
# cat available_filter_functions
|
||||
# Choose one.
|
||||
echo __kmalloc >set_ftrace_filter
|
||||
# Confirm that only __kmalloc is enabled.
|
||||
cat enabled_functions
|
||||
|
||||
echo 1 > tracing_on
|
||||
|
||||
# Latest events.
|
||||
head trace
|
||||
|
||||
# Observe trace continuously, and drain seen events out.
|
||||
cat trace_pipe &
|
||||
....
|
||||
|
||||
Sample output:
|
||||
|
||||
....
|
||||
# tracer: function
|
||||
#
|
||||
# entries-in-buffer/entries-written: 97/97 #P:1
|
||||
#
|
||||
# _-----=> irqs-off
|
||||
# / _----=> need-resched
|
||||
# | / _---=> hardirq/softirq
|
||||
# || / _--=> preempt-depth
|
||||
# ||| / delay
|
||||
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
|
||||
# | | | |||| | |
|
||||
head-228 [000] .... 825.534637: __kmalloc <-load_elf_phdrs
|
||||
head-228 [000] .... 825.534692: __kmalloc <-load_elf_binary
|
||||
head-228 [000] .... 825.534815: __kmalloc <-load_elf_phdrs
|
||||
head-228 [000] .... 825.550917: __kmalloc <-__seq_open_private
|
||||
head-228 [000] .... 825.550953: __kmalloc <-tracing_open
|
||||
head-229 [000] .... 826.756585: __kmalloc <-load_elf_phdrs
|
||||
head-229 [000] .... 826.756627: __kmalloc <-load_elf_binary
|
||||
head-229 [000] .... 826.756719: __kmalloc <-load_elf_phdrs
|
||||
head-229 [000] .... 826.773796: __kmalloc <-__seq_open_private
|
||||
head-229 [000] .... 826.773835: __kmalloc <-tracing_open
|
||||
head-230 [000] .... 827.174988: __kmalloc <-load_elf_phdrs
|
||||
head-230 [000] .... 827.175046: __kmalloc <-load_elf_binary
|
||||
head-230 [000] .... 827.175171: __kmalloc <-load_elf_phdrs
|
||||
....
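
The single-function steps above can be wrapped in a small helper so that they are easy to repeat for different functions. This is a sketch rather than part of the repository: the name `trace_function` is made up, and the tracing directory is an optional second argument that defaults to the debugfs path used above.

```shell
# trace_function FUNC [TRACING_DIR]
# Restrict the function tracer to FUNC and start tracing.
trace_function() {
  func="$1"
  dir="${2:-/sys/kernel/debug/tracing}"
  # Stop tracing and clear the previous trace.
  echo 0 > "$dir/tracing_on"
  echo '' > "$dir/trace"
  # Select the function tracer and filter on the given function.
  echo function > "$dir/current_tracer"
  echo "$func" > "$dir/set_ftrace_filter"
  # Start tracing again.
  echo 1 > "$dir/tracing_on"
}
```

For example, `trace_function __kmalloc` followed by `head /sys/kernel/debug/tracing/trace` should give output of the same shape as the sample above.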
|
||||
|
||||
Trace all possible functions, and draw a call graph:
|
||||
|
||||
....
|
||||
echo 1 > max_graph_depth
|
||||
echo 1 > events/enable
|
||||
echo function_graph > current_tracer
|
||||
....
|
||||
|
||||
Sample output:
|
||||
|
||||
....
|
||||
# CPU DURATION FUNCTION CALLS
|
||||
# | | | | | | |
|
||||
0) 2.173 us | } /* ntp_tick_length */
|
||||
0) | timekeeping_update() {
|
||||
0) 4.176 us | ntp_get_next_leap();
|
||||
0) 5.016 us | update_vsyscall();
|
||||
0) | raw_notifier_call_chain() {
|
||||
0) 2.241 us | notifier_call_chain();
|
||||
0) + 19.879 us | }
|
||||
0) 3.144 us | update_fast_timekeeper();
|
||||
0) 2.738 us | update_fast_timekeeper();
|
||||
0) ! 117.147 us | }
|
||||
0) | _raw_spin_unlock_irqrestore() {
|
||||
0) 4.045 us | _raw_write_unlock_irqrestore();
|
||||
0) + 22.066 us | }
|
||||
0) ! 265.278 us | } /* update_wall_time */
|
||||
....
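
The call graph output gets long quickly, so it can help to keep only the entries whose DURATION column is above some threshold. The following is a minimal sketch, assuming the output format shown above, where the duration is the number right before the `us` field. The function name `slow_calls` is made up for illustration.

```shell
# slow_calls MIN_US: read function_graph output on stdin and print only
# the lines whose duration, in microseconds, is at least MIN_US.
slow_calls() {
  awk -v min="$1" '{
    for (i = 2; i <= NF; i++) {
      # The duration is the field immediately before the "us" unit.
      if ($i == "us" && $(i - 1) + 0 >= min + 0) {
        print
        break
      }
    }
  }'
}
```

Usage from the tracing directory: `slow_calls 100 < trace`. Lines without a duration, such as function entries that end in `{`, are dropped.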
|
||||
|
||||
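The `+` and `!` prefixes mark slow calls: according to the kernel ftrace documentation, `+` means that the duration exceeded 10 microseconds, and `!` that it exceeded 100 microseconds.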
|
||||
|
||||
Each `enable` file under the `events/` tree enables a certain set of trace events: the higher up the tree the `enable` file is, the more events it enables.
|
||||
|
||||
=== QEMU user mode
|
||||
|
||||
This has nothing to do with the Linux kernel, but it is cool:
|
||||
|
||||
....
|
||||
sudo apt-get install qemu-user
|
||||
./build -a arm
|
||||
cd buildroot/output.arm~/target
|
||||
qemu-arm -L . bin/ls
|
||||
....
|
||||
|
||||
This uses QEMU's user-mode emulation mode that allows us to run cross-compiled userland programs directly on the host.
|
||||
|
||||
The reason this is cool is that `ls` is not statically compiled, but since we have the Buildroot image, we can still find the dynamic linker and the shared libraries at the given path.
|
||||
|
||||
In other words, much cooler than:
|
||||
|
||||
....
|
||||
arm-linux-gnueabi-gcc -o hello -static hello.c
|
||||
qemu-arm hello
|
||||
....
|
||||
|
||||
It is also possible to compile QEMU user mode from source with `BR2_PACKAGE_HOST_QEMU_LINUX_USER_MODE=y`, but then your compilation will likely fail with:
|
||||
|
||||
....
|
||||
package/qemu/qemu.mk:110: *** "Refusing to build qemu-user: target Linux version newer than host's.". Stop.
|
||||
....
|
||||
|
||||
since we are using a bleeding edge kernel. The error comes from a sanity check in the Buildroot QEMU package.
|
||||
|
||||
Anyway, this warns us that the userland emulation will likely not be reliable, which is good to know. TODO: where is it documented that the host kernel must be as new as the target one?
|
||||
|
||||
GDB step debugging is also possible with:
|
||||
|
||||
....
|
||||
qemu-arm -g 1234 -L . bin/ls
|
||||
../host/usr/bin/arm-buildroot-linux-uclibcgnueabi-gdb -ex 'target remote localhost:1234'
|
||||
....
|
||||
|
||||
TODO: find source. Lazy now.
|
||||
|
||||
=== Snapshot
|
||||
|
||||
https://stackoverflow.com/questions/40227651/does-qemu-emulator-have-checkpoint-function/48724371#48724371
|
||||
|
||||
QEMU allows us to take snapshots at any time through the monitor.
|
||||
|
||||
You can then restore CPU, memory and disk state back at any time.
|
||||
|
||||
qcow2 disk images must be used for that to work.
|
||||
|
||||
To test it out, log into the VM and run:
|
||||
|
||||
....
|
||||
/count.sh
|
||||
....
|
||||
|
||||
On another shell, take a snapshot:
|
||||
|
||||
....
|
||||
echo 'savevm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
The counting continues.
|
||||
|
||||
Restore the snapshot:
|
||||
|
||||
....
|
||||
echo 'loadvm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
and the counting goes back to where we saved. This shows that CPU and memory states were reverted.
|
||||
|
||||
We can also verify that the disk state is reverted. Guest:
|
||||
|
||||
....
|
||||
echo 0 >f
|
||||
....
|
||||
|
||||
Monitor:
|
||||
|
||||
....
|
||||
echo 'savevm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
Guest:
|
||||
|
||||
....
|
||||
echo 1 >f
|
||||
....
|
||||
|
||||
Monitor:
|
||||
|
||||
....
|
||||
echo 'loadvm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
Guest:
|
||||
|
||||
....
|
||||
cat f
|
||||
....
|
||||
|
||||
And the output is `0`.
|
||||
|
||||
Our setup does not allow for snapshotting while using <<initrd>>.
|
||||
|
||||
=== GEM5
|
||||
|
||||
GEM5 is a system simulator, much like QEMU: http://gem5.org/
|
||||
@@ -1623,279 +1896,6 @@ Aborted (core dumped)
|
||||
|
||||
If we checkout to the ancient kernel `v2.6.22.9`, it fails to compile with modern GNU make 4.1: https://stackoverflow.com/questions/35002691/makefile-make-clean-why-getting-mixed-implicit-and-normal-rules-deprecated-s lol
|
||||
|
||||
=== initrd
|
||||
|
||||
The kernel can boot from a CPIO file, which is a directory serialization format much like tar: https://superuser.com/questions/343915/tar-vs-cpio-what-is-the-difference
|
||||
|
||||
The bootloader, which for us is QEMU itself, is then configured to put that CPIO into memory, and tell the kernel that it is there.
|
||||
|
||||
With this setup, you don't even need to give a root filesystem to the kernel: it does everything in memory in a ramfs.
|
||||
|
||||
Try it out with:
|
||||
|
||||
....
|
||||
./run -i
|
||||
....
|
||||
|
||||
Notice how it boots fine, even though `-drive` is not given.
|
||||
|
||||
Also as expected, there is no filesystem persistence, since we are doing everything in memory:
|
||||
|
||||
....
|
||||
date >f
|
||||
poweroff
|
||||
cat f
|
||||
# can't open 'f': No such file or directory
|
||||
....
|
||||
|
||||
This can be good for automated tests, as it ensures that you are using a pristine unmodified system image every time.
|
||||
|
||||
The main ingredients to get this working are:
|
||||
|
||||
* `BR2_TARGET_ROOTFS_CPIO=y`: make Buildroot generate `output/images/rootfs.cpio` in addition to the other images.
|
||||
+
|
||||
It is also possible to compress that image with other options.
|
||||
* `qemu -initrd`: make QEMU put the image into memory and tell the kernel about it.
|
||||
* `CONFIG_BLK_DEV_INITRD=y`: Compile the kernel with initrd support, see also: https://unix.stackexchange.com/questions/67462/linux-kernel-is-not-finding-the-initrd-correctly/424496#424496
|
||||
+
|
||||
Buildroot forces that option when `BR2_TARGET_ROOTFS_CPIO=y` is given.
|
||||
|
||||
https://unix.stackexchange.com/questions/89923/how-does-linux-load-the-initrd-image asks how the mechanism works in more detail.
|
||||
|
||||
==== initrd in desktop distros
|
||||
|
||||
Most modern desktop distributions have an initrd in their root disk to do early setup.
|
||||
|
||||
The rationale for this is described at: https://en.wikipedia.org/wiki/Initial_ramdisk
|
||||
|
||||
One obvious use case is having an encrypted root filesystem: you keep the initrd in an unencrypted partition, and then set up decryption from there.
|
||||
|
||||
I think GRUB knows how to read common disk formats, and then loads that initrd into memory with a `/boot/grub/grub.cfg` directive of the form:
|
||||
|
||||
initrd /initrd.img-4.4.0-108-generic
|
||||
|
||||
Related: https://stackoverflow.com/questions/6405083/initrd-and-booting-the-linux-kernel
|
||||
|
||||
==== initramfs
|
||||
|
||||
initramfs is just like <<initrd>>, but you also glue the image directly to the kernel image itself.
|
||||
|
||||
So the only argument that QEMU needs is `-kernel`: no `-drive`, not even `-initrd`! Pretty cool.
|
||||
|
||||
Try it out with:
|
||||
|
||||
....
|
||||
./run -a aarch64
|
||||
....
|
||||
|
||||
since our <<aarch64>> setup uses it by default.
|
||||
|
||||
In the background, it uses `BR2_TARGET_ROOTFS_INITRAMFS`, and this makes the kernel config option `CONFIG_INITRAMFS_SOURCE` point to the CPIO that will be embedded in the kernel image.
|
||||
|
||||
http://nairobi-embedded.org/initramfs_tutorial.html shows a full manual setup.
|
||||
|
||||
=== ftrace
|
||||
|
||||
Trace a single function:
|
||||
|
||||
....
|
||||
cd /sys/kernel/debug/tracing/
|
||||
|
||||
# Stop tracing.
|
||||
echo 0 > tracing_on
|
||||
|
||||
# Clear previous trace.
|
||||
echo '' > trace
|
||||
|
||||
# List the available tracers, and pick one.
|
||||
cat available_tracers
|
||||
echo function > current_tracer
|
||||
|
||||
# List all functions that can be traced
|
||||
# cat available_filter_functions
|
||||
# Choose one.
|
||||
echo __kmalloc >set_ftrace_filter
|
||||
# Confirm that only __kmalloc is enabled.
|
||||
cat enabled_functions
|
||||
|
||||
echo 1 > tracing_on
|
||||
|
||||
# Latest events.
|
||||
head trace
|
||||
|
||||
# Observe trace continuously, and drain seen events out.
|
||||
cat trace_pipe &
|
||||
....
|
||||
|
||||
Sample output:
|
||||
|
||||
....
|
||||
# tracer: function
|
||||
#
|
||||
# entries-in-buffer/entries-written: 97/97 #P:1
|
||||
#
|
||||
# _-----=> irqs-off
|
||||
# / _----=> need-resched
|
||||
# | / _---=> hardirq/softirq
|
||||
# || / _--=> preempt-depth
|
||||
# ||| / delay
|
||||
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
|
||||
# | | | |||| | |
|
||||
head-228 [000] .... 825.534637: __kmalloc <-load_elf_phdrs
|
||||
head-228 [000] .... 825.534692: __kmalloc <-load_elf_binary
|
||||
head-228 [000] .... 825.534815: __kmalloc <-load_elf_phdrs
|
||||
head-228 [000] .... 825.550917: __kmalloc <-__seq_open_private
|
||||
head-228 [000] .... 825.550953: __kmalloc <-tracing_open
|
||||
head-229 [000] .... 826.756585: __kmalloc <-load_elf_phdrs
|
||||
head-229 [000] .... 826.756627: __kmalloc <-load_elf_binary
|
||||
head-229 [000] .... 826.756719: __kmalloc <-load_elf_phdrs
|
||||
head-229 [000] .... 826.773796: __kmalloc <-__seq_open_private
|
||||
head-229 [000] .... 826.773835: __kmalloc <-tracing_open
|
||||
head-230 [000] .... 827.174988: __kmalloc <-load_elf_phdrs
|
||||
head-230 [000] .... 827.175046: __kmalloc <-load_elf_binary
|
||||
head-230 [000] .... 827.175171: __kmalloc <-load_elf_phdrs
|
||||
....
|
||||
|
||||
Trace all possible functions, and draw a call graph:
|
||||
|
||||
....
|
||||
echo 1 > max_graph_depth
|
||||
echo 1 > events/enable
|
||||
echo function_graph > current_tracer
|
||||
....
|
||||
|
||||
Sample output:
|
||||
|
||||
....
|
||||
# CPU DURATION FUNCTION CALLS
|
||||
# | | | | | | |
|
||||
0) 2.173 us | } /* ntp_tick_length */
|
||||
0) | timekeeping_update() {
|
||||
0) 4.176 us | ntp_get_next_leap();
|
||||
0) 5.016 us | update_vsyscall();
|
||||
0) | raw_notifier_call_chain() {
|
||||
0) 2.241 us | notifier_call_chain();
|
||||
0) + 19.879 us | }
|
||||
0) 3.144 us | update_fast_timekeeper();
|
||||
0) 2.738 us | update_fast_timekeeper();
|
||||
0) ! 117.147 us | }
|
||||
0) | _raw_spin_unlock_irqrestore() {
|
||||
0) 4.045 us | _raw_write_unlock_irqrestore();
|
||||
0) + 22.066 us | }
|
||||
0) ! 265.278 us | } /* update_wall_time */
|
||||
....
|
||||
|
||||
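The `+` and `!` prefixes mark slow calls: according to the kernel ftrace documentation, `+` means that the duration exceeded 10 microseconds, and `!` that it exceeded 100 microseconds.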
|
||||
|
||||
Each `enable` file under the `events/` tree enables a certain set of trace events: the higher up the tree the `enable` file is, the more events it enables.
|
||||
|
||||
=== QEMU user mode
|
||||
|
||||
This has nothing to do with the Linux kernel, but it is cool:
|
||||
|
||||
....
|
||||
sudo apt-get install qemu-user
|
||||
./build -a arm
|
||||
cd buildroot/output.arm~/target
|
||||
qemu-arm -L . bin/ls
|
||||
....
|
||||
|
||||
This uses QEMU's user-mode emulation mode that allows us to run cross-compiled userland programs directly on the host.
|
||||
|
||||
The reason this is cool is that `ls` is not statically compiled, but since we have the Buildroot image, we can still find the dynamic linker and the shared libraries at the given path.
|
||||
|
||||
In other words, much cooler than:
|
||||
|
||||
....
|
||||
arm-linux-gnueabi-gcc -o hello -static hello.c
|
||||
qemu-arm hello
|
||||
....
|
||||
|
||||
It is also possible to compile QEMU user mode from source with `BR2_PACKAGE_HOST_QEMU_LINUX_USER_MODE=y`, but then your compilation will likely fail with:
|
||||
|
||||
....
|
||||
package/qemu/qemu.mk:110: *** "Refusing to build qemu-user: target Linux version newer than host's.". Stop.
|
||||
....
|
||||
|
||||
since we are using a bleeding edge kernel. The error comes from a sanity check in the Buildroot QEMU package.
|
||||
|
||||
Anyway, this warns us that the userland emulation will likely not be reliable, which is good to know. TODO: where is it documented that the host kernel must be as new as the target one?
|
||||
|
||||
GDB step debugging is also possible with:
|
||||
|
||||
....
|
||||
qemu-arm -g 1234 -L . bin/ls
|
||||
../host/usr/bin/arm-buildroot-linux-uclibcgnueabi-gdb -ex 'target remote localhost:1234'
|
||||
....
|
||||
|
||||
TODO: find source. Lazy now.
|
||||
|
||||
=== Snapshot
|
||||
|
||||
https://stackoverflow.com/questions/40227651/does-qemu-emulator-have-checkpoint-function/48724371#48724371
|
||||
|
||||
QEMU allows us to take snapshots at any time through the monitor.
|
||||
|
||||
You can then restore CPU, memory and disk state back at any time.
|
||||
|
||||
qcow2 disk images must be used for that to work.
|
||||
|
||||
To test it out, log into the VM and run:
|
||||
|
||||
....
|
||||
/count.sh
|
||||
....
|
||||
|
||||
On another shell, take a snapshot:
|
||||
|
||||
....
|
||||
echo 'savevm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
The counting continues.
|
||||
|
||||
Restore the snapshot:
|
||||
|
||||
....
|
||||
echo 'loadvm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
and the counting goes back to where we saved. This shows that CPU and memory states were reverted.
|
||||
|
||||
We can also verify that the disk state is reverted. Guest:
|
||||
|
||||
....
|
||||
echo 0 >f
|
||||
....
|
||||
|
||||
Monitor:
|
||||
|
||||
....
|
||||
echo 'savevm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
Guest:
|
||||
|
||||
....
|
||||
echo 1 >f
|
||||
....
|
||||
|
||||
Monitor:
|
||||
|
||||
....
|
||||
echo 'loadvm my_snap_id' | ./qemumonitor
|
||||
....
|
||||
|
||||
Guest:
|
||||
|
||||
....
|
||||
cat f
|
||||
....
|
||||
|
||||
And the output is `0`.
|
||||
|
||||
Our setup does not allow for snapshotting while using <<initrd>>.
|
||||
|
||||
== Failed action
|
||||
|
||||
=== Record and replay
|
||||
|
||||