diff --git a/README.adoc b/README.adoc
index 27ce83f..224df2c 100644
--- a/README.adoc
+++ b/README.adoc
@@ -1197,6 +1197,279 @@ Call Trace:
 in which the boot appears to hang for a considerable time.
 * Confirm that the kernel enters at `0x1000000`, or where it enters. Once we have this, we can exclude what comes before in the BIOS.
 
+=== initrd
+
+The kernel can boot from a CPIO file, which is a directory serialization format much like tar: https://superuser.com/questions/343915/tar-vs-cpio-what-is-the-difference
+
+The bootloader, which for us is QEMU itself, is then configured to put that CPIO into memory, and tell the kernel that it is there.
+
+With this setup, you don't even need to give a root filesystem to the kernel: it just does everything in memory in a ramfs.
+
+Try it out with:
+
+....
+./run -i
+....
+
+Notice how it boots fine, even though `-drive` is not given.
+
+Also as expected, there is no filesystem persistence, since we are doing everything in memory:
+
+....
+date >f
+poweroff
+cat f
+# can't open 'f': No such file or directory
+....
+
+This can be good for automated tests, as it ensures that you are using a pristine unmodified system image every time.
+
+The main ingredients to get this working are:
+
+* `BR2_TARGET_ROOTFS_CPIO=y`: make Buildroot generate `output/images/rootfs.cpio` in addition to the other images.
++
+It is also possible to compress that image with other options.
+* `qemu -initrd`: make QEMU put the image into memory and tell the kernel about it.
+* `CONFIG_BLK_DEV_INITRD=y`: compile the kernel with initrd support, see also: https://unix.stackexchange.com/questions/67462/linux-kernel-is-not-finding-the-initrd-correctly/424496#424496
++
+Buildroot forces that option when `BR2_TARGET_ROOTFS_CPIO=y` is given.
+
+https://unix.stackexchange.com/questions/89923/how-does-linux-load-the-initrd-image asks how the mechanism works in more detail.
+
+==== initrd in desktop distros
+
+Most modern desktop distributions have an initrd in their root disk to do early setup.
+
+The rationale for this is described at: https://en.wikipedia.org/wiki/Initial_ramdisk
+
+One obvious use case is having an encrypted root filesystem: you keep the initrd in an unencrypted partition, and then set up decryption from there.
+
+I think GRUB then knows how to read common disk formats, and loads that initrd into memory with a `/boot/grub/grub.cfg` directive of type:
+
+ initrd /initrd.img-4.4.0-108-generic
+
+Related: https://stackoverflow.com/questions/6405083/initrd-and-booting-the-linux-kernel
+
+==== initramfs
+
+initramfs is just like <>, but you also glue the image directly to the kernel image itself.
+
+So the only argument that QEMU needs is `-kernel`: no `-drive`, not even `-initrd`! Pretty cool.
+
+Try it out with:
+
+....
+./run -a aarch64
+....
+
+since our <> setup uses it by default.
+
+In the background, it uses `BR2_TARGET_ROOTFS_INITRAMFS`, and this makes the kernel config option `CONFIG_INITRAMFS_SOURCE` point to the CPIO that will be embedded in the kernel image.
+
+http://nairobi-embedded.org/initramfs_tutorial.html shows a full manual setup.
+
+=== ftrace
+
+Trace a single function:
+
+....
+cd /sys/kernel/debug/tracing/
+
+# Stop tracing.
+echo 0 > tracing_on
+
+# Clear previous trace.
+echo '' > trace
+
+# List the available tracers, and pick one.
+cat available_tracers
+echo function > current_tracer
+
+# List all functions that can be traced.
+# cat available_filter_functions
+# Choose one.
+echo __kmalloc >set_ftrace_filter
+# Confirm that only __kmalloc is enabled.
+cat enabled_functions
+
+echo 1 > tracing_on
+
+# Latest events.
+head trace
+
+# Observe trace continuously, and drain seen events out.
+cat trace_pipe &
+....
+
+Sample output:
+
+....
+# tracer: function
+#
+# entries-in-buffer/entries-written: 97/97 #P:1
+#
+# _-----=> irqs-off
+# / _----=> need-resched
+# | / _---=> hardirq/softirq
+# || / _--=> preempt-depth
+# ||| / delay
+# TASK-PID CPU# |||| TIMESTAMP FUNCTION
+# | | | |||| | |
+ head-228 [000] .... 825.534637: __kmalloc <-load_elf_phdrs
+ head-228 [000] .... 825.534692: __kmalloc <-load_elf_binary
+ head-228 [000] .... 825.534815: __kmalloc <-load_elf_phdrs
+ head-228 [000] .... 825.550917: __kmalloc <-__seq_open_private
+ head-228 [000] .... 825.550953: __kmalloc <-tracing_open
+ head-229 [000] .... 826.756585: __kmalloc <-load_elf_phdrs
+ head-229 [000] .... 826.756627: __kmalloc <-load_elf_binary
+ head-229 [000] .... 826.756719: __kmalloc <-load_elf_phdrs
+ head-229 [000] .... 826.773796: __kmalloc <-__seq_open_private
+ head-229 [000] .... 826.773835: __kmalloc <-tracing_open
+ head-230 [000] .... 827.174988: __kmalloc <-load_elf_phdrs
+ head-230 [000] .... 827.175046: __kmalloc <-load_elf_binary
+ head-230 [000] .... 827.175171: __kmalloc <-load_elf_phdrs
+....
+
+Trace all possible functions, and draw a call graph:
+
+....
+echo 1 > max_graph_depth
+echo 1 > events/enable
+echo function_graph > current_tracer
+....
+
+Sample output:
+
+....
+# CPU DURATION FUNCTION CALLS
+# | | | | | | |
+ 0) 2.173 us | } /* ntp_tick_length */
+ 0) | timekeeping_update() {
+ 0) 4.176 us | ntp_get_next_leap();
+ 0) 5.016 us | update_vsyscall();
+ 0) | raw_notifier_call_chain() {
+ 0) 2.241 us | notifier_call_chain();
+ 0) + 19.879 us | }
+ 0) 3.144 us | update_fast_timekeeper();
+ 0) 2.738 us | update_fast_timekeeper();
+ 0) ! 117.147 us | }
+ 0) | _raw_spin_unlock_irqrestore() {
+ 0) 4.045 us | _raw_write_unlock_irqrestore();
+ 0) + 22.066 us | }
+ 0) ! 265.278 us | } /* update_wall_time */
+....
+
+TODO: what do `+` and `!` mean?
+
+Each `enable` under the `events/` tree enables a certain set of events: the higher up the tree the `enable` file is, the more events are enabled.
+
+=== QEMU user mode
+
+This has nothing to do with the Linux kernel, but it is cool:
+
+....
+sudo apt-get install qemu-user
+./build -a arm
+cd buildroot/output.arm~/target
+qemu-arm -L . bin/ls
+....
+
+This uses QEMU's user-mode emulation, which allows us to run cross-compiled userland programs directly on the host.
+
+The reason this is cool is that `ls` is not statically compiled, but since we have the Buildroot image, we are still able to find the dynamic linker and the shared libraries at the given path.
+
+In other words, much cooler than:
+
+....
+arm-linux-gnueabi-gcc -o hello -static hello.c
+qemu-arm hello
+....
+
+It is also possible to compile QEMU user mode from source with `BR2_PACKAGE_HOST_QEMU_LINUX_USER_MODE=y`, but then your compilation will likely fail with:
+
+....
+package/qemu/qemu.mk:110: *** "Refusing to build qemu-user: target Linux version newer than host's.". Stop.
+....
+
+since we are using a bleeding edge kernel, and the Buildroot QEMU package has a sanity check for that.
+
+Anyways, this warns us that the userland emulation will likely not be reliable, which is good to know. TODO: where is it documented that the host kernel must be at least as new as the target one?
+
+GDB step debugging is also possible with:
+
+....
+qemu-arm -g 1234 -L . bin/ls
+../host/usr/bin/arm-buildroot-linux-uclibcgnueabi-gdb -ex 'target remote localhost:1234'
+....
+
+TODO: find source. Lazy now.
+
+=== Snapshot
+
+https://stackoverflow.com/questions/40227651/does-qemu-emulator-have-checkpoint-function/48724371#48724371
+
+QEMU allows us to take snapshots at any time through the monitor.
+
+You can then restore CPU, memory and disk state back at any time.
+
+qcow2 disk images must be used for that to work.
+
+To test it out, log into the VM and run:
+
+....
+/count.sh
+....
+
+On another shell, take a snapshot:
+
+....
+echo 'savevm my_snap_id' | ./qemumonitor
+....
+
+The counting continues.
+
+Restore the snapshot:
+
+....
+echo 'loadvm my_snap_id' | ./qemumonitor
+....
+
+and the counting goes back to where we saved. This shows that CPU and memory states were reverted.
+
+We can also verify that the disk state is reverted. Guest:
+
+....
+echo 0 >f
+....
+
+Monitor:
+
+....
+echo 'savevm my_snap_id' | ./qemumonitor
+....
+
+Guest:
+
+....
+echo 1 >f
+....
+
+Monitor:
+
+....
+echo 'loadvm my_snap_id' | ./qemumonitor
+....
+
+Guest:
+
+....
+cat f
+....
+
+And the output is `0`.
+
+Our setup does not allow for snapshotting while using <>.
+
 === GEM5
 
 GEM5 is a system simulator, much like QEMU: http://gem5.org/
@@ -1623,279 +1896,6 @@ Aborted (core dumped)
 
 If we checkout to the ancient kernel `v2.6.22.9`, it fails to compile with modern GNU make 4.1: https://stackoverflow.com/questions/35002691/makefile-make-clean-why-getting-mixed-implicit-and-normal-rules-deprecated-s lol
 
-=== initrd
-
-The kernel can boot from an CPIO file, which is a directory serialization format much like tar: https://superuser.com/questions/343915/tar-vs-cpio-what-is-the-difference
-
-The bootloader, which for us is QEMU itself, is then configured to put that CPIO into memory, and tell the kernel that it is there.
-
-With this setup, you don't even need to give a root filesystem to the kernel, it just does everything in memory in a ramfs.
-
-Try it out with:
-
-....
-./run -i
-....
-
-Notice how it boots fine, even though `-drive` is not given.
-
-Also as expected, there is no filesystem persistency, since we are doing everything in memory:
-
-....
-date >f
-poweroff
-cat f
-# can't open 'f': No such file or directory
-....
-
-This can be good for automated tests, as it ensures that you are using a pristine unmodified system image every time.
-
-The main ingredients to get this working are:
-
-* `BR2_TARGET_ROOTFS_CPIO=y`: make Buildroot generate `output/images/rootfs.cpio` in addition to the other images.
-+
-It is also possible to compress that image with other options.
-* `qemu -initrd`: make QEMU put the image into memory and tell the kernel about it.
-* `CONFIG_BLK_DEV_INITRD=y`: Compile the kernel with initrd support, see also: https://unix.stackexchange.com/questions/67462/linux-kernel-is-not-finding-the-initrd-correctly/424496#424496
-+
-Buildroot forces that option when `BR2_TARGET_ROOTFS_CPIO=y` is given
-
-https://unix.stackexchange.com/questions/89923/how-does-linux-load-the-initrd-image asks how the mechanism works in more detail.
-
-==== initrd in desktop distros
-
-Most modern desktop distributions have an initrd in their root disk to do early setup.
-
-The rationale for this is described at: https://en.wikipedia.org/wiki/Initial_ramdisk
-
-One obvious use case is having an encrypted root filesystem: you keep the initrd in an unencrypted partition, and then setup decryption from there.
-
-I think GRUB then knows read common disk formats, and then loads that initrd to memory with a `/boot/grub/grub.cfg` directive of type:
-
- initrd /initrd.img-4.4.0-108-generic
-
-Related: https://stackoverflow.com/questions/6405083/initrd-and-booting-the-linux-kernel
-
-==== initramfs
-
-initramfs is just like <>, but you also glue the image directly to the kernel image itself.
-
-So the only argument that QEMU needs is the `-kernel`, no `-drive` not even `-initrd`! Pretty cool.
-
-Try it out with:
-
-....
-./run -a aarch64
-....
-
-since our <> setup uses it by default.
-
-In the background, it uses `BR2_TARGET_ROOTFS_INITRAMFS`, and this makes the kernel config option `CONFIG_INITRAMFS_SOURCE` point to the CPIO that will be embedded in the kernel image.
-
-http://nairobi-embedded.org/initramfs_tutorial.html shows a full manual setup.
-
-=== ftrace
-
-Trace a single function:
-
-....
-cd /sys/kernel/debug/tracing/
-
-# Stop tracing.
-echo 0 > tracing_on
-
-# Clear previous trace.
-echo '' > trace
-
-# List the available tracers, and pick one.
-cat available_tracers
-echo function > current_tracer
-
-# List all functions that can be traced
-# cat available_filter_functions
-# Choose one.
-echo __kmalloc >set_ftrace_filter
-# Confirm that only __kmalloc is enabled.
-cat enabled_functions
-
-echo 1 > tracing_on
-
-# Latest events.
-head trace
-
-# Observe trace continously, and drain seen events out.
-cat trace_pipe &
-....
-
-Sample output:
-
-....
-# tracer: function
-#
-# entries-in-buffer/entries-written: 97/97 #P:1
-#
-# _-----=> irqs-off
-# / _----=> need-resched
-# | / _---=> hardirq/softirq
-# || / _--=> preempt-depth
-# ||| / delay
-# TASK-PID CPU# |||| TIMESTAMP FUNCTION
-# | | | |||| | |
- head-228 [000] .... 825.534637: __kmalloc <-load_elf_phdrs
- head-228 [000] .... 825.534692: __kmalloc <-load_elf_binary
- head-228 [000] .... 825.534815: __kmalloc <-load_elf_phdrs
- head-228 [000] .... 825.550917: __kmalloc <-__seq_open_private
- head-228 [000] .... 825.550953: __kmalloc <-tracing_open
- head-229 [000] .... 826.756585: __kmalloc <-load_elf_phdrs
- head-229 [000] .... 826.756627: __kmalloc <-load_elf_binary
- head-229 [000] .... 826.756719: __kmalloc <-load_elf_phdrs
- head-229 [000] .... 826.773796: __kmalloc <-__seq_open_private
- head-229 [000] .... 826.773835: __kmalloc <-tracing_open
- head-230 [000] .... 827.174988: __kmalloc <-load_elf_phdrs
- head-230 [000] .... 827.175046: __kmalloc <-load_elf_binary
- head-230 [000] .... 827.175171: __kmalloc <-load_elf_phdrs
-....
-
-Trace all possible functions, and draw a call graph:
-
-....
-echo 1 > max_graph_depth
-echo 1 > events/enable
-echo function_graph > current_tracer
-....
-
-Sample output:
-
-....
-# CPU DURATION FUNCTION CALLS
-# | | | | | | |
- 0) 2.173 us | } /* ntp_tick_length */
- 0) | timekeeping_update() {
- 0) 4.176 us | ntp_get_next_leap();
- 0) 5.016 us | update_vsyscall();
- 0) | raw_notifier_call_chain() {
- 0) 2.241 us | notifier_call_chain();
- 0) + 19.879 us | }
- 0) 3.144 us | update_fast_timekeeper();
- 0) 2.738 us | update_fast_timekeeper();
- 0) ! 117.147 us | }
- 0) | _raw_spin_unlock_irqrestore() {
- 0) 4.045 us | _raw_write_unlock_irqrestore();
- 0) + 22.066 us | }
- 0) ! 265.278 us | } /* update_wall_time */
-....
-
-TODO: what do `+` and `!` mean?
-
-Each `enable` under the `events/` tree enables a certain set of functions, the higher the `enable` more functions are enabled.
-
-=== QEMU user mode
-
-This has nothing to do with the Linux kernel, but it is cool:
-
-....
-sudo apt-get install qemu-user
-./build -a arm
-cd buildroot/output.arm~/target
-qemu-arm -L . bin/ls
-....
-
-This uses QEMU's user-mode emulation mode that allows us to run cross-compiled userland programs directly on the host.
-
-The reason this is cool, is that `ls` is not statically compiled, but since we have the Buildroot image, we are still able to find the shared linker and the shared library at the given path.
-
-In other words, much cooler than:
-
-....
-arm-linux-gnueabi-gcc -o hello -static hello.c
-qemu-arm hello
-....
-
-It is also possible to compile QEMU user mode from source with `BR2_PACKAGE_HOST_QEMU_LINUX_USER_MODE=y`, but then your compilation will likely fail with:
-
-....
-package/qemu/qemu.mk:110: *** "Refusing to build qemu-user: target Linux version newer than host's.". Stop.
-....
-
-since we are using a bleeding edge kernel, which is a sanity check in the Buildroot QEMU package.
-
-Anyways, this warns us that the userland emulation will likely not be reliable, which is good to know. TODO: where is it documented the host kernel must be as new as the target one?
-
-GDB step debugging is also possible with:
-
-....
-qemu-arm -g 1234 -L . bin/ls
-../host/usr/bin/arm-buildroot-linux-uclibcgnueabi-gdb -ex 'target remote localhost:1234'
-....
-
-TODO: find source. Lazy now.
-
-=== Snapshot
-
-https://stackoverflow.com/questions/40227651/does-qemu-emulator-have-checkpoint-function/48724371#48724371
-
-QEMU allows us to take snapshots at any time through the monitor.
-
-You can then restore CPU, memory and disk state back at any time.
-
-qcow2 filesystems must be used for that to work.
-
-To test it out, login into the VM with and run:
-
-....
-/count.sh
-....
-
-On another shell, take a snapshot:
-
-....
-echo 'savevm my_snap_id' | ./qemumonitor
-....
-
-The counting continues.
-
-Restore the snapshot:
-
-....
-echo 'loadvm my_snap_id' | ./qemumonitor
-....
-
-and the counting goes back to where we saved. This shows that CPU and memory states were reverted.
-
-We can also verify that the disk state is also reversed. Guest:
-
-....
-echo 0 >f
-....
-
-Monitor:
-
-....
-echo 'savevm my_snap_id' | ./qemumonitor
-....
-
-Guest:
-
-....
-echo 1 >f
-....
-
-Monitor:
-
-....
-echo 'loadvm my_snap_id' | ./qemumonitor
-....
-
-Guest:
-
-....
-cat f
-....
-
-And the output is `0`.
-
-Our setup does not allow for snapshotting while using <>.
-
 == Failed action
 
 === Record and replay
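
Note on the `TODO` in the ftrace section about `+` and `!`: as far as I can tell from the kernel's ftrace documentation, these are overhead markers that `function_graph` prints in front of long durations. `+` means the function took more than 10 us and `!` more than 100 us; newer kernels also use `#` (more than 1000 us), `*` (more than 10 ms), `@` (more than 100 ms) and `$` (more than 1 s). The following is a minimal sketch of that rule in shell, not kernel code: `duration_marker` is a hypothetical helper name, and the thresholds are my reading of the documentation.

```shell
# Hypothetical helper: print the function_graph overhead marker for a
# duration given in microseconds. Thresholds are assumed from the kernel's
# ftrace documentation; a single space means "no marker".
duration_marker() {
  awk -v us="$1" 'BEGIN {
    if      (us > 1000000) m = "$"   # more than 1 s
    else if (us > 100000)  m = "@"   # more than 100 ms
    else if (us > 10000)   m = "*"   # more than 10 ms
    else if (us > 1000)    m = "#"   # more than 1000 us
    else if (us > 100)     m = "!"   # more than 100 us
    else if (us > 10)      m = "+"   # more than 10 us
    else                   m = " "   # short enough: no marker
    print m
  }'
}
```

Fed the durations from the `function_graph` sample output in this patch, it gives `+` for `19.879` and `22.066`, and `!` for `117.147` and `265.278`, which matches the markers shown there.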