qemu-baremetal-cli-args

QEMU part done https://github.com/cirosantilli/linux-kernel-module-cheat/issues/67
Ciro Santilli 六四事件 法轮功
2020-04-02 01:00:00 +00:00
parent e25e79c26b
commit 637ef640bf
7 changed files with 161 additions and 70 deletions


@@ -1170,7 +1170,7 @@ sudo apt-get install gcc-aarch64-linux-gnu qemu-system-aarch64
--qemu-which host \ --qemu-which host \
--userland-build-id host \ --userland-build-id host \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -1204,7 +1204,7 @@ and <<user-mode-gdb>>:
--qemu-which host \ --qemu-which host \
--userland-build-id host \ --userland-build-id host \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -3699,7 +3699,7 @@ Let's run link:userland/c/command_line_arguments.c[] built with the Buildroot to
./build user-mode-qemu ./build user-mode-qemu
./run \ ./run \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args='asdf "qw er"' \ --cli-args='asdf "qw er"' \
; ;
.... ....
@@ -3730,7 +3730,7 @@ It's nice when <<gdb,the obvious>> just works, right?
--arch aarch64 \ --arch aarch64 \
--gdb-wait \ --gdb-wait \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -3751,7 +3751,7 @@ Or alternatively, if you are using <<tmux>>, do everything in one go with:
--arch aarch64 \ --arch aarch64 \
--gdb \ --gdb \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -3795,7 +3795,7 @@ If you followed <<qemu-buildroot-setup>>, you can now run the executables create
.... ....
./run \ ./run \
--userland "$(./getvar buildroot_target_dir)/bin/echo" \ --userland "$(./getvar buildroot_target_dir)/bin/echo" \
--userland-args='asdf' \ --cli-args='asdf' \
; ;
.... ....
@@ -3815,7 +3815,7 @@ or:
./run \ ./run \
--arch aarch64 \ --arch aarch64 \
--userland "$(./getvar --arch aarch64 buildroot_target_dir)/bin/sh" \ --userland "$(./getvar --arch aarch64 buildroot_target_dir)/bin/sh" \
--userland-args='-c "uname -a && pwd"' \ --cli-args='-c "uname -a && pwd"' \
; ;
.... ....
@@ -3903,7 +3903,7 @@ Example:
--arch aarch64 \ --arch aarch64 \
--static \ --static \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -4038,7 +4038,7 @@ So let's just play with some static ones:
--arch aarch64 \ --arch aarch64 \
--emulator gem5 \ --emulator gem5 \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
.... ....
@@ -4052,7 +4052,7 @@ TODO: how to escape spaces on the command line arguments?
--emulator gem5 \ --emulator gem5 \
--gdb-wait \ --gdb-wait \
--userland userland/c/command_line_arguments.c \ --userland userland/c/command_line_arguments.c \
--userland-args 'asdf "qw er"' \ --cli-args 'asdf "qw er"' \
; ;
./run-gdb \ ./run-gdb \
--arch aarch64 \ --arch aarch64 \
@@ -4137,7 +4137,7 @@ so we see that two syscall lines were added for each syscall, showing the syscal
At 8d8307ac0710164701f6e14c99a69ee172ccbb70 + 1, I noticed that if you run link:userland/posix/count.c[]: At 8d8307ac0710164701f6e14c99a69ee172ccbb70 + 1, I noticed that if you run link:userland/posix/count.c[]:
.... ....
./run --userland userland/posix/count_to.c --userland-args 3 ./run --userland userland/posix/count_to.c --cli-args 3
.... ....
it first waits for 3 seconds, then the program exits, and then it dumps all the stdout at once, instead of counting once every second as expected. it first waits for 3 seconds, then the program exits, and then it dumps all the stdout at once, instead of counting once every second as expected.
@@ -10861,7 +10861,7 @@ This random page suggests that QEMU splits one host thread thread per guest thre
We can confirm that with: We can confirm that with:
.... ....
./run --userland userland/posix/pthread_count.c --userland-args 4 ./run --userland userland/posix/pthread_count.c --cli-args 4
ps Haux | grep qemu | wc ps Haux | grep qemu | wc
.... ....
@@ -10878,7 +10878,7 @@ In gem5 syscall simulation, the `fork` syscall checks if there is a free CPU, an
For example, if we use just one CPU for link:userland/posix/pthread_self.c[] which spawns one thread besides `main`: For example, if we use just one CPU for link:userland/posix/pthread_self.c[] which spawns one thread besides `main`:
.... ....
./run --cpus 1 --emulator gem5 --userland userland/posix/pthread_self.c --userland-args 1 ./run --cpus 1 --emulator gem5 --userland userland/posix/pthread_self.c --cli-args 1
.... ....
fails with this error message coming from the guest stderr: fails with this error message coming from the guest stderr:
@@ -10890,13 +10890,13 @@ pthread_create: Resource temporarily unavailable
It works however if we add one extra CPU: It works however if we add one extra CPU:
.... ....
./run --cpus 2 --emulator gem5 --userland userland/posix/pthread_self.c --userland-args 1 ./run --cpus 2 --emulator gem5 --userland userland/posix/pthread_self.c --cli-args 1
.... ....
Once threads exit, their CPU is freed and becomes available for new `fork` calls: For example, the following run spawns a thread, joins it, and then spawns again, and 2 CPUs are enough: Once threads exit, their CPU is freed and becomes available for new `fork` calls: For example, the following run spawns a thread, joins it, and then spawns again, and 2 CPUs are enough:
.... ....
./run --cpus 2 --emulator gem5 --userland userland/posix/pthread_self.c --userland-args '1 2' ./run --cpus 2 --emulator gem5 --userland userland/posix/pthread_self.c --cli-args '1 2'
.... ....
because at each point in time, only up to two threads are running. because at each point in time, only up to two threads are running.
@@ -11742,7 +11742,7 @@ In LKMC we build `m5` with:
The `m5` executable can be run on <<user-mode-simulation>> as normal with: The `m5` executable can be run on <<user-mode-simulation>> as normal with:
.... ....
./run --arch aarch64 --emulator gem5 --userland "$(./getvar --arch aarch64 out_rootfs_overlay_bin_dir)/m5" --userland-args dumpstats ./run --arch aarch64 --emulator gem5 --userland "$(./getvar --arch aarch64 out_rootfs_overlay_bin_dir)/m5" --cli-args dumpstats
.... ....
This can be a good test <<m5ops>> since it executes very quickly. This can be a good test <<m5ops>> since it executes very quickly.
@@ -12250,7 +12250,7 @@ TODO what is the advantage? The generated file for `--stats-file h5://stats.h5?d
We then try to see if it is any better when you have a bunch of dump events: We then try to see if it is any better when you have a bunch of dump events:
.... ....
./run --arch aarch64 --emulator gem5 --userland userland/c/m5ops.c --userland-args 'd 1000' ./run --arch aarch64 --emulator gem5 --userland userland/c/m5ops.c --cli-args 'd 1000'
.... ....
and there yes, we see that the file size fell from 39MB on `stats.txt` to 3.2MB on `stats.m5`, so the increase observed previously was just due to some initial size overhead (considering the patched gem5 with no spaces in the text file). and there yes, we see that the file size fell from 39MB on `stats.txt` to 3.2MB on `stats.m5`, so the increase observed previously was just due to some initial size overhead (considering the patched gem5 with no spaces in the text file).
@@ -12594,7 +12594,7 @@ The message also shows on <<user-mode-simulation>> deadlocks, for example in lin
./run \ ./run \
--emulator gem5 \ --emulator gem5 \
--userland userland/posix/pthread_deadlock.c \ --userland userland/posix/pthread_deadlock.c \
--userland-args 1 \ --cli-args 1 \
; ;
.... ....
@@ -16017,7 +16017,7 @@ or:
Non-interactive usage: Non-interactive usage:
.... ....
./run --userland "$(./getvar buildroot_target_dir)/usr/bin/python3" --userland-args rootfs_overlay/lkmc/python/hello.py ./run --userland "$(./getvar buildroot_target_dir)/usr/bin/python3" --cli-args rootfs_overlay/lkmc/python/hello.py
.... ....
===== Python gem5 user mode simulation ===== Python gem5 user mode simulation
@@ -16028,7 +16028,7 @@ At LKMC 50ac89b779363774325c81157ec8b9a6bdb50a2f gem5 390a74f59934b85d91489f8a56
./run \ ./run \
--emulator gem5 \ --emulator gem5 \
--userland "$(buildroot_target_dir)/usr/bin/python3" \ --userland "$(buildroot_target_dir)/usr/bin/python3" \
--userland-args rootfs_overlay/lkmc/python/hello.py \ --cli-args rootfs_overlay/lkmc/python/hello.py \
; ;
.... ....
@@ -16047,7 +16047,7 @@ and aarch64:
--arch aarch64 \ --arch aarch64 \
--emulator gem5 \ --emulator gem5 \
--userland "$(./getvar --arch aarch64 buildroot_target_dir)/usr/bin/python3" \ --userland "$(./getvar --arch aarch64 buildroot_target_dir)/usr/bin/python3" \
--userland-args rootfs_overlay/lkmc/python/hello.py \ --cli-args rootfs_overlay/lkmc/python/hello.py \
; ;
.... ....
@@ -16285,7 +16285,7 @@ To benchmark on gem5, we first build the benchmark with <<m5ops-instructions>> e
--arch x86_64 \ --arch x86_64 \
--emulator gem5 \ --emulator gem5 \
--userland userland/cpp/bst_vs_heap_vs_hashmap.cpp \ --userland userland/cpp/bst_vs_heap_vs_hashmap.cpp \
--userland-args='100000 1 0' \ --cli-args='100000 1 0' \
-- \ -- \
--cpu-type=DerivO3CPU \ --cpu-type=DerivO3CPU \
--caches \ --caches \
@@ -16432,7 +16432,7 @@ TODO automate run more nicely to dispense `getvar`.
Increase the number of loops to try and reach more meaningful results: Increase the number of loops to try and reach more meaningful results:
.... ....
./run --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone" --userland-args 100000000 ./run --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone" --cli-args 100000000
.... ....
Build and run on gem5 user mode: Build and run on gem5 user mode:
@@ -16522,14 +16522,14 @@ git submodule update --init submodules/stream-benchmark
Decrease the benchmark size and the retry count to finish simulation faster, but possibly have a less representative result: Decrease the benchmark size and the retry count to finish simulation faster, but possibly have a less representative result:
.... ....
./run --userland "$(./getvar userland_build_dir)/submodules/stream-benchmark/stream_c.exe" --userland-args '100 2' ./run --userland "$(./getvar userland_build_dir)/submodules/stream-benchmark/stream_c.exe" --cli-args '100 2'
.... ....
Build and run on gem5 user mode: Build and run on gem5 user mode:
.... ....
./build-stream --optimization-level 3 ./build-stream --optimization-level 3
./run --emulator gem5 --userland "$(./getvar userland_build_dir)/submodules/stream-benchmark/stream_c.exe" --userland-args '1000 2' ./run --emulator gem5 --userland "$(./getvar userland_build_dir)/submodules/stream-benchmark/stream_c.exe" --cli-args '1000 2'
.... ....
==== PARSEC benchmark ==== PARSEC benchmark
@@ -19503,11 +19503,39 @@ Those for example are required to implement `malloc` in Newlib. We can play with
./run --arch aarch64 --baremetal baremetal/linker_variables.c ./run --arch aarch64 --baremetal baremetal/linker_variables.c
.... ....
=== Baremetal command line arguments
QEMU currently supports baremetal CLI arguments! TODO do it for gem5 as well.
You can see them in action e.g. with:
....
./run --arch aarch64 --baremetal userland/c/command_line_arguments.c --cli-args 'aa bb cc'
./run --arch aarch64 --userland userland/c/command_line_arguments.c --cli-args 'aa bb cc'
....
both of which output the exact same thing:
....
aa
bb
cc
....
This is implemented by parsing the command line arguments and placing them into memory where the code will find them.
This works by:
* fixing the `argc` and `argv` addresses in memory in the <<baremetal-linker-script>>
* the <<baremetal-bootloaders>> pass those addresses correctly to the call of `main`
* our Python scripts write the desired binary memory values to a file
* QEMU loads those files into memory with `-device loader`: https://github.com/qemu/qemu/blob/60905286cb5150de854e08279bca7dfc4b549e91/docs/generic-loader.txt
It is worth noting that e.g. ARM has a <<semihosting>> mechanism for loading CLI arguments through `SYS_GET_CMDLINE`, but our mechanism works in principle for any ISA.
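The memory layout described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from this commit: the helper name `pack_baremetal_cli` and the example address `0x43000000` are assumptions made for the example, and unlike the `run` script it uses unsigned `struct` formats and writes no trailing NUL when there are no arguments.

```python
import shlex
import struct

# Hypothetical helper, for illustration only: it mirrors the layout
# argc | argv[0..N-1] pointers | NUL-terminated strings.
def pack_baremetal_cli(cli_args, argc_addr, int_size=4, address_size=8):
    # Unsigned little-endian formats for the two sizes used here.
    fmt = {4: 'I', 8: 'Q'}
    args = shlex.split(cli_args) if cli_args else []
    # The pointer array starts right after argc.
    argv_addr = argc_addr + int_size
    # The string data starts right after the pointer array.
    data_addr = argv_addr + len(args) * address_size
    pointers = []
    cur = data_addr
    for arg in args:
        pointers.append(struct.pack('<' + fmt[address_size], cur))
        # Each string is followed by one NUL terminator.
        cur += len(arg.encode()) + 1
    strings = b''.join(arg.encode() + b'\0' for arg in args)
    argc = struct.pack('<' + fmt[int_size], len(args))
    return argc + b''.join(pointers) + strings

# Assumed load address, chosen only for the example.
blob = pack_baremetal_cli('aa bb cc', 0x43000000)
print(blob.hex())
```

Writing `blob` to a file and loading it at the same address with `-device loader,addr=...,file=...,force-raw=on` is what makes the pointers inside the blob valid from the guest's point of view.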
=== Semihosting === Semihosting
Semihosting is a publicly documented interface specified by ARM Holdings that allows us to do some magic operations very useful in development. Semihosting is a publicly documented interface specified by ARM Holdings that allows us to do some magic operations very useful in development, such as writing to the terminal or reading and writing host files.
Semihosting is implemented both on some real devices and on simulators such as QEMU and <<gem5-semihosting>>.
It is documented at: https://developer.arm.com/docs/100863/latest/introduction It is documented at: https://developer.arm.com/docs/100863/latest/introduction
@@ -21323,7 +21351,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|gem5 busy loop |gem5 busy loop
|a18f28e263c91362519ef550150b5c9d75fa3679 + 1 |a18f28e263c91362519ef550150b5c9d75fa3679 + 1
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`./run --arch aarch64 --emulator gem5 --static --userland userland/gcc/busy_loop.c --userland-args 1000000` |`./run --arch aarch64 --emulator gem5 --static --userland userland/gcc/busy_loop.c --cli-args 1000000`
|10^6 |10^6
|18 |18
|2.4005699 * 10^7 |2.4005699 * 10^7
@@ -21332,7 +21360,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|gem5 busy loop for a debug build |gem5 busy loop for a debug build
|a18f28e263c91362519ef550150b5c9d75fa3679 + 1 |a18f28e263c91362519ef550150b5c9d75fa3679 + 1
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`./run --arch aarch64 --emulator gem5 --gem5-build-type debug --static --userland userland/gcc/busy_loop.c --userland-args 100000` |`./run --arch aarch64 --emulator gem5 --gem5-build-type debug --static --userland userland/gcc/busy_loop.c --cli-args 100000`
|10^5 |10^5
|33 |33
|2.405682 * 10^6 |2.405682 * 10^6
@@ -21341,7 +21369,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|gem5 busy loop for a fast build |gem5 busy loop for a fast build
|0d5a41a3f88fcd7ed40fc19474fe5aed0463663f + 1 |0d5a41a3f88fcd7ed40fc19474fe5aed0463663f + 1
|link:userland/gcc/busy_loop.c[] `-O0 -static` |link:userland/gcc/busy_loop.c[] `-O0 -static`
|`./run --arch aarch64 --emulator gem5 --gem5-build-type fast --static --userland userland/gcc/busy_loop.c --userland-args 1000000` |`./run --arch aarch64 --emulator gem5 --gem5-build-type fast --static --userland userland/gcc/busy_loop.c --cli-args 1000000`
|10^6 |10^6
|15 |15
|2.4005699 * 10^7 |2.4005699 * 10^7
@@ -21350,7 +21378,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|gem5 busy loop for a <<gem5-cpu-types,TimingSimpleCPU>> |gem5 busy loop for a <<gem5-cpu-types,TimingSimpleCPU>>
|a18f28e263c91362519ef550150b5c9d75fa3679 + 1 |a18f28e263c91362519ef550150b5c9d75fa3679 + 1
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`+./run --arch aarch64 --emulator gem5 --arch aarch64 --static --userland userland/gcc/busy_loop.c --userland-args 1000000 -- --cpu-type TimingSimpleCPU --caches+` |`+./run --arch aarch64 --emulator gem5 --arch aarch64 --static --userland userland/gcc/busy_loop.c --cli-args 1000000 -- --cpu-type TimingSimpleCPU --caches+`
|10^6 |10^6
|26 |26
|2.4005699 * 10^7 |2.4005699 * 10^7
@@ -21359,7 +21387,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|gem5 busy loop for a <<gem5-cpu-types,MinorCPU>> |gem5 busy loop for a <<gem5-cpu-types,MinorCPU>>
|a18f28e263c91362519ef550150b5c9d75fa3679 + 1 |a18f28e263c91362519ef550150b5c9d75fa3679 + 1
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`+./run --arch aarch64 --emulator gem5 --arch aarch64 --userland userland/gcc/busy_loop.c --userland-args 1000000 -- --cpu-type MinorCPU --caches+` |`+./run --arch aarch64 --emulator gem5 --arch aarch64 --userland userland/gcc/busy_loop.c --cli-args 1000000 -- --cpu-type MinorCPU --caches+`
|10^6 |10^6
|31 |31
|1.1018152 * 10^7 |1.1018152 * 10^7
@@ -21395,7 +21423,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
| |
|5d233f2664a78789f9907d27e2a40e86cefad595 |5d233f2664a78789f9907d27e2a40e86cefad595
|<<stream-benchmark>> `-O3` |<<stream-benchmark>> `-O3`
|`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --userland-args 1000000 --trace ExecAll` |`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --cli-args 1000000 --trace ExecAll`
|3 * 10^5 * 2 |3 * 10^5 * 2
|64 |64
|9.9674773 * 10^7 |9.9674773 * 10^7
@@ -21404,7 +21432,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|glibc C pre-main effects |glibc C pre-main effects
|ab6f7331406b22f8ab6e2df5f8b8e464fb35b611 |ab6f7331406b22f8ab6e2df5f8b8e464fb35b611
|link:userland/c/m5ops.c[] `-O0` |link:userland/c/m5ops.c[] `-O0`
|`gem5 --arch aarch64 --userland-args e` |`gem5 --arch aarch64 --cli-args e`
|1 |1
|2 |2
|1.26479 * 10^5 |1.26479 * 10^5
@@ -21413,7 +21441,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
| |
|ab6f7331406b22f8ab6e2df5f8b8e464fb35b611 |ab6f7331406b22f8ab6e2df5f8b8e464fb35b611
|glibc C pre-main link:userland/c/m5ops.c[] `-O0` |glibc C pre-main link:userland/c/m5ops.c[] `-O0`
|`gem5 --arch aarch64 --userland-args e --gem5-build-type debug` |`gem5 --arch aarch64 --cli-args e --gem5-build-type debug`
|1 |1
|2 |2
|1.26479 * 10^5 |1.26479 * 10^5
@@ -21422,7 +21450,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
| |
|ab6f7331406b22f8ab6e2df5f8b8e464fb35b611 |ab6f7331406b22f8ab6e2df5f8b8e464fb35b611
|glibc C++ pre-main link:userland/cpp/m5ops.cpp[] `-O0` |glibc C++ pre-main link:userland/cpp/m5ops.cpp[] `-O0`
|`gem5 --arch aarch64 --userland-args e` |`gem5 --arch aarch64 --cli-args e`
|1 |1
|2 |2
|2.385012 * 10^6 |2.385012 * 10^6
@@ -21431,7 +21459,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
| |
|ab6f7331406b22f8ab6e2df5f8b8e464fb35b611 |ab6f7331406b22f8ab6e2df5f8b8e464fb35b611
|glibc C++ pre-main link:userland/cpp/m5ops.cpp[] `-O0` |glibc C++ pre-main link:userland/cpp/m5ops.cpp[] `-O0`
|`gem5 --arch aarch64 --userland-args e --gem5-build-type debug` |`gem5 --arch aarch64 --cli-args e --gem5-build-type debug`
|1 |1
|25 |25
|2.385012 * 10^6 |2.385012 * 10^6
@@ -21458,7 +21486,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|Check the effect of an ExecAll log (log every instruction) on execution time, compare to analogous run without it. `trace.txt` size: 3.5GB. 5x slowdown observed with output to a hard disk. |Check the effect of an ExecAll log (log every instruction) on execution time, compare to analogous run without it. `trace.txt` size: 3.5GB. 5x slowdown observed with output to a hard disk.
|d29a07ddad499f273cc90dd66e40f8474b5dfc40 |d29a07ddad499f273cc90dd66e40f8474b5dfc40
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --userland-args 1000000 --gem5-worktree master --trace ExecAll` |`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --cli-args 1000000 --gem5-worktree master --trace ExecAll`
|10^6 |10^6
|2.4106774 * 10^7 |2.4106774 * 10^7
|136 |136
@@ -21467,7 +21495,7 @@ Summary of manually collected results on <<p51>> at LKMC a18f28e263c91362519ef55
|Same as above but with run command manually hacked to output to a ramfs. Slightly faster, but the bulk was still just in log format operations! |Same as above but with run command manually hacked to output to a ramfs. Slightly faster, but the bulk was still just in log format operations!
|d29a07ddad499f273cc90dd66e40f8474b5dfc40 |d29a07ddad499f273cc90dd66e40f8474b5dfc40
|link:userland/gcc/busy_loop.c[] `-O0` |link:userland/gcc/busy_loop.c[] `-O0`
|`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --userland-args 1000000 --gem5-worktree master --trace ExecAll` |`./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --cli-args 1000000 --gem5-worktree master --trace ExecAll`
|10^6 |10^6
|2.4106774 * 10^7 |2.4106774 * 10^7
|107 |107
@@ -21480,7 +21508,7 @@ The first step is to determine a number of loops that will run long enough to ha
On our <<p51>> machine, we found 10^7 (10 million == 1000 times 10000) loops to be a good number for a gem5 atomic simulation: On our <<p51>> machine, we found 10^7 (10 million == 1000 times 10000) loops to be a good number for a gem5 atomic simulation:
.... ....
./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --userland-args '1 10000000' ./run --arch aarch64 --emulator gem5 --userland userland/gcc/busy_loop.c --cli-args '1 10000000'
./gem5-stat --arch aarch64 sim_insts ./gem5-stat --arch aarch64 sim_insts
.... ....
@@ -21583,7 +21611,7 @@ time \
--arch arm \ --arch arm \
--emulator gem5 \ --emulator gem5 \
--userland "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" \ --userland "$(./getvar --arch arm buildroot_build_build_dir)/dhrystone-2/dhrystone" \
--userland-args 'asdf qwer' \ --cli-args 'asdf qwer' \
; ;
.... ....


@@ -27,8 +27,10 @@ _start:
adr x0, lkmc_baremetal_on_exit_callback adr x0, lkmc_baremetal_on_exit_callback
bl on_exit bl on_exit
/* Run main. */ /* Setup CLI arguments and run main. */
mov x0, 0 ldr x0, =lkmc_argc
ldr x0, [x0]
ldr x1, =lkmc_argv
bl main bl main
/* If main returns, exit. */ /* If main returns, exit. */


@@ -14,15 +14,19 @@ SECTIONS
/* Fix the addresses of everything that comes after, no matter /* Fix the addresses of everything that comes after, no matter
* the exact size of the code present in .text. This allows us to * the exact size of the code present in .text. This allows us to
* place CLI arguments in memory at a known location! */ * place CLI arguments in memory at a known location! */
/* TODO would be better like this with --section-start=.lkmc_memory= on CLI,
* so that Python controls this value, but I can't get that fucking working.
* baremetal_max_size from the Python must match this offset for now.
*/
/*. = SEGMENT_START(.lkmc_memory, .);*/
. = ADDR(.text) + 0x1000000; . = ADDR(.text) + 0x1000000;
lkmc_heap_low = .; lkmc_heap_low = .;
. = . + 0x1000000; . = . + 0x1000000;
lkmc_heap_top = .; lkmc_heap_top = .;
. = . + 0x1000000; . = . + 0x1000000;
lkmc_stack_top = .; lkmc_stack_top = .;
. = . + 0x1000000;
lkmc_argv = .;
. = . + 0x4;
lkmc_argc = .; lkmc_argc = .;
. = . + 0x4;
lkmc_argv = .;
} }
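Since the linker script offsets and the Python constants must be kept in sync by hand, a small consistency check can help. The sketch below is an assumption-laden illustration, not part of the commit: it reads the new linker script as placing `lkmc_argc` directly at `lkmc_stack_top` with `lkmc_argv` 4 bytes later, and the function names `linker_layout` and `python_layout` are hypothetical.

```python
# Constants as they appear in the diff.
BAREMETAL_MAX_TEXT_SIZE = 0x1000000
BAREMETAL_MEMORY_SIZE = 0x2000000
INT_SIZE = 4

def linker_layout(text_addr):
    # Follows the location counter arithmetic of the linker script,
    # assuming argc sits exactly at lkmc_stack_top.
    heap_low = text_addr + 0x1000000
    heap_top = heap_low + 0x1000000
    stack_top = heap_top + 0x1000000
    argc = stack_top
    argv = argc + 0x4
    return {'lkmc_argc': argc, 'lkmc_argv': argv}

def python_layout(entry_address):
    # Follows the address computation done in the run script.
    argc = entry_address + BAREMETAL_MAX_TEXT_SIZE + BAREMETAL_MEMORY_SIZE
    argv = argc + INT_SIZE
    return {'lkmc_argc': argc, 'lkmc_argv': argv}

# QEMU entry address from the diff.
entry = 0x40000000
assert linker_layout(entry) == python_layout(entry)
print({name: hex(addr) for name, addr in python_layout(entry).items()})
```

If either side changes an offset without the other, the assertion fails, which is cheaper to debug than a guest that reads garbage for `argc`.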


@@ -10,7 +10,9 @@ extern int32_t lkmc_heap_low;
int main(int argc, char **argv) { int main(int argc, char **argv) {
(void)argc; (void)argc;
(void)argv; (void)argv;
printf("&lkmc_heap_low %p\n", (void*)&lkmc_heap_low); printf("&lkmc_heap_low %p\n", (void *)&lkmc_heap_low);
printf("&lkmc_argc %p\n", (void*)&lkmc_argc); printf("&lkmc_argc %p\n", (void *)&lkmc_argc);
printf("argc %d\n", argc);
printf("argv %p\n", (void *)argv);
printf("lkmc_argc %" PRId32 "\n", lkmc_argc); printf("lkmc_argc %" PRId32 "\n", lkmc_argc);
} }


@@ -56,23 +56,13 @@ Build the baremetal examples with crosstool-NG.
'-mfpu=crypto-neon-fp-armv8', LF, '-mfpu=crypto-neon-fp-armv8', LF,
]) ])
if self.env['emulator'] == 'gem5': if self.env['emulator'] == 'gem5':
if self.env['machine'] == 'VExpress_GEM5_V1':
entry_address = 0x80000000
uart_address = 0x1c090000
elif self.env['machine'] == 'RealViewPBX':
entry_address = 0x10000
uart_address = 0x10009000
else:
raise Exception('unknown machine: ' + self.env['machine'])
cc_flags.extend([ cc_flags.extend([
'-DLKMC_GEM5=1', LF, '-DLKMC_GEM5=1', LF,
'-DLKMC_M5OPS_ENABLE=1', LF, '-DLKMC_M5OPS_ENABLE=1', LF,
]) ])
else: else:
entry_address = 0x40000000
uart_address = 0x09000000
cc_flags.extend(['-D', 'LKMC_QEMU=1', LF]) cc_flags.extend(['-D', 'LKMC_QEMU=1', LF])
cc_flags.extend(['-D', 'LKMC_UART0_ADDR={:#x}'.format(uart_address), LF]) cc_flags.extend(['-D', 'LKMC_UART0_ADDR={:#x}'.format(self.env['uart_address']), LF])
cc_flags.extend(self.sh.shlex_split(self.env['ccflags'])) cc_flags.extend(self.sh.shlex_split(self.env['ccflags']))
bootloader_src = os.path.join( bootloader_src = os.path.join(
self.env['baremetal_source_lib_dir'], self.env['baremetal_source_lib_dir'],
@@ -101,8 +91,8 @@ Build the baremetal examples with crosstool-NG.
link=False, link=False,
) )
cc_flags.extend([ cc_flags.extend([
'-Wl,--section-start=.text={:#x}'.format(entry_address), LF, '-Wl,--section-start=.text={:#x}'.format(self.env['entry_address']), LF,
'-Wl,--section-start=.lkmc_memory={:#x}'.format(entry_address + 0x1000000), LF, '-Wl,--section-start=.lkmc_memory={:#x}'.format(self.env['entry_address'] + 0x1000000), LF,
'-T', self.env['baremetal_link_script'], LF, '-T', self.env['baremetal_link_script'], LF,
]) ])
with thread_pool.ThreadPool( with thread_pool.ThreadPool(


@@ -72,6 +72,8 @@ consts['userland_source_dir'] = os.path.join(consts['root_dir'], consts['userlan
consts['userland_source_arch_dir'] = os.path.join(consts['userland_source_dir'], 'arch') consts['userland_source_arch_dir'] = os.path.join(consts['userland_source_dir'], 'arch')
consts['userland_executable_ext'] = '.out' consts['userland_executable_ext'] = '.out'
consts['baremetal_executable_ext'] = '.elf' consts['baremetal_executable_ext'] = '.elf'
consts['baremetal_max_text_size'] = 0x1000000
consts['baremetal_memory_size'] = 0x2000000
consts['include_subdir'] = consts['repo_short_id'] consts['include_subdir'] = consts['repo_short_id']
consts['include_source_dir'] = os.path.join(consts['root_dir'], consts['include_subdir']) consts['include_source_dir'] = os.path.join(consts['root_dir'], consts['include_subdir'])
consts['submodules_dir'] = os.path.join(consts['root_dir'], 'submodules') consts['submodules_dir'] = os.path.join(consts['root_dir'], 'submodules')
@@ -587,14 +589,14 @@ https://cirosantilli.com/linux-kernel-module-cheat#user-mode-static-executables
help='''\ help='''\
Run the given userland executable in user mode instead of booting the Linux kernel Run the given userland executable in user mode instead of booting the Linux kernel
in full system mode. In gem5, user mode is called Syscall Emulation (SE) mode and in full system mode. In gem5, user mode is called Syscall Emulation (SE) mode and
uses se.py. uses se.py. Path resolution is similar to --baremetal.
Path resolution is similar to --baremetal.
''' '''
) )
self.add_argument( self.add_argument(
'--userland-args', '--cli-args',
help='''\ help='''\
CLI arguments to pass to the userland executable. CLI arguments used in both --userland mode simulation, and in --baremetal. See also:
https://cirosantilli.com/linux-kernel-module-cheat#baremetal-command-line-arguments
''' '''
) )
self.add_argument( self.add_argument(
@@ -772,7 +774,9 @@ Incompatible archs are skipped.
# + # +
# We do this because QEMU does not add all possible Cortex Axx, there are # We do this because QEMU does not add all possible Cortex Axx, there are
# just too many, and gem5 does not allow selecting lower feature in general. # just too many, and gem5 does not allow selecting lower feature in general.
env['int_size'] = 4
if env['arch'] == 'arm': if env['arch'] == 'arm':
env['address_size'] = 4
env['armv'] = 7 env['armv'] = 7
env['buildroot_toolchain_prefix'] = 'arm-buildroot-linux-gnueabihf' env['buildroot_toolchain_prefix'] = 'arm-buildroot-linux-gnueabihf'
env['crosstool_ng_toolchain_prefix'] = 'arm-unknown-eabi' env['crosstool_ng_toolchain_prefix'] = 'arm-unknown-eabi'
@@ -781,6 +785,7 @@ Incompatible archs are skipped.
if not env['_args_given']['march']: if not env['_args_given']['march']:
env['march'] = 'armv8-a' env['march'] = 'armv8-a'
elif env['arch'] == 'aarch64': elif env['arch'] == 'aarch64':
env['address_size'] = 8
env['armv'] = 8 env['armv'] = 8
env['buildroot_toolchain_prefix'] = 'aarch64-buildroot-linux-gnu' env['buildroot_toolchain_prefix'] = 'aarch64-buildroot-linux-gnu'
env['crosstool_ng_toolchain_prefix'] = 'aarch64-unknown-elf' env['crosstool_ng_toolchain_prefix'] = 'aarch64-unknown-elf'
@@ -789,6 +794,7 @@ Incompatible archs are skipped.
if not env['_args_given']['march']: if not env['_args_given']['march']:
env['march'] = 'armv8-a+lse' env['march'] = 'armv8-a+lse'
elif env['arch'] == 'x86_64': elif env['arch'] == 'x86_64':
env['address_size'] = 8
env['crosstool_ng_toolchain_prefix'] = 'x86_64-unknown-elf' env['crosstool_ng_toolchain_prefix'] = 'x86_64-unknown-elf'
env['gem5_arch'] = 'X86' env['gem5_arch'] = 'X86'
env['buildroot_toolchain_prefix'] = 'x86_64-buildroot-linux-gnu' env['buildroot_toolchain_prefix'] = 'x86_64-buildroot-linux-gnu'
@@ -1056,6 +1062,18 @@ Incompatible archs are skipped.
self.env['baremetal_build_lib_dir'], self.env['baremetal_build_lib_dir'],
env['baremetal_syscalls_basename_noext'] + '_asm' + self.env['obj_ext'] env['baremetal_syscalls_basename_noext'] + '_asm' + self.env['obj_ext']
) )
if env['emulator'] == 'gem5':
if env['machine'] == 'VExpress_GEM5_V1':
env['entry_address'] = 0x80000000
env['uart_address'] = 0x1c090000
elif self.env['machine'] == 'RealViewPBX':
env['entry_address'] = 0x10000
env['uart_address'] = 0x10009000
else:
raise Exception('unknown machine: ' + self.env['machine'])
else:
env['entry_address'] = 0x40000000
env['uart_address'] = 0x09000000
# Userland / baremetal common source. # Userland / baremetal common source.
env['common_basename_noext'] = env['repo_short_id'] env['common_basename_noext'] = env['repo_short_id']
@@ -1229,6 +1247,15 @@ lunch aosp_{}-eng
) )
) )
@staticmethod
def python_struct_int_format(size):
if size == 4:
return 'i'
elif size == 8:
return 'Q'
else:
raise Exception('unknown size {}'.format(size))
def get_elf_entry(self, elf_file_path): def get_elf_entry(self, elf_file_path):
readelf_header = self.sh.check_output([ readelf_header = self.sh.check_output([
self.get_toolchain_tool('readelf'), self.get_toolchain_tool('readelf'),

46
run

@@ -4,6 +4,7 @@ import os
import re import re
import shlex import shlex
import shutil import shutil
import struct
import subprocess import subprocess
import sys import sys
import time import time
@@ -517,8 +518,8 @@ Extra options to append at the end of the emulator command line.
# "KeyError: 'workload'" # "KeyError: 'workload'"
'--param', 'system.cpu[0].workload[:].release = "{}"'.format(self.env['kernel_version']), LF, '--param', 'system.cpu[0].workload[:].release = "{}"'.format(self.env['kernel_version']), LF,
]) ])
if self.env['userland_args'] is not None: if self.env['cli_args'] is not None:
cmd.extend(['--options', self.env['userland_args'], LF]) cmd.extend(['--options', self.env['cli_args'], LF])
if not self.env['static']: if not self.env['static']:
for path in self.env['userland_library_redirects']: for path in self.env['userland_library_redirects']:
cmd.extend([ cmd.extend([
@@ -725,6 +726,43 @@ Extra options to append at the end of the emulator command line.
) )
if self.env['dtb'] is not None: if self.env['dtb'] is not None:
cmd.extend(['-dtb', self.env['dtb'], LF]) cmd.extend(['-dtb', self.env['dtb'], LF])
if self.env['baremetal'] is not None:
# Setup CLI arguments into a single raw binary file to be loaded into memory.
# The memory setup of that file is:
# argc
# argv[0] pointer
# argv[1] pointer
# ...
# argv[N] pointer
# argv[0][0] data
# argv[0][1] data
# ...
# argv[1][0] data
# argv[1][1] data
# ...
if self.env['cli_args'] is not None:
cli_args_split = shlex.split(self.env['cli_args'])
else:
cli_args_split = []
argc_addr = self.env['entry_address'] + self.env['baremetal_max_text_size'] + self.env['baremetal_memory_size']
argv_addr = argc_addr + self.env['int_size']
argv_data_addr = argv_addr + len(cli_args_split) * self.env['address_size']
argv_addr_data = []
argv_addr_cur = argv_data_addr
for arg in cli_args_split:
argv_addr_data.append(struct.pack('<{}'.format(self.python_struct_int_format(self.env['address_size'])), argv_addr_cur))
argv_addr_cur += len(arg) + 1
baremetal_cli_path = os.path.join(self.env['run_dir'], 'baremetal_cli.raw')
with open(baremetal_cli_path, 'wb') as f:
f.write(struct.pack('<{}'.format(self.python_struct_int_format(self.env['int_size'])), len(cli_args_split)))
f.write(b''.join(argv_addr_data))
f.write(b'\0'.join(arg.encode() for arg in cli_args_split) + b'\0')
cmd.extend([
'-device', 'loader,addr={},file={},force-raw=on'.format(
hex(argc_addr),
baremetal_cli_path,
), LF,
])
if not self.env['qemu_which'] == 'host': if not self.env['qemu_which'] == 'host':
cmd.extend(qemu_user_and_system_options) cmd.extend(qemu_user_and_system_options)
if self.env['initrd']: if self.env['initrd']:
@@ -834,8 +872,8 @@ Extra options to append at the end of the emulator command line.
if self.env['userland'] and self.env['emulator'] in ('qemu', 'native'): if self.env['userland'] and self.env['emulator'] in ('qemu', 'native'):
# The program and arguments must come at the very end of the CLI. # The program and arguments must come at the very end of the CLI.
cmd.extend([self.env['image'], LF]) cmd.extend([self.env['image'], LF])
if self.env['userland_args'] is not None: if self.env['cli_args'] is not None:
cmd.extend(self.sh.shlex_split(self.env['userland_args'])) cmd.extend(self.sh.shlex_split(self.env['cli_args']))
if debug_vm or self.env['terminal']: if debug_vm or self.env['terminal']:
out_file = None out_file = None
else: else: