mirror of
https://github.com/cirosantilli/linux-kernel-module-cheat.git
synced 2026-01-25 03:01:36 +01:00
This commit is contained in:
266
index.html
@@ -1353,17 +1353,19 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
|
||||
</li>
|
||||
<li><a href="#benchmarks">21.8. Benchmarks</a>
|
||||
<ul class="sectlevel3">
|
||||
<li><a href="#dhrystone">21.8.1. Dhrystone</a></li>
|
||||
<li><a href="#stream-benchmark">21.8.2. STREAM benchmark</a></li>
|
||||
<li><a href="#parsec-benchmark">21.8.3. PARSEC benchmark</a>
|
||||
<li><a href="#boost">21.8.1. Boost</a></li>
|
||||
<li><a href="#dhrystone">21.8.2. Dhrystone</a></li>
|
||||
<li><a href="#stream-benchmark">21.8.3. STREAM benchmark</a></li>
|
||||
<li><a href="#parsec-benchmark">21.8.4. PARSEC benchmark</a>
|
||||
<ul class="sectlevel4">
|
||||
<li><a href="#parsec-benchmark-without-parsecmgmt">21.8.3.1. PARSEC benchmark without parsecmgmt</a></li>
|
||||
<li><a href="#parsec-change-the-input-size">21.8.3.2. PARSEC change the input size</a></li>
|
||||
<li><a href="#parsec-benchmark-with-parsecmgmt">21.8.3.3. PARSEC benchmark with parsecmgmt</a></li>
|
||||
<li><a href="#parsec-uninstall">21.8.3.4. PARSEC uninstall</a></li>
|
||||
<li><a href="#parsec-benchmark-hacking">21.8.3.5. PARSEC benchmark hacking</a></li>
|
||||
<li><a href="#parsec-benchmark-without-parsecmgmt">21.8.4.1. PARSEC benchmark without parsecmgmt</a></li>
|
||||
<li><a href="#parsec-change-the-input-size">21.8.4.2. PARSEC change the input size</a></li>
|
||||
<li><a href="#parsec-benchmark-with-parsecmgmt">21.8.4.3. PARSEC benchmark with parsecmgmt</a></li>
|
||||
<li><a href="#parsec-uninstall">21.8.4.4. PARSEC uninstall</a></li>
|
||||
<li><a href="#parsec-benchmark-hacking">21.8.4.5. PARSEC benchmark hacking</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#userland-libs-directory">21.8.5. userland/libs directory</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#userland-content-bibliography">21.9. Userland content bibliography</a></li>
|
||||
@@ -1668,7 +1670,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
|
||||
<li><a href="#arm-fadd-vs-vadd">24.6.3.2.1. ARM FADD vs VADD</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#armv8-aarch64-ld2-instruction">24.6.3.3. ARMv8 aarch64 ld2 instruction</a></li>
|
||||
<li><a href="#armv8-aarch64-ld2-instruction">24.6.3.3. ARMv8 aarch64 LD2 instruction</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#arm-simd-bibliography">24.6.4. ARM SIMD bibliography</a></li>
|
||||
@@ -1755,9 +1757,11 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
|
||||
<ul class="sectlevel4">
|
||||
<li><a href="#arm-wfe-and-sev-instructions">27.8.3.1. ARM WFE and SEV instructions</a>
|
||||
<ul class="sectlevel5">
|
||||
<li><a href="#wfe-from-userland">27.8.3.1.1. WFE from userland</a></li>
|
||||
<li><a href="#gem5-arm-wfe">27.8.3.1.2. gem5 ARM WFE</a></li>
|
||||
<li><a href="#arm-yield-instruction">27.8.3.1.3. ARM YIELD instruction</a></li>
|
||||
<li><a href="#arm-wfe-global-monitor-events">27.8.3.1.1. ARM WFE global monitor events</a></li>
|
||||
<li><a href="#wfe-from-userland">27.8.3.1.2. WFE from userland</a></li>
|
||||
<li><a href="#armv8-spinlock-pattern">27.8.3.1.3. ARMv8 spinlock pattern</a></li>
|
||||
<li><a href="#gem5-arm-wfe">27.8.3.1.4. gem5 ARM WFE</a></li>
|
||||
<li><a href="#arm-yield-instruction">27.8.3.1.5. ARM YIELD instruction</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><a href="#arm-ldaxr-and-stlxr-instructions">27.8.3.2. ARM LDAXR and STLXR instructions</a></li>
|
||||
@@ -1799,7 +1803,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
|
||||
</li>
|
||||
<li><a href="#benchmark-this-repo">29. Benchmark this repo</a>
|
||||
<ul class="sectlevel2">
|
||||
<li><a href="#continuous-integraion">29.1. Continuous integraion</a>
|
||||
<li><a href="#continuous-integration">29.1. Continuous integration</a>
|
||||
<ul class="sectlevel3">
|
||||
<li><a href="#travis">29.1.1. Travis</a></li>
|
||||
<li><a href="#circleci">29.1.2. CircleCI</a></li>
|
||||
@@ -3514,7 +3518,7 @@ cd userland
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>As mentioned at <a href="#user-mode-tests">User mode tests</a>, tests under <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs">userland/libs</a> require certain optional libraries to be installed, and are not built or tested by default.</p>
|
||||
<p>As mentioned at <a href="#userland-libs-directory">userland/libs directory</a>, tests under <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs">userland/libs</a> require certain optional libraries to be installed, and are not built or tested by default.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>You can install those libraries with:</p>
|
||||
@@ -7389,7 +7393,7 @@ qw er</pre>
|
||||
<p>tests that require user interaction</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>tests that take perceptible ammounts of time</p>
|
||||
<p>tests that take perceptible amounts of time</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>known bugs we didn’t have time to fix ;-)</p>
|
||||
@@ -7397,7 +7401,7 @@ qw er</pre>
|
||||
</ul>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Tests under <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs/">userland/libs/</a> depend on certain libraries being available on the target, e.g. <a href="#blas">BLAS</a> for <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs/openblas">userland/libs/openblas</a>. They are not run by default, but can be enabled with <code>--package</code> and <code>--package-all</code>.</p>
|
||||
<p>Tests under <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs/">userland/libs/</a> are only run if <code>--package</code> or <code>--package-all</code> are given as described at <a href="#userland-libs-directory">userland/libs directory</a>.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>The gem5 tests require building statically with build id <code>static</code>, see also: <a href="#gem5-syscall-emulation-mode">Section 10.7, “gem5 syscall emulation mode”</a>. TODO automate this better.</p>
|
||||
@@ -16814,7 +16818,10 @@ run
|
||||
<p>The build outputs are automatically stored in different directories for optimized and debug builds, which prevents <code>debug</code> files from overwriting <code>opt</code> ones. Therefore, <code>--gem5-build-id</code> is not required.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>The price to pay for debuggability is high however: a Linux kernel boot was about 3x slower in QEMU and 14 times slower in gem5 debug compared to opt, see benchmarks at: <a href="#benchmark-linux-kernel-boot">Section 29.2.1, “Benchmark Linux kernel boot”</a></p>
|
||||
<p>The price to pay for debuggability is high however: a Linux kernel boot was about 3x slower in QEMU and 14 times slower in gem5 debug compared to opt, see benchmarks at: <a href="#benchmark-linux-kernel-boot">Section 29.2.1, “Benchmark Linux kernel boot”</a>.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Similar slowdowns can be observed at: <a href="#benchmark-emulators-on-userland-executables">Section 29.2.2, “Benchmark emulators on userland executables”</a>.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>When in <a href="#qemu-text-mode">QEMU text mode</a>, using <code>--debug-vm</code> makes Ctrl-C not get passed to the QEMU guest anymore: it is instead captured by GDB itself, to allow breaking. So e.g. you won’t be able to easily quit from a guest program like:</p>
|
||||
@@ -16839,7 +16846,7 @@ run
|
||||
<p>While GDB "has" this feature, it is just too broken to be usable, and so we expose the amazing Mozilla RR tool conveniently in this repo: <a href="https://stackoverflow.com/questions/1470434/how-does-reverse-debugging-work/53063242#53063242" class="bare">https://stackoverflow.com/questions/1470434/how-does-reverse-debugging-work/53063242#53063242</a></p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Before the first usage:</p>
|
||||
<p>Before the first usage, set up rr with:</p>
|
||||
</div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
@@ -16856,7 +16863,17 @@ sudo sysctl -p</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>This will first run the program once until completion, and then restart the program at the very first instruction at <code>_start</code> and leave you in a GDB shell.</p>
|
||||
<p>This will:</p>
|
||||
</div>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p>first run the program once until completion or crash</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>then restart the program at the very first instruction at <code>_start</code> and leave you in a GDB shell</p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>From there, run the program until your point of interest, e.g.:</p>
|
||||
@@ -16879,6 +16896,14 @@ continue</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>The use case of <code>rr</code> is often to go to the final crash and then walk back from there, so you often want to automate running until the end after record with <code>--debug-vm-args</code> as in:</p>
|
||||
</div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre>./run --debug-vm-args='-ex continue' --debug-vm-rr --userland userland/c/hello.c</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Programs often tend to blow up in very low frames that use values passed in from higher frames. In those cases, remember that just like with forward debugging, you can’t just go:</p>
|
||||
</div>
|
||||
<div class="literalblock">
|
||||
@@ -22101,7 +22126,7 @@ make menuconfig</pre>
|
||||
<p>Also mentioned at: <a href="https://stackoverflow.com/questions/47320800/how-to-clean-only-target-in-buildroot" class="bare">https://stackoverflow.com/questions/47320800/how-to-clean-only-target-in-buildroot</a></p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>See this for a sample manual workaround: <a href="#parsec-uninstall">Section 21.8.3.4, “PARSEC uninstall”</a>.</p>
|
||||
<p>See this for a sample manual workaround: <a href="#parsec-uninstall">Section 21.8.4.4, “PARSEC uninstall”</a>.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect2">
|
||||
@@ -22982,6 +23007,26 @@ echo 1 > /proc/sys/vm/overcommit_memory
|
||||
</ul>
|
||||
</div>
|
||||
</li>
|
||||
<li>
|
||||
<p>containers</p>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p>associative</p>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p><a href="#algorithms">Algorithms</a> contains a benchmark comparison of different C++ containers</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/cpp/set.cpp">userland/cpp/set.cpp</a>: <code>std::set</code> contains unique keys</p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
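<div class="paragraph">
<p>The "unique keys" property of <code>std::set</code> mentioned above can be sketched as follows. This is a minimal illustration and is not taken from <code>userland/cpp/set.cpp</code>; the helper names <code>count_unique</code> and <code>insert_reports_new</code> are hypothetical.</p>
</div>

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: count distinct values by copying them into a std::set,
// which silently drops keys that are already present.
std::size_t count_unique(const std::vector<int>& values) {
    std::set<int> unique(values.begin(), values.end());
    return unique.size();
}

// Hypothetical helper: std::set::insert returns a pair whose second member
// is true only if the key was newly inserted.
bool insert_reports_new(std::set<int>& s, int key) {
    return s.insert(key).second;
}
```

<div class="paragraph">
<p>For example, <code>count_unique({3, 1, 3, 2, 1})</code> evaluates to 3, and inserting the same key twice with <code>insert_reports_new</code> reports <code>true</code> and then <code>false</code>.</p>
</div>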
|
||||
<div class="sect3">
|
||||
@@ -24220,7 +24265,23 @@ cblas_dgemm( CblasColMajor, CblasNoTrans, CblasTrans,3,3,2 ,1, A,3, B,
|
||||
</ul>
|
||||
</div>
|
||||
<div class="sect3">
|
||||
<h4 id="dhrystone"><a class="anchor" href="#dhrystone"></a><a class="link" href="#dhrystone">21.8.1. Dhrystone</a></h4>
|
||||
<h4 id="boost"><a class="anchor" href="#boost"></a><a class="link" href="#boost">21.8.1. Boost</a></h4>
|
||||
<div class="paragraph">
|
||||
<p><a href="https://en.wikipedia.org/wiki/Boost_(C%2B%2B_libraries)" class="bare">https://en.wikipedia.org/wiki/Boost_(C%2B%2B_libraries)</a></p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs/boost">userland/libs/boost</a></p>
|
||||
</div>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs/boost/bimap.cpp">userland/libs/boost/bimap.cpp</a></p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect3">
|
||||
<h4 id="dhrystone"><a class="anchor" href="#dhrystone"></a><a class="link" href="#dhrystone">21.8.2. Dhrystone</a></h4>
|
||||
<div class="paragraph">
|
||||
<p><a href="https://en.wikipedia.org/wiki/Dhrystone" class="bare">https://en.wikipedia.org/wiki/Dhrystone</a></p>
|
||||
</div>
|
||||
@@ -24317,7 +24378,7 @@ cblas_dgemm( CblasColMajor, CblasNoTrans, CblasTrans,3,3,2 ,1, A,3, B,
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect3">
|
||||
<h4 id="stream-benchmark"><a class="anchor" href="#stream-benchmark"></a><a class="link" href="#stream-benchmark">21.8.2. STREAM benchmark</a></h4>
|
||||
<h4 id="stream-benchmark"><a class="anchor" href="#stream-benchmark"></a><a class="link" href="#stream-benchmark">21.8.3. STREAM benchmark</a></h4>
|
||||
<div class="paragraph">
|
||||
<p><a href="http://www.cs.virginia.edu/stream/ref.html" class="bare">http://www.cs.virginia.edu/stream/ref.html</a></p>
|
||||
</div>
|
||||
@@ -24391,7 +24452,7 @@ times[3 * ntimes + k] = mysecond() - times[3 * ntimes + k];
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect3">
|
||||
<h4 id="parsec-benchmark"><a class="anchor" href="#parsec-benchmark"></a><a class="link" href="#parsec-benchmark">21.8.3. PARSEC benchmark</a></h4>
|
||||
<h4 id="parsec-benchmark"><a class="anchor" href="#parsec-benchmark"></a><a class="link" href="#parsec-benchmark">21.8.4. PARSEC benchmark</a></h4>
|
||||
<div class="paragraph">
|
||||
<p>We have ported parts of the <a href="http://parsec.cs.princeton.edu">PARSEC benchmark</a> for cross compilation at: <a href="https://github.com/cirosantilli/parsec-benchmark" class="bare">https://github.com/cirosantilli/parsec-benchmark</a>. See the documentation on that repo to find out which benchmarks have been ported. Some of the benchmarks segfault; these are documented in that repo.</p>
|
||||
</div>
|
||||
@@ -24409,7 +24470,7 @@ times[3 * ntimes + k] = mysecond() - times[3 * ntimes + k];
|
||||
</ul>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="parsec-benchmark-without-parsecmgmt"><a class="anchor" href="#parsec-benchmark-without-parsecmgmt"></a><a class="link" href="#parsec-benchmark-without-parsecmgmt">21.8.3.1. PARSEC benchmark without parsecmgmt</a></h5>
|
||||
<h5 id="parsec-benchmark-without-parsecmgmt"><a class="anchor" href="#parsec-benchmark-without-parsecmgmt"></a><a class="link" href="#parsec-benchmark-without-parsecmgmt">21.8.4.1. PARSEC benchmark without parsecmgmt</a></h5>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre>./build --arch arm --download-dependencies gem5-buildroot parsec-benchmark
|
||||
@@ -24443,7 +24504,7 @@ times[3 * ntimes + k] = mysecond() - times[3 * ntimes + k];
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="parsec-change-the-input-size"><a class="anchor" href="#parsec-change-the-input-size"></a><a class="link" href="#parsec-change-the-input-size">21.8.3.2. PARSEC change the input size</a></h5>
|
||||
<h5 id="parsec-change-the-input-size"><a class="anchor" href="#parsec-change-the-input-size"></a><a class="link" href="#parsec-change-the-input-size">21.8.4.2. PARSEC change the input size</a></h5>
|
||||
<div class="paragraph">
|
||||
<p>Running a benchmark of a size different than <code>test</code>, e.g. <code>simsmall</code>, requires a rebuild with:</p>
|
||||
</div>
|
||||
@@ -24507,7 +24568,7 @@ times[3 * ntimes + k] = mysecond() - times[3 * ntimes + k];
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="parsec-benchmark-with-parsecmgmt"><a class="anchor" href="#parsec-benchmark-with-parsecmgmt"></a><a class="link" href="#parsec-benchmark-with-parsecmgmt">21.8.3.3. PARSEC benchmark with parsecmgmt</a></h5>
|
||||
<h5 id="parsec-benchmark-with-parsecmgmt"><a class="anchor" href="#parsec-benchmark-with-parsecmgmt"></a><a class="link" href="#parsec-benchmark-with-parsecmgmt">21.8.4.3. PARSEC benchmark with parsecmgmt</a></h5>
|
||||
<div class="paragraph">
|
||||
<p>Most users won’t want to use this method because:</p>
|
||||
</div>
|
||||
@@ -24570,7 +24631,7 @@ parsecmgmt -a run -p splash2x.fmm -i test</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="parsec-uninstall"><a class="anchor" href="#parsec-uninstall"></a><a class="link" href="#parsec-uninstall">21.8.3.4. PARSEC uninstall</a></h5>
|
||||
<h5 id="parsec-uninstall"><a class="anchor" href="#parsec-uninstall"></a><a class="link" href="#parsec-uninstall">21.8.4.4. PARSEC uninstall</a></h5>
|
||||
<div class="paragraph">
|
||||
<p>If you want to remove PARSEC later, Buildroot doesn’t provide an automated package removal mechanism as mentioned at: <a href="#remove-buildroot-packages">Section 20.6, “Remove Buildroot packages”</a>, but the following procedure should be satisfactory:</p>
|
||||
</div>
|
||||
@@ -24588,7 +24649,7 @@ parsecmgmt -a run -p splash2x.fmm -i test</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="parsec-benchmark-hacking"><a class="anchor" href="#parsec-benchmark-hacking"></a><a class="link" href="#parsec-benchmark-hacking">21.8.3.5. PARSEC benchmark hacking</a></h5>
|
||||
<h5 id="parsec-benchmark-hacking"><a class="anchor" href="#parsec-benchmark-hacking"></a><a class="link" href="#parsec-benchmark-hacking">21.8.4.5. PARSEC benchmark hacking</a></h5>
|
||||
<div class="paragraph">
|
||||
<p>If you end up going inside <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/submodules/parsec-benchmark">submodules/parsec-benchmark</a> to hack up the benchmark (you will!), these tips will be helpful.</p>
|
||||
</div>
|
||||
@@ -24640,6 +24701,21 @@ git clean -xdf .</pre>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect3">
|
||||
<h4 id="userland-libs-directory"><a class="anchor" href="#userland-libs-directory"></a><a class="link" href="#userland-libs-directory">21.8.5. userland/libs directory</a></h4>
|
||||
<div class="paragraph">
|
||||
<p>Tests under <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/libs">userland/libs</a> require certain optional libraries to be installed on the target, and are not built or tested by default. You must enable them with either:</p>
|
||||
</div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre>--package <package>
|
||||
--package-all</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>See for example <a href="#blas">BLAS</a>.</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect2">
|
||||
<h3 id="userland-content-bibliography"><a class="anchor" href="#userland-content-bibliography"></a><a class="link" href="#userland-content-bibliography">21.9. Userland content bibliography</a></h3>
|
||||
@@ -29308,7 +29384,7 @@ AArch64, see Procedure Call Standard for the ARM 64-bit Architecture.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect4">
|
||||
<h5 id="armv8-aarch64-ld2-instruction"><a class="anchor" href="#armv8-aarch64-ld2-instruction"></a><a class="link" href="#armv8-aarch64-ld2-instruction">24.6.3.3. ARMv8 aarch64 ld2 instruction</a></h5>
|
||||
<h5 id="armv8-aarch64-ld2-instruction"><a class="anchor" href="#armv8-aarch64-ld2-instruction"></a><a class="link" href="#armv8-aarch64-ld2-instruction">24.6.3.3. ARMv8 aarch64 LD2 instruction</a></h5>
|
||||
<div class="paragraph">
|
||||
<p>Example: <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/ld2.S">userland/arch/aarch64/ld2.S</a></p>
|
||||
</div>
|
||||
@@ -31048,6 +31124,9 @@ IN: main
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/freestanding/linux/wfe.S">userland/arch/aarch64/freestanding/linux/wfe.S</a></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/freestanding/linux/sevl_wfe.S">userland/arch/aarch64/freestanding/linux/sevl_wfe.S</a></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/freestanding/linux/wfe_wfe.S">userland/arch/aarch64/freestanding/linux/wfe_wfe.S</a>: run WFE twice, because gem5 390a74f59934b85d91489f8a563450d8321b602d does not sleep on the first, see also: <a href="#gem5-arm-wfe">gem5 ARM WFE</a></p>
|
||||
</li>
|
||||
<li>
|
||||
@@ -31091,9 +31170,6 @@ IN: main
|
||||
<p>and power consumption is key in ARM applications.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>SEV is not the only thing that can wake up a WFE, it is only an explicit software way to do it. Notably, global monitor operations on memory accesses of regions marked by LDAXR and STLXR instructions can also wake up a WFE sleeping core. This is done to allow spinlocks opens to automatically wake up WFE sleeping cores at free time without the need for a explicit SEV.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Quotes for the above <a href="#armarm8-db">ARMv8 architecture reference manual db</a> G1.18.1 "Wait For Event and Send Event":</p>
|
||||
</div>
|
||||
<div class="quoteblock">
|
||||
@@ -31185,7 +31261,38 @@ IN: main
|
||||
<p>For how userland spinlocks and mutexes are implemented see <a href="#userland-mutex-implementation">Userland mutex implementation</a>.</p>
|
||||
</div>
|
||||
<div class="sect5">
|
||||
<h6 id="wfe-from-userland"><a class="anchor" href="#wfe-from-userland"></a><a class="link" href="#wfe-from-userland">27.8.3.1.1. WFE from userland</a></h6>
|
||||
<h6 id="arm-wfe-global-monitor-events"><a class="anchor" href="#arm-wfe-global-monitor-events"></a><a class="link" href="#arm-wfe-global-monitor-events">27.8.3.1.1. ARM WFE global monitor events</a></h6>
|
||||
<div class="paragraph">
|
||||
<p>Examples:</p>
|
||||
</div>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/inline_asm/wfe_ldxr_stxr.cpp">userland/arch/aarch64/inline_asm/wfe_ldxr_stxr.cpp</a></p>
|
||||
</li>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/inline_asm/wfe_ldxr_str.cpp">userland/arch/aarch64/inline_asm/wfe_ldxr_str.cpp</a></p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>SEV is not the only thing that can wake up a WFE, it is only an explicit software way to do it.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Notably, global monitor operations on memory accesses of regions marked by <a href="#arm-ldxr-and-stxr-instructions">LDAXR and STLXR instructions</a> can also wake up a WFE sleeping core.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>This allows a spinlock release to automatically wake up cores sleeping in WFE at unlock time, without the need for an explicit SEV.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>This is shown in the <code>wfe_ldxr_stxr.cpp</code> example, which can only terminate in gem5 user mode simulation because of this event.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Note that the program still terminates when running on top of the Linux kernel, as explained at: <a href="#wfe-from-userland">WFE from userland</a>.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect5">
|
||||
<h6 id="wfe-from-userland"><a class="anchor" href="#wfe-from-userland"></a><a class="link" href="#wfe-from-userland">27.8.3.1.2. WFE from userland</a></h6>
|
||||
<div class="paragraph">
|
||||
<p>WFE and SEV are usable from userland, and are part of an efficient spinlock implementation. Userland should arguably stay away from spinlocks and rather use the <a href="#futex-system-call">futex system call</a>, which allows for non-busy sleep, or just stick to mutexes.</p>
|
||||
</div>
|
||||
@@ -31272,14 +31379,46 @@ IN: main
|
||||
<li>
|
||||
<p>after a few interrupt handler instructions, the first <a href="#arm-svc-instruction">ERET</a> instruction exits the handler and comes back directly to the instruction after the WFE at PC 0x400080 == 0x40007c + 4</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>the execution of the interrupt handler woke up the core that was in WFE, and it now continues normal execution past the WFE</p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Therefore, a WFE in userland is treated much like a busy loop by the Linux kernel: the kernel does not seem to explicitly try to make room for other processes as would happen on a futex.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>The following test checks that SEV events don’t wake up futexes, running forever in case of success. In <a href="#gem5-syscall-emulation-multithreading">gem5 syscall emulation multithreading</a>, this is crucial to prevent deadlocks:</p>
|
||||
</div>
|
||||
<div class="ulist">
|
||||
<ul>
|
||||
<li>
|
||||
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/inline_asm/futex_sev.cpp">userland/arch/aarch64/inline_asm/futex_sev.cpp</a></p>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect5">
|
||||
<h6 id="gem5-arm-wfe"><a class="anchor" href="#gem5-arm-wfe"></a><a class="link" href="#gem5-arm-wfe">27.8.3.1.2. gem5 ARM WFE</a></h6>
|
||||
<h6 id="armv8-spinlock-pattern"><a class="anchor" href="#armv8-spinlock-pattern"></a><a class="link" href="#armv8-spinlock-pattern">27.8.3.1.3. ARMv8 spinlock pattern</a></h6>
|
||||
<div class="paragraph">
|
||||
<p><a href="http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka16277.html" class="bare">http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka16277.html</a></p>
|
||||
</div>
|
||||
<div class="literalblock">
|
||||
<div class="content">
|
||||
<pre> sev
|
||||
1: wfe
|
||||
2: ldaxr w1, [x0]
|
||||
cbnz w1, 1b
|
||||
stxr w1, w2, [x0]
|
||||
cbnz w1, 2b</pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>It is the <a href="#arm-ldxr-and-stxr-instructions">STXR</a> from the unlock on another core that automatically wakes up the spinlock afterwards: <a href="https://stackoverflow.com/questions/32276313/how-is-a-spin-lock-woken-up-in-linux-arm64" class="bare">https://stackoverflow.com/questions/32276313/how-is-a-spin-lock-woken-up-in-linux-arm64</a></p>
|
||||
</div>
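<div class="paragraph">
<p>As a rough portable analogue of the acquire loop above, the following C++ sketch uses <code>std::atomic</code>. It is not the repository’s code: the <code>SpinLock</code> class and <code>count_with_lock</code> helper are hypothetical names. The compare-and-swap plays the role of the exclusive load/store pair and the retry loop corresponds to the two CBNZ branches. Unlike the assembly, this version busy-polls and has no WFE equivalent, and how a given toolchain lowers it on AArch64 is not assumed here.</p>
</div>

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical portable analogue of the acquire loop; busy-polls instead of
// sleeping in WFE.
class SpinLock {
public:
    void lock() {
        int expected = 0;
        // Retry until we atomically change 0 (free) to 1 (held).
        while (!locked_.compare_exchange_weak(
                expected, 1,
                std::memory_order_acquire,
                std::memory_order_relaxed)) {
            // On failure, expected was overwritten with the observed value.
            expected = 0;
        }
    }

    void unlock() {
        // Release store: the counterpart of the unlock store on another core.
        locked_.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> locked_{0};
};

// Increment a shared counter from several threads, protected by the lock.
long count_with_lock(int nthreads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < nthreads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```

<div class="paragraph">
<p>If the lock is correct, <code>count_with_lock(4, 10000)</code> returns 40000, since no increment is lost.</p>
</div>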
|
||||
</div>
|
||||
<div class="sect5">
|
||||
<h6 id="gem5-arm-wfe"><a class="anchor" href="#gem5-arm-wfe"></a><a class="link" href="#gem5-arm-wfe">27.8.3.1.4. gem5 ARM WFE</a></h6>
|
||||
<div class="paragraph">
|
||||
<p>gem5 390a74f59934b85d91489f8a563450d8321b602d does not sleep on the first WFE on either syscall emulation or full system, because the code does:</p>
|
||||
</div>
|
||||
@@ -31321,7 +31460,7 @@ IN: main
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect5">
|
||||
<h6 id="arm-yield-instruction"><a class="anchor" href="#arm-yield-instruction"></a><a class="link" href="#arm-yield-instruction">27.8.3.1.3. ARM YIELD instruction</a></h6>
|
||||
<h6 id="arm-yield-instruction"><a class="anchor" href="#arm-yield-instruction"></a><a class="link" href="#arm-yield-instruction">27.8.3.1.5. ARM YIELD instruction</a></h6>
|
||||
<div class="paragraph">
|
||||
<p><a href="https://stackoverflow.com/questions/59311066/how-does-the-arm-yield-instruction-inform-other-threads-that-they-could-start-a" class="bare">https://stackoverflow.com/questions/59311066/how-does-the-arm-yield-instruction-inform-other-threads-that-they-could-start-a</a></p>
|
||||
</div>
|
||||
@@ -32338,9 +32477,9 @@ cd -
|
||||
</div>
|
||||
</div>
|
||||
<div class="sect2">
|
||||
<h3 id="continuous-integraion"><a class="anchor" href="#continuous-integraion"></a><a class="link" href="#continuous-integraion">29.1. Continuous integraion</a></h3>
|
||||
<h3 id="continuous-integration"><a class="anchor" href="#continuous-integration"></a><a class="link" href="#continuous-integration">29.1. Continuous integration</a></h3>
|
||||
<div class="paragraph">
|
||||
<p>We have exploreed a few Continuous integration solutions.</p>
|
||||
<p>We have explored a few continuous integration solutions.</p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>We haven’t set up any of them yet.</p>
|
||||
@@ -32354,7 +32493,7 @@ cd -
|
||||
<div class="sect3">
|
||||
<h4 id="circleci"><a class="anchor" href="#circleci"></a><a class="link" href="#circleci">29.1.2. CircleCI</a></h4>
|
||||
<div class="paragraph">
|
||||
<p>This setup sucessfully built gem5 on every commit: <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/.circleci/config.yml">.circleci/config.yml</a></p>
|
||||
<p>This setup successfully built gem5 on every commit: <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/.circleci/config.yml">.circleci/config.yml</a></p>
|
||||
</div>
|
||||
<div class="paragraph">
|
||||
<p>Enabling it is however blocked on: <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/issues/79" class="bare">https://github.com/cirosantilli/linux-kernel-module-cheat/issues/79</a> so we disabled the builds on the web UI.</p>
|
||||
@@ -32570,6 +32709,15 @@ instructions 124346081</pre>
|
||||
<tr>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">a18f28e263c91362519ef550150b5c9d75fa3679 + 1</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/gcc/busy_loop.c">userland/gcc/busy_loop.c</a> <code>-O0</code></p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --gem5-build-id debug</code></p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">10^5</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">32</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">2.528728 * 10^6</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">0.08</p></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">a18f28e263c91362519ef550150b5c9d75fa3679 + 1</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/gcc/busy_loop.c">userland/gcc/busy_loop.c</a> <code>-O0</code></p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 -- --cpu-type MinorCPU --caches</code></p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">10^6</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">31</p></td>
|
||||
@@ -32614,7 +32762,7 @@ instructions 124346081</pre>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/c/m5ops.c">userland/c/m5ops.c</a> <code>-O0</code></p></td>
|
||||
<td class="tableblock halign-left valign-top"><p class="tableblock">glibc C pre-main <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/c/m5ops.c">userland/c/m5ops.c</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --userland-args e</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td>
@@ -32623,13 +32771,49 @@ instructions 124346081</pre>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/cpp/m5ops.cpp">userland/cpp/m5ops.cpp</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">glibc C pre-main <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/c/m5ops.c">userland/c/m5ops.c</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --userland-args e --gem5-build-type debug</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1.26479 * 10^5</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0.05</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">glibc C++ pre-main <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/cpp/m5ops.cpp">userland/cpp/m5ops.cpp</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --userland-args e</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">2</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">2.385012 * 10^6</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">glibc C++ pre-main <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/cpp/m5ops.cpp">userland/cpp/m5ops.cpp</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --userland-args e --gem5-build-type debug</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">25</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">2.385012 * 10^6</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">0.1</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">immediate exit <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/freestanding/linux/gem5_exit.S">userland/arch/aarch64/freestanding/linux/gem5_exit.S</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ab6f7331406b22f8ab6e2df5f8b8e464fb35b611</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">immediate exit <a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/arch/aarch64/freestanding/linux/gem5_exit.S">userland/arch/aarch64/freestanding/linux/gem5_exit.S</a> <code>-O0</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>gem5 --arch aarch64 --gem5-build-type debug</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">1</p></td>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
@@ -32748,7 +32932,7 @@ instructions 124346081</pre>
<p>First we build <a href="#dhrystone">Dhrystone</a> manually and statically, since dynamic linking is broken in gem5, as explained at: <a href="#gem5-syscall-emulation-mode">Section 10.7, “gem5 syscall emulation mode”</a>.</p>
</div>
<div class="paragraph">
<p>TODO: move this section to our new custom dhrystone setup: <a href="#dhrystone">Section 21.8.1, “Dhrystone”</a>.</p>
<p>TODO: move this section to our new custom dhrystone setup: <a href="#dhrystone">Section 21.8.2, “Dhrystone”</a>.</p>
</div>
<div class="paragraph">
<p>gem5 user mode:</p>