From e11bc6eb0e5c8926e5612621a3581219439a96b1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?= =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?=
Date: Wed, 29 Jul 2020 01:00:00 +0000
Subject: [PATCH] a0d6fa15a207cb40cd8ce090c77aa9b55d7605a6

---
 index.html | 478 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 444 insertions(+), 34 deletions(-)

diff --git a/index.html b/index.html
index d8388ed..3ac4aeb 100644
--- a/index.html
+++ b/index.html
@@ -2057,7 +2057,21 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
  • 29.3. Benchmark machines
  • 29.4. Benchmark Internets
    @@ -18343,7 +18357,7 @@ less "$(./getvar gem5_source_dir)/src/cpu/exetrace.cc"

    We can make the trace smaller by naming the trace file trace.txt.gz, which enables GZIP compression, but that is not currently exposed in our scripts, since you usually just need something human readable to work on.

    -

    Enabling tracing made the runtime about 4x slower on the P51, with or without .gz compression.

    +

    Enabling tracing made the runtime about 4x slower on the 2017 Lenovo ThinkPad P51, with or without .gz compression.

    Trace the source lines just like for QEMU with:

    @@ -21193,7 +21207,7 @@ system.cpu.dtb.inst_hits

    and there, indeed, we see that the file size fell from 39MB for stats.txt to 3.2MB for stats.m5, so the increase observed previously was just due to some initial size overhead (considering the patched gem5 with no spaces in the text file).

    -

    We also note however that the stat dump made the such a simulation that just loops and dumps considerably slower, from 3s to 15s on P51. Fascinating, we are definitely not disk bound there.

    +

    We also note however that the stat dump made such a simulation that just loops and dumps considerably slower, from 3s to 15s on the 2017 Lenovo ThinkPad P51. Fascinating: we are definitely not disk bound there.

    We enable HDF5 on the build by default with USE_HDF5=1. To disable it, you can add USE_HDF5=0 to the build as in:

    @@ -21639,7 +21653,7 @@ xdg-open "$(./getvar --arch arm --emulator gem5 m5out_dir)/config.dot.svg"
    -

    Sample run time: 87 minutes on P51 Ubuntu 20.04 gem5 872cb227fdc0b4d60acc7840889d567a6936b6e1.

    +

    Sample run time: 87 minutes on 2017 Lenovo ThinkPad P51 Ubuntu 20.04 gem5 872cb227fdc0b4d60acc7840889d567a6936b6e1.

    After the first run has downloaded the test binaries for you, you can speed up the process a little bit by skipping a useless SCons call:

    @@ -22615,7 +22629,7 @@ less o3pipeview.tmp.log
    mkdir aarch-system-201901106
     cd aarch-system-201901106
    -wget http://www.gem5.org/dist/current/arm/aarch-system-201901106.tar.bz2
    +wget http://dist.gem5.org/dist/current/arm/aarch-system-201901106.tar.bz2
     tar xvf aarch-system-201901106.tar.bz2
     cd ..
     ./run --arch aarch64 --emulator gem5 --linux-exec aarch-system-201901106/binaries/vmlinux.arm64
    @@ -28479,7 +28493,7 @@ build/ARM/config/the_isa.hh
    git submodule update --init submodules/gensim-simulator
     sudo apt install libantlr3c-dev
    -cd submodule/gensim
    +cd submodules/gensim-simulator
     make
    @@ -28525,12 +28539,12 @@ Aborted (core dumped)
    -
    cd /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/models/armv8 && \
    -  /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/build/dist/bin/gensim \
    -  -a /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/models/armv8/aarch64.ac \
    +
    cd /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/models/armv8 && \
    +  /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/build/dist/bin/gensim \
    +  -a /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/models/armv8/aarch64.ac \
       -s module,arch,decode,disasm,ee_interp,ee_blockjit,jumpinfo,function,makefile \
    -  -o decode.GenerateDotGraph=1,makefile.libtrace_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/support/libtrace/inc,makefile.archsim_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/archsim/inc,makefile.llvm_path=,makefile.Optimise=2,makefile.Debug=1 \
    -  -t /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim/build/models/armv8/output-aarch64/
    +  -o decode.GenerateDotGraph=1,makefile.libtrace_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/support/libtrace/inc,makefile.archsim_path=/home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/archsim/inc,makefile.llvm_path=,makefile.Optimise=2,makefile.Debug=1 \
    +  -t /home/ciro/bak/git/linux-kernel-module-cheat/submodules/gensim-simulator/build/models/armv8/output-aarch64/
    @@ -28548,7 +28562,7 @@ gensim/models/armv8/isa.ac

    and where gensim/models/armv8/isa.ac contains __builtin_abs64 usages.

    -

    GDB on gensim shows that the error comes from a call to gci.GenerateExecuteBodyFor(body_str, *action);, so it looks like there are some missing cases in EmitFixedCode.

    +

    Rebuilding with -DCMAKE_BUILD_TYPE=DEBUG + GDB on gensim shows that the error comes from a call to gci.GenerateExecuteBodyFor(body_str, *action);, so it looks like there are some missing cases in gensim/src/generators/GenCInterpreter/InterpreterNodeWalker.cpp function SSAIntrinsicStatementWalker::EmitFixedCode, e.g. there should be one for __builtin_abs64.

    This is completely broken academic code! They must be using an off-tree version of part of the tool and forgot to commit it.

    @@ -29867,7 +29881,7 @@ PERL5LIB="${PERL5LIB}:." make -j `nproc` ctest
    -

    This both builds and runs, took about 5 minutes on P51, but had build failues for some reason:

    +

    This both builds and runs; it took about 5 minutes on the 2017 Lenovo ThinkPad P51, but had build failures for some reason:

    @@ -30192,7 +30206,7 @@ mkdir -p bin/c

    All examples do exactly the same thing: spawn N threads and loop M times in each thread, incrementing a global integer.
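    For concreteness, the unsynchronized variant boils down to something like the following (a minimal pthread sketch with hardcoded N = 2 and M = 10000; the actual LKMC examples differ in detail, and the synchronized variants protect the increment, e.g. with a mutex or an atomic):

    #include <pthread.h>
    #include <stdio.h>

    enum { NTHREADS = 2, NLOOPS = 10000 };

    /* Shared counter, deliberately not synchronized: the final value
     * is very likely to be smaller than NTHREADS * NLOOPS. */
    static unsigned long global = 0;

    static void *work(void *arg) {
        (void)arg;
        for (unsigned long i = 0; i < NLOOPS; ++i)
            global++; /* non-atomic read-modify-write: data race */
        return NULL;
    }

    int main(void) {
        pthread_t threads[NTHREADS];
        for (int i = 0; i < NTHREADS; ++i)
            pthread_create(&threads[i], NULL, work, NULL);
        for (int i = 0; i < NTHREADS; ++i)
            pthread_join(threads[i], NULL);
        /* 20000 if there were no race; often less in practice. */
        printf("%lu\n", global);
        return 0;
    }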

    -

    For inputs large enough, the non-synchronized examples are extremely likely to produce "wrong" results, for example on P51 Ubuntu 19.10 native with 2 threads and 10000 loops:

    +

    For inputs large enough, the non-synchronized examples are extremely likely to produce "wrong" results, for example on 2017 Lenovo ThinkPad P51 Ubuntu 19.10 native with 2 threads and 10000 loops:

    @@ -31519,7 +31533,7 @@ xdg-open bst_vs_heap_vs_hashmap_gem5.tmp.png

    TODO: the gem5 simulation blows up on a tcmalloc allocation somewhere near 25k elements as of 3fdd83c2c58327d9714fa2347c724b78d7c05e2b + 1, likely linked to the extreme inefficiency of the stats collection?

    -

    The cache sizes were chosen to match the host P51 to improve the comparison. Ideally we should also use the same standard library.

    +

    The cache sizes were chosen to match the host 2017 Lenovo ThinkPad P51 to improve the comparison. Ideally we should also use the same standard library.

    Note that this will take a long time, and will produce a humongous ~40GB stats file as explained at: Section 19.9.3.2, “gem5 only dump selected stats”

    @@ -31941,10 +31955,10 @@ make TARGET=linux64 XCMD='-c4' certify-all
    -

    This uses 4 contexts. TODO what are contexts? Is the same as threads?

    +

    This uses 4 contexts. TODO: what are contexts? Are they the same as threads? You likely want to use -c$(nproc) in practice instead.

    -

    Finishes in a few seconds, P51 results:

    +

    Finishes in a few seconds, 2017 Lenovo ThinkPad P51 results:

    @@ -31968,6 +31982,9 @@ CoreMark-PRO 18743.79 6306.76 2.97
    +

    More sample results: P51 CoreMark-Pro.

    +
    +

    And the scaling appears to be the ratio between the multicore (4 contexts due to -c4) and the single core performance, e.g. 18743.79 / 6306.76 ≈ 2.97 above; each benchmark gets run twice, once multicore and once single core.

    @@ -32216,7 +32233,7 @@ RUN_FLAGS =
    -

    Output for P51 Ubuntu 20.04:

    +

    Sample output for 2017 Lenovo ThinkPad P51 Ubuntu 20.04:

    @@ -32428,12 +32445,62 @@ times[3 * ntimes + k] = mysecond() - times[3 * ntimes + k];

    See also: https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
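    For intuition on what is being measured, each STREAM kernel is just a simple loop over large double arrays; the Triad kernel, for example, is essentially the following (a minimal sketch of the kernel only, without the benchmark's timing, OpenMP pragmas or validation code):

    #include <stdlib.h>

    #define STREAM_ARRAY_SIZE 10000000

    int main(void) {
        /* The arrays are sized to be much larger than the caches, so the
         * loop below is limited by memory bandwidth, not by compute. */
        double *a = malloc(sizeof(double) * STREAM_ARRAY_SIZE);
        double *b = malloc(sizeof(double) * STREAM_ARRAY_SIZE);
        double *c = malloc(sizeof(double) * STREAM_ARRAY_SIZE);
        double scalar = 3.0;
        for (size_t j = 0; j < STREAM_ARRAY_SIZE; ++j) {
            b[j] = 2.0;
            c[j] = 1.0;
        }
        /* Triad kernel: a[j] = b[j] + scalar * c[j].
         * Copy, Scale and Add are analogous one-line loops. */
        for (size_t j = 0; j < STREAM_ARRAY_SIZE; ++j)
            a[j] = b[j] + scalar * c[j];
        free(a); free(b); free(c);
        return 0;
    }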

    -

    The LKMC usage of STREAM is analogous to that of Dhrystone. Build and run on QEMU User mode simulation:

    +

    Ubuntu 20.04 native build and run:

    git submodule update --init submodules/stream-benchmark
    -./build-stream --optimization-level 3
    +cd submodules/stream-benchmark
    +make
    +./stream_c.exe
    +
    +
    +
    +

    Sample output:

    +
    +
    +
    +
    -------------------------------------------------------------
    +STREAM version $Revision: 5.10 $
    +-------------------------------------------------------------
    +This system uses 8 bytes per array element.
    +-------------------------------------------------------------
    +Array size = 10000000 (elements), Offset = 0 (elements)
    +Memory per array = 76.3 MiB (= 0.1 GiB).
    +Total memory required = 228.9 MiB (= 0.2 GiB).
    +Each kernel will be executed 10 times.
    + The *best* time for each kernel (excluding the first iteration)
    + will be used to compute the reported bandwidth.
    +-------------------------------------------------------------
    +Number of Threads requested = 8
    +Number of Threads counted = 8
    +-------------------------------------------------------------
    +Your clock granularity/precision appears to be 1 microseconds.
    +Each test below will take on the order of 7027 microseconds.
    +   (= 7027 clock ticks)
    +Increase the size of the arrays if this shows that
    +you are not getting at least 20 clock ticks per test.
    +-------------------------------------------------------------
    +WARNING -- The above is only a rough guideline.
    +For best results, please be sure you know the
    +precision of your system timer.
    +-------------------------------------------------------------
    +Function    Best Rate MB/s  Avg time     Min time     Max time
    +Copy:           20123.2     0.008055     0.007951     0.008267
    +Scale:          20130.4     0.008032     0.007948     0.008177
    +Add:            22528.8     0.010728     0.010653     0.010867
    +Triad:          22448.4     0.010826     0.010691     0.011352
    +-------------------------------------------------------------
    +Solution Validates: avg error less than 1.000000e-13 on all three arrays
    +-------------------------------------------------------------
    +
    +
    +
    +

    The LKMC usage of STREAM is analogous to that of Dhrystone. Build and run on QEMU User mode simulation:

    +
    +
    +
    +
    ./build-stream --optimization-level 3
     ./run --userland "$(./getvar userland_build_dir)/submodules/stream-benchmark/stream_c.exe"
    @@ -35136,7 +35203,7 @@ pop %rbp

    The exact data to show depends on the value of EAX, and in a few cases also on ECX. When it depends on ECX, it is called a sub-leaf. Our test program prints the output for eax == 0.
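    To reproduce such a query without the LKMC assembly example, GCC and Clang also expose CPUID through the cpuid.h helper (a minimal C sketch, not the actual test program):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void) {
        unsigned int eax, ebx, ecx, edx;
        /* Leaf 0: EAX gets the maximum supported leaf, and
         * EBX:EDX:ECX spell the vendor string, e.g. "GenuineIntel". */
        if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
            return 1;
        printf("eax 0x%08x\nebx 0x%08x\necx 0x%08x\nedx 0x%08x\n",
               eax, ebx, ecx, edx);
        return 0;
    }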

    -

    On P51 for example the output EAX, EBX, ECX and EDX are:

    +

    On 2017 Lenovo ThinkPad P51 for example the output EAX, EBX, ECX and EDX are:

    @@ -35437,7 +35504,7 @@ pop %rbp
    • -

      userland/arch/x86_64/vfmadd132pd.S: VFMADD132PD: "Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, add to xmm2 and put result in xmm1." TODO: but I don’t understand the manual, experimentally on P51 Ubuntu 19.04 host the result is stored in XMM2!

      +

      userland/arch/x86_64/vfmadd132pd.S: VFMADD132PD: "Multiply packed double-precision floating-point values from xmm1 and xmm3/mem, add to xmm2 and put result in xmm1." TODO: but I don’t understand the manual, experimentally on 2017 Lenovo ThinkPad P51 Ubuntu 19.04 host the result is stored in XMM2!

    @@ -35565,7 +35632,7 @@ taskset -c 1 ./userland/arch/x86_64/rdtscp.out | tail -n 1
    -

    There is also the RDPID instruction that reads just the processor ID, but it appears to be very new for QEMU 4.0.0 or P51, as it fails with SIGILL on both.

    +

    There is also the RDPID instruction, which reads just the processor ID, but it appears to be too new for QEMU 4.0.0 or the 2017 Lenovo ThinkPad P51, as it fails with SIGILL on both.

    Bibliography: https://stackoverflow.com/questions/22310028/is-there-an-x86-instruction-to-tell-which-core-the-instruction-is-being-run-on/56622112#56622112
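    As discussed in that link, until RDPID is widely available the processor ID can also be extracted from the IA32_TSC_AUX value returned by RDTSCP, since Linux stores the CPU (and NUMA node) number in that MSR; a minimal sketch using the compiler intrinsic (not the actual LKMC rdtscp example, and the low-12-bits CPU / upper-bits node split is an assumption about the Linux encoding):

    #include <stdio.h>
    #include <x86intrin.h>

    int main(void) {
        unsigned int aux;
        /* RDTSCP returns the TSC and stores IA32_TSC_AUX into aux. */
        unsigned long long tsc = __rdtscp(&aux);
        printf("tsc %llu cpu %u node %u\n",
               tsc, aux & 0xfff, aux >> 12);
        return 0;
    }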

    @@ -40597,7 +40664,7 @@ import /init.${ro.zygote}.rc

    So currently, we are running benchmarks manually when it seems reasonable and uploading them to: https://github.com/cirosantilli/linux-kernel-module-cheat-regression

    -

    All benchmarks were run on the P51 machine, unless stated otherwise.

    +

    All benchmarks were run on the 2017 Lenovo ThinkPad P51 machine, unless stated otherwise.

    Run all benchmarks and upload the results:

    @@ -40742,7 +40809,7 @@ instructions 124346081

    TODO: aarch64 gem5 and QEMU use the same kernel, so why is the gem5 instruction count so much higher?

    -

    P51 Ubuntu 19.10 LKMC b11e3cd9fb5df0e3fe61de28e8264bbc95ea9005 gem5 e779c19dbb51ad2f7699bd58a5c7827708e12b55 aarch64: 143s. Why huge increases from 70s on above table? Kernel size is also huge BTW: 147MB.

    +

    2017 Lenovo ThinkPad P51 Ubuntu 19.10 LKMC b11e3cd9fb5df0e3fe61de28e8264bbc95ea9005 gem5 e779c19dbb51ad2f7699bd58a5c7827708e12b55 aarch64: 143s. Why the huge increase from the 70s in the table above? The kernel size is also huge BTW: 147MB.

    Note that https://gem5.atlassian.net/browse/GEM5-337 "ARM PAuth patch slows down Linux boot 2x from 2 minutes to 4 minutes" was already semi-fixed at that point.

    @@ -40820,7 +40887,7 @@ instructions 124346081

    For example, the simplest scalable CPU content would be a C busy loop, so let’s start by analyzing that one.
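    Such a busy loop can be as simple as the following (a minimal sketch, not the exact LKMC source; the empty asm statement is a GCC-style trick to keep the loop from being optimized away, and the two nested counts correspond to the "1000 times 10000" used further down):

    #include <stdlib.h>

    int main(int argc, char **argv) {
        /* Total iterations = outer * inner, e.g. 1000 * 10000 = 10^7. */
        unsigned long long outer = argc > 1 ? strtoull(argv[1], NULL, 0) : 1000;
        unsigned long long inner = argc > 2 ? strtoull(argv[2], NULL, 0) : 10000;
        for (unsigned long long i = 0; i < outer; ++i)
            for (unsigned long long j = 0; j < inner; ++j)
                /* Prevents the compiler from removing the empty loop. */
                __asm__ volatile ("" : : : "memory");
        return 0;
    }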

    -

    Summary of manually collected results on P51 at LKMC a18f28e263c91362519ef550150b5c9d75fa3679 + 1: Table 7, “Busy loop MIPS for different simulator setups”. As expected, the less native/more detailed/more complex simulations are slower!

    +

    Summary of manually collected results on 2017 Lenovo ThinkPad P51 at LKMC a18f28e263c91362519ef550150b5c9d75fa3679 + 1: Table 7, “Busy loop MIPS for different simulator setups”. As expected, the less native/more detailed/more complex simulations are slower!

    @@ -40860,7 +40927,7 @@ instructions 124346081
    -
    +
    @@ -41097,7 +41164,7 @@ instructions 124346081

    The first step is to determine a number of loops that will run long enough to have meaningful results, but not so long that we get bored, so about 1 minute.

    -

    On our P51 machine, we found 10^7 (10 million == 1000 times 10000) loops to be a good number for a gem5 atomic simulation:

    +

    On our 2017 Lenovo ThinkPad P51 machine, we found 10^7 (10 million == 1000 times 10000) loops to be a good number for a gem5 atomic simulation:

    @@ -41197,7 +41264,7 @@ time \
    -

    Result on P51 at bad30f513c46c1b0995d3a10c0d9bc2a33dc4fa0:

    +

    Result on 2017 Lenovo ThinkPad P51 at bad30f513c46c1b0995d3a10c0d9bc2a33dc4fa0:

      @@ -41337,7 +41404,7 @@ xdg-open graph-size.pdf

      We will update this whenever the gem5 submodule is updated.

    -

    All benchmarks done on P51.

    +

    All benchmarks done on 2017 Lenovo ThinkPad P51.

    Get results with:

    @@ -41397,7 +41464,7 @@ tail -n+1 ../linux-kernel-module-cheat-regression/*/gem5-bench-build-*.txt

    and then copy the link command to a separate Bash file. Then you can time and modify it easily.

    -

    Some approximate reference values on P51:

    +

    Some approximate reference values on 2017 Lenovo ThinkPad P51:

    Table 7. Busy loop MIPS for different simulator setups

    27

    P51

    2017 Lenovo ThinkPad P51

    Ubuntu 20.04