Ciro Santilli 六四事件 法轮功
2020-05-15 01:00:00 +00:00
parent 454eea2cff
commit d5b5108218

@@ -1213,7 +1213,7 @@ body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-b
<ul class="sectlevel4">
<li><a href="#gem5-basesimplecpu">19.16.1.1. gem5 <code>BaseSimpleCPU</code></a></li>
<li><a href="#gem5-minorcpu">19.16.1.2. gem5 MinorCPU</a></li>
<li><a href="#gem5-deriveo3cpu">19.16.1.3. gem5 DeriveO3CPU</a></li>
<li><a href="#gem5-derivo3cpu">19.16.1.3. gem5 DerivO3CPU</a></li>
</ul>
</li>
<li><a href="#gem5-arm-rsk">19.16.2. gem5 ARM RSK</a></li>
@@ -2925,6 +2925,9 @@ j = 0</pre>
</li>
<li>
<p>the actual command, nicely indented and with arguments broken one per line, but with continuation backslashes so you can just copy paste it into a terminal</p>
<div class="paragraph">
<p>For setups that don&#8217;t support newlines, e.g. <a href="#gem5-eclipse-configuration">Eclipse debugging</a>, you can turn them off with <code>--print-cmd-oneline</code>.</p>
</div>
</li>
<li>
<p><code>;</code>: both a valid part of the Bash command, and a visual mark of the end of the command</p>
@@ -8287,7 +8290,7 @@ pid=100</pre>
</div>
</div>
<div class="paragraph">
<p>We choose <a href="#gem5-deriveo3cpu"><code>DerivO3CPU</code></a> because of the se.py assert:</p>
<p>We choose <a href="#gem5-derivo3cpu"><code>DerivO3CPU</code></a> because of the se.py assert:</p>
</div>
<div class="literalblock">
<div class="content">
@@ -21461,7 +21464,7 @@ class SystemXBar(CoherentXBar):</pre>
<p><code>BaseKvmCPU</code></p>
</li>
<li>
<p><code>BaseSimpleCPU</code></p>
<p><code>BaseSimpleCPU</code>: <a href="#gem5-basesimplecpu">gem5 <code>BaseSimpleCPU</code></a></p>
<div class="ulist">
<ul>
<li>
@@ -21474,7 +21477,7 @@ class SystemXBar(CoherentXBar):</pre>
</div>
</li>
<li>
<p><code>MinorO3CPU</code></p>
<p><code>MinorO3CPU</code>: <a href="#gem5-minorcpu">gem5 MinorCPU</a></p>
</li>
<li>
<p><code>BaseO3CPU</code></p>
@@ -21482,6 +21485,10 @@ class SystemXBar(CoherentXBar):</pre>
<ul>
<li>
<p><code>FullO3CPU</code></p>
<div class="ulist">
<ul>
<li>
<p><code>DerivO3CPU : public FullO3CPU&lt;O3CPUImpl&gt;</code>: <a href="#gem5-derivo3cpu">gem5 DerivO3CPU</a></p>
</li>
</ul>
</div>
@@ -21491,6 +21498,12 @@ class SystemXBar(CoherentXBar):</pre>
</li>
</ul>
</div>
</li>
</ul>
</div>
<div class="paragraph">
<p>From this we see that there are basically only 4 C++ CPU models in gem5: Atomic, Timing, Minor and O3. All others are parametrizations of those base types.</p>
</div>
<div class="sect3">
<h4 id="list-gem5-cpu-types"><a class="anchor" href="#list-gem5-cpu-types"></a><a class="link" href="#list-gem5-cpu-types">19.16.1. List gem5 CPU types</a></h4>
<div class="sect4">
@@ -21590,20 +21603,11 @@ class SystemXBar(CoherentXBar):</pre>
<p>Implemented by Pierre-Yves Péneau from LIRMM, which is a research lab in Montpellier, France, in 2017.</p>
</div>
</li>
<li>
<p><code>O3_ARM_v7a</code>: implemented by Ronald Dreslinski from the <a href="https://en.wikipedia.org/wiki/University_of_Michigan">University of Michigan</a> in 2012</p>
<div class="paragraph">
<p>Not sure why it has v7a in the name, since I believe the CPUs are just the microarchitectural implementation of any ISA, and the v8 hello world did run.</p>
</div>
<div class="paragraph">
<p>The CLI option is named slightly differently as: <code>--cpu-type O3_ARM_v7a_3</code>.</p>
</div>
</li>
</ul>
</div>
</div>
<div class="sect4">
<h5 id="gem5-deriveo3cpu"><a class="anchor" href="#gem5-deriveo3cpu"></a><a class="link" href="#gem5-deriveo3cpu">19.16.1.3. gem5 DeriveO3CPU</a></h5>
<h5 id="gem5-derivo3cpu"><a class="anchor" href="#gem5-derivo3cpu"></a><a class="link" href="#gem5-derivo3cpu">19.16.1.3. gem5 DerivO3CPU</a></h5>
<div class="paragraph">
<p>Generic out-of-order core. "O3" stands for "Out Of Order"!</p>
</div>
@@ -21625,6 +21629,15 @@ class SystemXBar(CoherentXBar):</pre>
</blockquote>
</div>
</li>
<li>
<p><code>O3_ARM_v7a</code>: implemented by Ronald Dreslinski from the <a href="https://en.wikipedia.org/wiki/University_of_Michigan">University of Michigan</a> in 2012</p>
<div class="paragraph">
<p>Not sure why it has v7a in the name, since I believe the CPUs are just the microarchitectural implementation of any ISA, and the v8 hello world did run.</p>
</div>
<div class="paragraph">
<p>The CLI option is named slightly differently as: <code>--cpu-type O3_ARM_v7a_3</code>.</p>
</div>
</li>
</ul>
</div>
</div>
@@ -21817,7 +21830,15 @@ cd ..
</ul>
</div>
<div class="paragraph">
<p>To run and GDB step debug the executable, just copy the full command line from the output <code>./run</code>, and configure it into Eclipse.</p>
<p>To run and GDB step debug the executable, just copy the <a href="#dry-run">full command line without newlines</a> from your run command (Eclipse does not like newlines for the arguments), e.g.:</p>
</div>
<div class="literalblock">
<div class="content">
<pre>./run --emulator gem5 --print-cmd-oneline</pre>
</div>
</div>
<div class="paragraph">
<p>and configure it into Eclipse as usual.</p>
</div>
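<div class="paragraph">
<p>The difference between the default multi-line output and the one-line form can be illustrated with a minimal Python sketch. This is not the actual LKMC implementation: the helper name <code>format_cmd</code> and its <code>oneline</code> parameter are hypothetical and only model the behaviour described for <code>--print-cmd-oneline</code>.</p>
</div>

```python
import shlex


def format_cmd(cmd, oneline=False):
    """Render an argument list as a copy-pasteable shell command.

    Hypothetical helper for illustration only, not LKMC code.
    """
    quoted = [shlex.quote(arg) for arg in cmd]
    if oneline:
        # Single line, for tools such as Eclipse that reject newlines.
        return " ".join(quoted)
    # One argument per line, each followed by a continuation backslash,
    # then a final `;` that visually marks the end of the command.
    return " \\\n  ".join(quoted) + " \\\n;"


cmd = ["./run", "--emulator", "gem5"]
print(format_cmd(cmd))
print(format_cmd(cmd, oneline=True))
```

<div class="paragraph">
<p>The multi-line form stays valid Bash because every newline is escaped, while the one-line form is what you would paste into the Eclipse arguments field.</p>
</div>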
</div>
<div class="sect3">
@@ -28295,7 +28316,17 @@ There are no non-locking atomic types or atomic primitives in POSIX: <a href="ht
<p><a href="https://stackoverflow.com/questions/145270/calling-c-c-from-python/60374990#60374990" class="bare">https://stackoverflow.com/questions/145270/calling-c-c-from-python/60374990#60374990</a></p>
</div>
<div class="paragraph">
<p>pybind11 is amazingly easy to use. But it also makes your builds really slow: <a href="#pybind11-accounts-for-50-of-gem5-build-time">pybind11 accounts for 50% of gem5 build time</a>.</p>
<p>pybind11 is amazingly easy to use. But it can also make your builds really slow:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="#pybind11-accounts-for-50-of-gem5-build-time">pybind11 accounts for 50% of gem5 build time</a>. As mentioned there, if pybind11 moved everything that can go into a cpp file out of the hpp (i.e. everything except templates), that alone could significantly reduce build times in certain cases. This is discussed upstream at: <a href="https://github.com/pybind/pybind11/issues/708" class="bare">https://github.com/pybind/pybind11/issues/708</a></p>
</li>
<li>
<p><a href="https://discuss.pytorch.org/t/how-are-python-bindings-created/46453/2" class="bare">https://discuss.pytorch.org/t/how-are-python-bindings-created/46453/2</a></p>
</li>
</ul>
</div>
</div>
</div>
@@ -30559,6 +30590,9 @@ child after parent sleep</pre>
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/linux/sched_getcpu.c">userland/linux/sched_getcpu.c</a></p>
</li>
<li>
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/linux/getcpu.c">userland/linux/getcpu.c</a>: a wrapper close to the syscall that also returns the current NUMA node</p>
</li>
<li>
<p><a href="https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/userland/linux/sched_getcpu_barrier.c">userland/linux/sched_getcpu_barrier.c</a>: this uses a barrier to ensure that gem5 will run each thread on one separate CPU</p>
</li>
</ul>
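<div class="paragraph">
<p>For a quick check without compiling the C examples, the same libc function can be called from Python through <code>ctypes</code>. This is only a sketch and assumes Linux with a libc that exports <code>sched_getcpu</code> (glibc does); the wrapper name <code>current_cpu</code> is made up for this example, and unlike <code>getcpu.c</code> it does not report the NUMA node.</p>
</div>

```python
import ctypes
import os

# Load the symbols already linked into the running process (libc on Linux).
libc = ctypes.CDLL(None, use_errno=True)


def current_cpu():
    """Return the index of the CPU that the calling thread is running on."""
    cpu = libc.sched_getcpu()
    if cpu < 0:
        # sched_getcpu returns -1 and sets errno on failure.
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return cpu


print(current_cpu())
```

<div class="paragraph">
<p>The value can change between calls, since the scheduler is free to migrate the thread unless its affinity is pinned.</p>
</div>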
@@ -37928,6 +37962,9 @@ tail -n+1 ../linux-kernel-module-cheat-regression/*/gem5-bench-build-*.txt</pre>
<p>Ubuntu 19.10, GCC 9.2.1, LKMC 7c6bb29bc89ec3f1056c0680c3f08bd64018a7bc, gem5 d7d9bc240615625141cd6feddbadd392457e49eb (2020-02-18), <code>./build --arch aarch64 --gem5-worktree master --no-cache</code>: 19m 33s. TODO: investigate why it got so much worse.</p>
</div>
<div class="paragraph">
<p>Ubuntu 20.04, GCC 9.3.0, LKMC 6275f70ed8862d8fe4e58ca4524a6994d254be35, gem5 d9cb548d83fa81858599807f54b52e5be35a6b03 (2020-05-06), <code>./build --arch aarch64 --gem5-worktree master --no-cache</code>: 28m!!! It&#8217;s out of control.</p>
</div>
<div class="paragraph">
<p>Same but gem5 d7d9bc240615625141cd6feddbadd392457e49eb (2018-06-17) hacked with <code>-Wnoerror</code>: 11m 37s. So there was a huge regression in the last two years! We have to find out what caused it.</p>
</div>
<div class="paragraph">