mirror of https://github.com/cirosantilli/linux-kernel-module-cheat.git
synced 2026-01-25 03:01:36 +01:00

Commit: start moving algorithm in
Changed file: README.adoc

@@ -1007,7 +1007,7 @@ Notable userland content included / moving into this repository includes:

* <<c>>
* <<cpp>>
* <<posix>>
* https://github.com/cirosantilli/algorithm-cheat TODO will be good to move here for performance analysis <<gem5-run-benchmark,with gem5>>
* <<algorithms>>

==== Userland setup getting started

@@ -10645,6 +10645,11 @@ Now you can play a fun little game with your friends:

* make a program that solves the computation problem, and writes its output to stdout
* write the code that runs the correct computation in the smallest number of cycles possible

Interesting algorithms and benchmarks for this game are being collected at:

* <<algorithms>>
* <<benchmarks>>

To find out why your program is slow, a good first step is to have a look at the <<gem5-m5out-stats-txt-file>>.
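
To make that first step concrete, here is a small sketch of pulling selected counters out of a gem5 `m5out/stats.txt` dump. The `name value # description` line format is gem5's usual one, but exact stat names (e.g. `system.cpu.numCycles`) vary per CPU model, so treat the ones below as assumptions:

```python
# Sketch: extract selected counters from a gem5 stats.txt dump.
# gem5 stats lines typically look like:
#   system.cpu.numCycles   12345   # number of cpu cycles simulated
# Stat names vary by CPU model; the ones used here are assumptions.
import re

STAT_RE = re.compile(r'^(\S+)\s+([0-9][0-9.eE+-]*)')

def parse_stats(path, wanted):
    """Return {stat_name: value} for the first dump that defines each stat."""
    stats = {}
    with open(path) as f:
        for line in f:
            m = STAT_RE.match(line)
            if m and m.group(1) in wanted and m.group(1) not in stats:
                stats[m.group(1)] = float(m.group(2))
    return stats

# Example usage:
# parse_stats('m5out/stats.txt', {'system.cpu.numCycles', 'simInsts'})
```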

==== Skip extra benchmark instructions

@@ -11028,386 +11033,6 @@ TODO: why doesn't this exist:

ls /sys/devices/system/cpu/cpu0/cpufreq
....

==== Interesting benchmarks

Buildroot built-in libraries, mostly under Libraries > Other:

* Armadillo `C++`: linear algebra
* fftw: Fourier transform
* Flann
* GSL: various
* liblinear
* libspatialindex
* libtommath
* qhull

Open source but not in Buildroot:

* https://github.com/kozyraki/stamp transactional memory benchmarks

These are not yet enabled, but it should be easy to do so, see: xref:add-new-buildroot-packages[xrefstyle=full]

===== Dhrystone

https://en.wikipedia.org/wiki/Dhrystone

Created in the 1980s, it is no longer a representative measure of performance on modern computers. It has mostly been replaced by https://en.wikipedia.org/wiki/SPECint[SPEC], which is... closed source! Unbelievable.

<<buildroot>> has a `dhrystone` package, but because it is so interesting to us, we decided to also build it ourselves, which more easily allows things like static and baremetal compilation.

Build and run on QEMU <<user-mode-simulation>>:

....
git submodule update --init submodules/dhrystone
./build-dhrystone --mode userland
./run --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone"
....

Build and run on gem5 user mode:

....
./build-dhrystone --mode userland --static --force-rebuild
./run --emulator gem5 --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone"
....

TODO: automate the run more nicely.

Build for <<baremetal>> execution and run it in baremetal QEMU:

....
# Build our Newlib stubs.
./build-baremetal --arch aarch64
./build-dhrystone --arch aarch64 --mode baremetal
./run --arch aarch64 --baremetal "$(./getvar baremetal_build_dir)/submodules/dhrystone/dhrystone"
....

TODO: fix the build; we just need to factor out all run arguments from link:build-baremetal[] into link:common.py[], and then it should just work with no missing syscalls.

If you really want the Buildroot package for some reason, build it with:

....
./build-buildroot --config 'BR2_PACKAGE_DHRYSTONE=y'
....

and run inside the guest from `PATH` with:

....
dhrystone
....

===== BST vs heap vs hashmap

TODO: move benchmark graph from link:userland/cpp/bst_vs_heap_vs_hashmap.cpp[] to link:userland/algorithm/set[].

The following benchmark setup works both:

* on the host, through timers + a https://stackoverflow.com/questions/51952471/why-do-i-get-a-constant-instead-of-logarithmic-curve-for-an-insert-time-benchmar/51953081#51953081[granule]
* on gem5, with <<m5ops-instructions,dumpstats>>, which can get more precise results with `granule == 1`

It has been used to answer:

* BST vs heap: https://stackoverflow.com/questions/6147243/heap-vs-binary-search-tree-bst/29548834#29548834
* `std::set`: https://stackoverflow.com/questions/2558153/what-is-the-underlying-data-structure-of-a-stl-set-in-c/51944661#51944661
* `std::map`: https://stackoverflow.com/questions/18414579/what-data-structure-is-inside-stdmap-in-c/51945119#51945119

To benchmark on the host, we do:

....
./build-userland-in-tree \
  --force-rebuild \
  --optimization-level 3 \
  ./userland/cpp/bst_vs_heap_vs_hashmap.cpp \
;
./userland/cpp/bst_vs_heap_vs_hashmap.out 10000000 10000 0 | tee bst_vs_heap_vs_hashmap.dat
gnuplot \
  -e 'input_noext="bst_vs_heap_vs_hashmap"' \
  -e 'heap_zoom_max=50' \
  -e 'hashmap_zoom_max=400' \
  ./bst-vs-heap-vs-hashmap.gnuplot \
;
xdg-open bst_vs_heap_vs_hashmap.tmp.png
....

The parameters `heap_zoom_max` and `hashmap_zoom_max` are chosen manually and interactively to best showcase the regions of interest in those plots.

To benchmark on gem5, we first build the benchmark with <<m5ops-instructions>> enabled, and then we run it and extract the stats:

....
./build-userland \
  --arch x86_64 \
  --ccflags='-DLKMC_M5OPS_ENABLE=1' \
  --force-rebuild userland/cpp/bst_vs_heap_vs_hashmap.cpp \
  --static \
  --optimization-level 3 \
;
./run \
  --arch x86_64 \
  --emulator gem5 \
  --static \
  --userland userland/cpp/bst_vs_heap_vs_hashmap.cpp \
  --userland-args='100000 1 0' \
  -- \
  --cpu-type=DerivO3CPU \
  --caches \
  --l2cache \
  --l1d_size=32kB \
  --l1i_size=32kB \
  --l2_size=256kB \
  --l3_size=20MB \
;
./bst-vs-heap-vs-hashmap-gem5-stats --arch x86_64 | tee bst_vs_heap_vs_hashmap_gem5.dat
gnuplot \
  -e 'input_noext="bst_vs_heap_vs_hashmap_gem5"' \
  -e 'heap_zoom_max=500' \
  -e 'hashmap_zoom_max=400' \
  ./bst-vs-heap-vs-hashmap.gnuplot \
;
xdg-open bst_vs_heap_vs_hashmap_gem5.tmp.png
....

TODO: the gem5 simulation blows up on a tcmalloc allocation somewhere near 25k elements as of 3fdd83c2c58327d9714fa2347c724b78d7c05e2b + 1, likely linked to the extreme inefficiency of the stats collection?

The cache sizes were chosen to match the host <<p51>> to improve the comparison. Ideally we should also use the same standard library.

Note that this will take a long time, and will produce a humongous ~40GB stats file, as explained at: xref:gem5-only-dump-selected-stats[xrefstyle=full]

Sources:

* link:userland/cpp/bst_vs_heap_vs_hashmap.cpp[]
* link:bst-vs-heap-vs-hashmap-gem5-stats[]
* link:bst-vs-heap-vs-hashmap.gnuplot[]

===== BLAS

Buildroot supports it, which makes everything trivial:

....
./build-buildroot --config 'BR2_PACKAGE_OPENBLAS=y'
./build-userland --package openblas -- userland/libs/openblas/hello.c
./run --eval-after './libs/openblas/hello.out; echo $?'
....

Outcome: the test passes:

....
0
....

Source: link:userland/libs/openblas/hello.c[]

The test performs a general matrix multiplication:

....
    | 1.0 -3.0 |   |  1.0 2.0  1.0 |       | 0.5 0.5 0.5 |   | 11.0 - 9.0  5.0 |
1 * | 2.0  4.0 | * | -3.0 4.0 -1.0 | + 2 * | 0.5 0.5 0.5 | = | - 9.0 21.0 -1.0 |
    | 1.0 -1.0 |                           | 0.5 0.5 0.5 |   |  5.0 - 1.0  3.0 |
....

This can be deduced from the Fortran interfaces at:

....
less "$(./getvar buildroot_build_build_dir)"/openblas-*/reference/dgemmf.f
....

which we can map to our call as:

....
C := alpha*op( A )*op( B ) + beta*C,
SUBROUTINE DGEMMF( TRANA, TRANB, M,N,K, ALPHA,A,LDA,B,LDB,BETA,C,LDC)
cblas_dgemm( CblasColMajor, CblasNoTrans, CblasTrans,3,3,2 ,1, A,3, B,3, 2 ,C,3 );
....

===== Eigen

Header-only linear algebra library with a mainline Buildroot package:

....
./build-buildroot --config 'BR2_PACKAGE_EIGEN=y'
./build-userland --package eigen -- userland/libs/eigen/hello.cpp
....

The example just creates a matrix and prints it:

....
./run --eval-after './libs/eigen/hello.out'
....

Output:

....
  3  -1
2.5 1.5
....

Source: link:userland/libs/eigen/hello.cpp[]

Tested on: https://github.com/cirosantilli/linux-kernel-module-cheat/commit/a4bdcf102c068762bb1ef26c591fcf71e5907525[a4bdcf102c068762bb1ef26c591fcf71e5907525]

===== PARSEC benchmark

We have ported parts of the http://parsec.cs.princeton.edu[PARSEC benchmark] for cross compilation at: https://github.com/cirosantilli/parsec-benchmark See the documentation on that repo to find out which benchmarks have been ported. Some of the benchmarks are segfaulting; they are documented in that repo.

There are two ways to run PARSEC with this repo:

* <<parsec-benchmark-without-parsecmgmt,without `parsecmgmt`>>, most likely what you want
* <<parsec-benchmark-with-parsecmgmt,with `parsecmgmt`>>

====== PARSEC benchmark without parsecmgmt

....
./build --arch arm --download-dependencies gem5-buildroot parsec-benchmark
./build-buildroot --arch arm --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y'
./run --arch arm --emulator gem5
....

Once inside the guest, launch one of the `test` input sized benchmarks manually, as in:

....
cd /parsec/ext/splash2x/apps/fmm/run
../inst/arm-linux.gcc/bin/fmm 1 < input_1
....

To find out how to run many of the benchmarks, have a look at the `test.sh` script of the `parsec-benchmark` repo.

From the guest, you can also run it as:

....
cd /parsec
./test.sh
....

but this might be a bit time consuming in gem5.

====== PARSEC change the input size

Running a benchmark of a size different than `test`, e.g. `simsmall`, requires a rebuild with:

....
./build-buildroot \
  --arch arm \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK_INPUT_SIZE="simsmall"' \
  -- parsec_benchmark-reconfigure \
;
....

Large inputs may also require tweaking:

* <<br2-target-rootfs-ext2-size>>, if the unpacked inputs are large
* <<memory-size>>, unless you want to meet the OOM killer, which is admittedly kind of fun

`test.sh` only contains the run commands for the `test` size, and cannot be used for `simsmall`.

The easiest thing to do is to https://superuser.com/questions/231002/how-can-i-search-within-the-output-buffer-of-a-tmux-shell/1253137#1253137[scroll up on the host shell] after the build, and look for a line of the type:

....
Running /root/linux-kernel-module-cheat/out/aarch64/buildroot/build/parsec-benchmark-custom/ext/splash2x/apps/ocean_ncp/inst/aarch64-linux.gcc/bin/ocean_ncp -n2050 -p1 -e1e-07 -r20000 -t28800
....

and then tweak the command found in `test.sh` accordingly.

Yes, we do run the benchmarks on the host just to unpack / generate the inputs. They are expected to fail to run, since they were built for the guest instead of the host, including for the x86_64 guest, which has a different interpreter than the host's (see `file myexecutable`).

The rebuild is required because we unpack the input files on the host.

Separating input sizes also allows creating smaller images when only running the smaller benchmarks.

This limitation exists because `parsecmgmt` generates the input files just before running via the Bash scripts, but we can't run `parsecmgmt` on gem5, as it is too slow!

One option would be to do that inside the guest with QEMU.

Also, we can't generate all input sizes at once, because many of them have the same name and would overwrite one another...

PARSEC simply wasn't designed with non-native machines in mind...

====== PARSEC benchmark with parsecmgmt

Most users won't want to use this method because:

* running the `parsecmgmt` Bash scripts takes forever before it ever starts running the actual benchmarks on gem5
+
Running on QEMU is feasible, but not the main use case, since QEMU cannot be used for performance measurements.
* it requires putting the full `.tar` inputs on the guest, which makes the image twice as large (1x for the `.tar`, 1x for the unpacked input files)

It would be awesome if it were possible to use this method, since this is what PARSEC supports officially, and so:

* you don't have to dig into what raw command to run
* there is an easy way to run all the benchmarks in one go to test them out
* you can just run any of the benchmarks that you want

but it simply is not feasible in gem5 because it takes too long.

If you still want to run this, try it out with:

....
./build-buildroot \
  --arch aarch64 \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK_PARSECMGMT=y' \
  --config 'BR2_TARGET_ROOTFS_EXT2_SIZE="3G"' \
  -- parsec_benchmark-reconfigure \
;
....

And then you can run it just as you would on the host:

....
cd /parsec/
bash
. env.sh
parsecmgmt -a run -p splash2x.fmm -i test
....

====== PARSEC uninstall

If you want to remove PARSEC later, Buildroot doesn't provide an automated package removal mechanism, as mentioned at: xref:remove-buildroot-packages[xrefstyle=full], but the following procedure should be satisfactory:

....
rm -rf \
  "$(./getvar buildroot_download_dir)"/parsec-* \
  "$(./getvar buildroot_build_dir)"/build/parsec-* \
  "$(./getvar buildroot_build_dir)"/build/packages-file-list.txt \
  "$(./getvar buildroot_build_dir)"/images/rootfs.* \
  "$(./getvar buildroot_build_dir)"/target/parsec-* \
;
./build-buildroot --arch arm
....

====== PARSEC benchmark hacking

If you end up going inside link:submodules/parsec-benchmark[] to hack up the benchmark (you will!), these tips will be helpful.

Buildroot was not designed to deal with large images, and currently cross rebuilds are a bit slow, due to some image generation and validation steps.

A few workarounds are:

* develop on the host first as much as you can. Our PARSEC fork supports it.
+
If you do this, don't forget to do a:
+
....
cd "$(./getvar parsec_source_dir)"
git clean -xdf .
....
+
before going for the cross compile build.
* patch Buildroot to work well, and keep cross compiling all the way. This should be totally viable, and we should do it.
+
Don't forget to explicitly rebuild PARSEC with:
+
....
./build-buildroot \
  --arch arm \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
  -- parsec_benchmark-reconfigure \
;
....
+
You may also want to test if your patches are still functionally correct inside of QEMU first, which is a faster emulator.
* sell your soul, and compile natively inside the guest. We won't do this, not only because it is evil, but also because Buildroot explicitly does not support it: https://buildroot.org/downloads/manual/manual.html#faq-no-compiler-on-target ARM employees have been known to do this: https://github.com/arm-university/arm-gem5-rsk/blob/aa3b51b175a0f3b6e75c9c856092ae0c8f2a7cdc/parsec_patches/qemu-patch.diff

=== gem5 kernel command line parameters

Analogous <<kernel-command-line-parameters,to QEMU>>:

@@ -14209,9 +13834,7 @@ Example: link:userland/c/memory_leak.c[]

Maybe some day someone will use this setup to study the performance of interpreters:

* <<node-js>>

=== Node.js
==== Node.js

Parent section: <<interpreted-languages>>.

@@ -14237,6 +13860,444 @@ Examples:

** link:rootfs_overlay/lkmc/nodejs/file_write_read.js[]
** link:rootfs_overlay/lkmc/nodejs/read_stdin_to_string.js[] Question: https://stackoverflow.com/questions/30441025/read-all-text-from-stdin-to-a-string

=== Algorithms

link:userland/algorithm[]

This is still a work in progress and needs better automation, but it is already a good sketch. The idea was originally started at: https://github.com/cirosantilli/algorithm-cheat

The key idea is that input / output pairs are present in human readable files, generated either:

* manually, for small test inputs
* with a Python script, for larger randomized tests

Test programs then:

* read input from stdin
* produce output to stdout

so that we can compare the output to the expected one.

This way, tests can be reused across several implementations in different languages, emulating the many multi-language programming competition websites out there.

For example, for a <<userland-setup-getting-started-natively,native run>> we can run a set / sorting test:

....
cd userland/algorithm/set
./build

# Run with a small hand written test.
./std_set.out < test_data/8.i > tmp.raw

# Extract the output from the sorted stdout, which also
# contained some timing information.
./parse_output output < tmp.raw > tmp.o

# Compare the output to the expected one.
cmp tmp.o test_data/8.e

# Same but now with a large randomly generated input.
./generate_io
./std_set.out < tmp.i | ./parse_output output > tmp.o
cmp tmp.o tmp.e
....
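
The same contract can be mimicked in a few lines of Python, which is handy for prototyping a new test before wiring it into the harness. This is only an illustrative stand-in for the scripts above, not the repo's actual code:

```python
# Stand-in for the test contract above: generate a random input and its
# expected output, "solve" by sorting, then compare, mirroring the
# generate / run / cmp pipeline. Names here are invented for the sketch.
import random

def generate_io(n, hi=10**6, seed=0):
    """Return (input_lines, expected_output_lines), like a generator script."""
    rng = random.Random(seed)
    inp = [rng.randrange(hi) for _ in range(n)]
    return [str(x) for x in inp], [str(x) for x in sorted(inp)]

def solve(input_lines):
    """The 'program under test': read numbers, output them sorted."""
    return [str(x) for x in sorted(int(line) for line in input_lines)]

inp, expected = generate_io(1000)
assert solve(inp) == expected  # the cmp step
```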

Sources:

* link:userland/algorithm/set/generate_input[]
* link:userland/algorithm/set/main.hpp[]
* link:userland/algorithm/set/parse_output[]
* link:userland/algorithm/set/std_set.cpp[]
* link:userland/algorithm/set/test_data/8.e[]
* link:userland/algorithm/set/test_data/8.i[]

link:userland/algorithm/set/parse_output[] is needed because timing instrumentation measurements must be embedded in the program itself, to allow:

* discounting the input reading / output writing operations from the actual "read / write to / from memory" algorithm itself
* measuring the evolution of the benchmark midway, e.g. to see how the current container size affects insertion time: <<bst-vs-heap-vs-hashmap>>
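
A minimal sketch of that separation: the benchmark interleaves tagged lines on stdout, and a `parse_output`-style filter keeps one tag's payload, so timings never pollute the diffable output. The tag names here are invented for the illustration; check the actual link:userland/algorithm/set/parse_output[] script for the real format:

```python
# Sketch: a benchmark prints tagged lines, mixing payload with timings:
#   output 13
#   time 1.5e-06
# and a parse_output-like filter extracts the lines of a single tag.
# Tag names are invented for this sketch.
def filter_tag(lines, tag):
    selected = []
    for line in lines:
        key, _, rest = line.partition(' ')
        if key == tag:
            selected.append(rest)
    return selected

mixed = ['output 13', 'time 1.5e-06', 'output 42', 'time 2.0e-06']
assert filter_tag(mixed, 'output') == ['13', '42']
assert filter_tag(mixed, 'time') == ['1.5e-06', '2.0e-06']
```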

The following are also interesting Buildroot libraries that we could benchmark:

* Armadillo `C++`: linear algebra
* fftw: Fourier transform
* Flann
* GSL: various
* liblinear
* libspatialindex
* libtommath
* qhull

These are good targets for <<gem5-run-benchmark,performance analysis with gem5>>, and there is some overlap between this section and <<benchmarks>>.

==== BST vs heap vs hashmap

TODO: move benchmark graph from link:userland/cpp/bst_vs_heap_vs_hashmap.cpp[] to link:userland/algorithm/set[].

The following benchmark setup works both:

* on the host, through timers + a https://stackoverflow.com/questions/51952471/why-do-i-get-a-constant-instead-of-logarithmic-curve-for-an-insert-time-benchmar/51953081#51953081[granule]
* on gem5, with <<m5ops-instructions,dumpstats>>, which can get more precise results with `granule == 1`
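
The granule trick from the linked answer can be sketched as follows: time each batch of `granule` insertions instead of each single insertion, so that the timer overhead amortizes. This is a pure-Python stand-in for the C++ benchmark, with an arbitrary container and sizes:

```python
# Sketch of granule-based timing: timing a single insertion mostly measures
# the clock call itself, so we time batches of `granule` insertions and
# emit one (container_size, mean_seconds_per_insert) point per batch.
import time

def bench_insert(n, granule):
    container = set()  # stand-in for std::set / heap / hashmap
    points = []
    for start in range(0, n, granule):
        t0 = time.perf_counter()
        for i in range(start, start + granule):
            container.add(i)
        t1 = time.perf_counter()
        points.append((len(container), (t1 - t0) / granule))
    return points

points = bench_insert(100_000, granule=1_000)
```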

It has been used to answer:

* BST vs heap: https://stackoverflow.com/questions/6147243/heap-vs-binary-search-tree-bst/29548834#29548834
* `std::set`: https://stackoverflow.com/questions/2558153/what-is-the-underlying-data-structure-of-a-stl-set-in-c/51944661#51944661
* `std::map`: https://stackoverflow.com/questions/18414579/what-data-structure-is-inside-stdmap-in-c/51945119#51945119

To benchmark on the host, we do:

....
./build-userland-in-tree \
  --force-rebuild \
  --optimization-level 3 \
  ./userland/cpp/bst_vs_heap_vs_hashmap.cpp \
;
./userland/cpp/bst_vs_heap_vs_hashmap.out 10000000 10000 0 | tee bst_vs_heap_vs_hashmap.dat
gnuplot \
  -e 'input_noext="bst_vs_heap_vs_hashmap"' \
  -e 'heap_zoom_max=50' \
  -e 'hashmap_zoom_max=400' \
  ./bst-vs-heap-vs-hashmap.gnuplot \
;
xdg-open bst_vs_heap_vs_hashmap.tmp.png
....

The parameters `heap_zoom_max` and `hashmap_zoom_max` are chosen manually and interactively to best showcase the regions of interest in those plots.

To benchmark on gem5, we first build the benchmark with <<m5ops-instructions>> enabled, and then we run it and extract the stats:

....
./build-userland \
  --arch x86_64 \
  --ccflags='-DLKMC_M5OPS_ENABLE=1' \
  --force-rebuild userland/cpp/bst_vs_heap_vs_hashmap.cpp \
  --static \
  --optimization-level 3 \
;
./run \
  --arch x86_64 \
  --emulator gem5 \
  --static \
  --userland userland/cpp/bst_vs_heap_vs_hashmap.cpp \
  --userland-args='100000 1 0' \
  -- \
  --cpu-type=DerivO3CPU \
  --caches \
  --l2cache \
  --l1d_size=32kB \
  --l1i_size=32kB \
  --l2_size=256kB \
  --l3_size=20MB \
;
./bst-vs-heap-vs-hashmap-gem5-stats --arch x86_64 | tee bst_vs_heap_vs_hashmap_gem5.dat
gnuplot \
  -e 'input_noext="bst_vs_heap_vs_hashmap_gem5"' \
  -e 'heap_zoom_max=500' \
  -e 'hashmap_zoom_max=400' \
  ./bst-vs-heap-vs-hashmap.gnuplot \
;
xdg-open bst_vs_heap_vs_hashmap_gem5.tmp.png
....

TODO: the gem5 simulation blows up on a tcmalloc allocation somewhere near 25k elements as of 3fdd83c2c58327d9714fa2347c724b78d7c05e2b + 1, likely linked to the extreme inefficiency of the stats collection?

The cache sizes were chosen to match the host <<p51>> to improve the comparison. Ideally we should also use the same standard library.

Note that this will take a long time, and will produce a humongous ~40GB stats file, as explained at: xref:gem5-only-dump-selected-stats[xrefstyle=full]

Sources:

* link:userland/cpp/bst_vs_heap_vs_hashmap.cpp[]
* link:bst-vs-heap-vs-hashmap-gem5-stats[]
* link:bst-vs-heap-vs-hashmap.gnuplot[]

==== BLAS

Buildroot supports it, which makes everything trivial:

....
./build-buildroot --config 'BR2_PACKAGE_OPENBLAS=y'
./build-userland --package openblas -- userland/libs/openblas/hello.c
./run --eval-after './libs/openblas/hello.out; echo $?'
....

Outcome: the test passes:

....
0
....

Source: link:userland/libs/openblas/hello.c[]

The test performs a general matrix multiplication:

....
    | 1.0 -3.0 |   |  1.0 2.0  1.0 |       | 0.5 0.5 0.5 |   | 11.0 - 9.0  5.0 |
1 * | 2.0  4.0 | * | -3.0 4.0 -1.0 | + 2 * | 0.5 0.5 0.5 | = | - 9.0 21.0 -1.0 |
    | 1.0 -1.0 |                           | 0.5 0.5 0.5 |   |  5.0 - 1.0  3.0 |
....

This can be deduced from the Fortran interfaces at:

....
less "$(./getvar buildroot_build_build_dir)"/openblas-*/reference/dgemmf.f
....

which we can map to our call as:

....
C := alpha*op( A )*op( B ) + beta*C,
SUBROUTINE DGEMMF( TRANA, TRANB, M,N,K, ALPHA,A,LDA,B,LDB,BETA,C,LDC)
cblas_dgemm( CblasColMajor, CblasNoTrans, CblasTrans,3,3,2 ,1, A,3, B,3, 2 ,C,3 );
....
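
The arithmetic in that worked example is easy to double check in pure Python, no BLAS needed; this is just a verification of the matrices shown above, with `alpha = 1`, `beta = 2` and `B` passed transposed, as in the `cblas_dgemm` call:

```python
# Verify the dgemm example by hand: C = alpha * A * B^T + beta * C0.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1.0, -3.0], [2.0, 4.0], [1.0, -1.0]]   # 3x2
BT = [[1.0, 2.0, 1.0], [-3.0, 4.0, -1.0]]    # B transposed: 2x3
C0 = [[0.5] * 3 for _ in range(3)]           # 3x3, all 0.5
alpha, beta = 1.0, 2.0

AB = matmul(A, BT)
C = [[alpha * AB[i][j] + beta * C0[i][j] for j in range(3)] for i in range(3)]
assert C == [[11.0, -9.0, 5.0], [-9.0, 21.0, -1.0], [5.0, -1.0, 3.0]]
```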

==== Eigen

Header-only linear algebra library with a mainline Buildroot package:

....
./build-buildroot --config 'BR2_PACKAGE_EIGEN=y'
./build-userland --package eigen -- userland/libs/eigen/hello.cpp
....

The example just creates a matrix and prints it:

....
./run --eval-after './libs/eigen/hello.out'
....

Output:

....
  3  -1
2.5 1.5
....

Source: link:userland/libs/eigen/hello.cpp[]

Tested on: https://github.com/cirosantilli/linux-kernel-module-cheat/commit/a4bdcf102c068762bb1ef26c591fcf71e5907525[a4bdcf102c068762bb1ef26c591fcf71e5907525]

=== Benchmarks

These are good targets for <<gem5-run-benchmark,performance analysis with gem5>>.

TODO: also consider the following:

* https://github.com/kozyraki/stamp transactional memory benchmarks

==== Dhrystone

https://en.wikipedia.org/wiki/Dhrystone

Created in the 1980s, it is no longer a representative measure of performance on modern computers. It has mostly been replaced by https://en.wikipedia.org/wiki/SPECint[SPEC], which is... closed source! Unbelievable.
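
For reference on how Dhrystone results are usually quoted: the raw score is Dhrystones per second, conventionally normalized to DMIPS by dividing by 1757, the score of the VAX 11/780, which is defined as 1 MIPS. A tiny helper, with the example numbers being hypothetical:

```python
# Convert a raw Dhrystone score (Dhrystones per second) to DMIPS.
# 1757 Dhrystones/s is the VAX 11/780 reference score (1 DMIPS by definition);
# DMIPS/MHz divides that by the clock to compare across frequencies.
VAX_11_780_DHRYSTONES_PER_S = 1757.0

def dmips(dhrystones_per_second):
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_S

def dmips_per_mhz(dhrystones_per_second, clock_mhz):
    return dmips(dhrystones_per_second) / clock_mhz

# Hypothetical example: 8785000 Dhrystones/s at 1000 MHz.
print(dmips(8785000.0))                  # 5000.0
print(dmips_per_mhz(8785000.0, 1000.0))  # 5.0
```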

<<buildroot>> has a `dhrystone` package, but because it is so interesting to us, we decided to also build it ourselves, which more easily allows things like static and baremetal compilation.

Build and run on QEMU <<user-mode-simulation>>:

....
git submodule update --init submodules/dhrystone
./build-dhrystone --mode userland
./run --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone"
....

Build and run on gem5 user mode:

....
./build-dhrystone --mode userland --static --force-rebuild
./run --emulator gem5 --userland "$(./getvar userland_build_dir)/submodules/dhrystone/dhrystone"
....

TODO: automate the run more nicely.

Build for <<baremetal>> execution and run it in baremetal QEMU:

....
# Build our Newlib stubs.
./build-baremetal --arch aarch64
./build-dhrystone --arch aarch64 --mode baremetal
./run --arch aarch64 --baremetal "$(./getvar baremetal_build_dir)/submodules/dhrystone/dhrystone"
....

TODO: fix the build; we just need to factor out all run arguments from link:build-baremetal[] into link:common.py[], and then it should just work with no missing syscalls.

If you really want the Buildroot package for some reason, build it with:

....
./build-buildroot --config 'BR2_PACKAGE_DHRYSTONE=y'
....

and run inside the guest from `PATH` with:

....
dhrystone
....

==== PARSEC benchmark

We have ported parts of the http://parsec.cs.princeton.edu[PARSEC benchmark] for cross compilation at: https://github.com/cirosantilli/parsec-benchmark See the documentation on that repo to find out which benchmarks have been ported. Some of the benchmarks are segfaulting; they are documented in that repo.

There are two ways to run PARSEC with this repo:

* <<parsec-benchmark-without-parsecmgmt,without `parsecmgmt`>>, most likely what you want
* <<parsec-benchmark-with-parsecmgmt,with `parsecmgmt`>>

===== PARSEC benchmark without parsecmgmt

....
./build --arch arm --download-dependencies gem5-buildroot parsec-benchmark
./build-buildroot --arch arm --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y'
./run --arch arm --emulator gem5
....

Once inside the guest, launch one of the `test` input sized benchmarks manually, as in:

....
cd /parsec/ext/splash2x/apps/fmm/run
../inst/arm-linux.gcc/bin/fmm 1 < input_1
....

To find out how to run many of the benchmarks, have a look at the `test.sh` script of the `parsec-benchmark` repo.

From the guest, you can also run it as:

....
cd /parsec
./test.sh
....

but this might be a bit time consuming in gem5.

===== PARSEC change the input size

Running a benchmark of a size different than `test`, e.g. `simsmall`, requires a rebuild with:

....
./build-buildroot \
  --arch arm \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK_INPUT_SIZE="simsmall"' \
  -- parsec_benchmark-reconfigure \
;
....

Large inputs may also require tweaking:

* <<br2-target-rootfs-ext2-size>>, if the unpacked inputs are large
* <<memory-size>>, unless you want to meet the OOM killer, which is admittedly kind of fun

`test.sh` only contains the run commands for the `test` size, and cannot be used for `simsmall`.

The easiest thing to do is to https://superuser.com/questions/231002/how-can-i-search-within-the-output-buffer-of-a-tmux-shell/1253137#1253137[scroll up on the host shell] after the build, and look for a line of the type:

....
Running /root/linux-kernel-module-cheat/out/aarch64/buildroot/build/parsec-benchmark-custom/ext/splash2x/apps/ocean_ncp/inst/aarch64-linux.gcc/bin/ocean_ncp -n2050 -p1 -e1e-07 -r20000 -t28800
....

and then tweak the command found in `test.sh` accordingly.

Yes, we do run the benchmarks on the host just to unpack / generate the inputs. They are expected to fail to run, since they were built for the guest instead of the host, including for the x86_64 guest, which has a different interpreter than the host's (see `file myexecutable`).

The rebuild is required because we unpack the input files on the host.

Separating input sizes also allows creating smaller images when only running the smaller benchmarks.

This limitation exists because `parsecmgmt` generates the input files just before running via the Bash scripts, but we can't run `parsecmgmt` on gem5, as it is too slow!

One option would be to do that inside the guest with QEMU.

Also, we can't generate all input sizes at once, because many of them have the same name and would overwrite one another...

PARSEC simply wasn't designed with non-native machines in mind...

===== PARSEC benchmark with parsecmgmt

Most users won't want to use this method because:

* running the `parsecmgmt` Bash scripts takes forever before it ever starts running the actual benchmarks on gem5
+
Running on QEMU is feasible, but not the main use case, since QEMU cannot be used for performance measurements.
* it requires putting the full `.tar` inputs on the guest, which makes the image twice as large (1x for the `.tar`, 1x for the unpacked input files)

It would be awesome if it were possible to use this method, since this is what PARSEC supports officially, and so:

* you don't have to dig into what raw command to run
* there is an easy way to run all the benchmarks in one go to test them out
* you can just run any of the benchmarks that you want

but it simply is not feasible in gem5 because it takes too long.

If you still want to run this, try it out with:

....
./build-buildroot \
  --arch aarch64 \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
  --config 'BR2_PACKAGE_PARSEC_BENCHMARK_PARSECMGMT=y' \
  --config 'BR2_TARGET_ROOTFS_EXT2_SIZE="3G"' \
  -- parsec_benchmark-reconfigure \
;
....

And then you can run it just as you would on the host:

....
cd /parsec/
bash
. env.sh
parsecmgmt -a run -p splash2x.fmm -i test
....
===== PARSEC uninstall
|
||||
|
||||
If you want to remove PARSEC later, Buildroot doesn't provide an automated package removal mechanism as mentioned at: xref:remove-buildroot-packages[xrefstyle=full], but the following procedure should be satisfactory:
|
||||
|
||||
....
|
||||
rm -rf \
|
||||
"$(./getvar buildroot_download_dir)"/parsec-* \
|
||||
"$(./getvar buildroot_build_dir)"/build/parsec-* \
|
||||
"$(./getvar buildroot_build_dir)"/build/packages-file-list.txt \
|
||||
"$(./getvar buildroot_build_dir)"/images/rootfs.* \
|
||||
"$(./getvar buildroot_build_dir)"/target/parsec-* \
|
||||
;
|
||||
./build-buildroot --arch arm
|
||||
....
|
||||
|
||||
===== PARSEC benchmark hacking
|
||||
|
||||
If you end up going inside link:submodules/parsec-benchmark[] to hack up the benchmark (you will!), these tips will be helpful.
|
||||
|
||||
Buildroot was not designed to deal with large images, and currently cross rebuilds are a bit slow, due to some image generation and validation steps.
|
||||
|
||||
A few workarounds are:
|
||||
|
||||
* develop in host first as much as you can. Our PARSEC fork supports it.
|
||||
+
|
||||
If you do this, don't forget to do a:
|
||||
+
|
||||
....
|
||||
cd "$(./getvar parsec_source_dir)"
|
||||
git clean -xdf .
|
||||
....
|
||||
before going for the cross compile build.
|
||||
+
|
||||
* patch Buildroot to work well, and keep cross compiling all the way. This should be totally viable, and we should do it.
|
||||
+
|
||||
Don't forget to explicitly rebuild PARSEC with:
|
||||
+
|
||||
....
|
||||
./build-buildroot \
|
||||
--arch arm \
|
||||
--config 'BR2_PACKAGE_PARSEC_BENCHMARK=y' \
|
||||
-- parsec_benchmark-reconfigure \
|
||||
;
|
||||
....
|
||||
+
|
||||
You may also want to test if your patches are still functionally correct inside of QEMU first, which is a faster emulator.
|
||||
* sell your soul, and compile natively inside the guest. We won't do this, not only because it is evil, but also because Buildroot explicitly does not support it: https://buildroot.org/downloads/manual/manual.html#faq-no-compiler-on-target ARM employees have been known to do this: https://github.com/arm-university/arm-gem5-rsk/blob/aa3b51b175a0f3b6e75c9c856092ae0c8f2a7cdc/parsec_patches/qemu-patch.diff
|
||||
|
||||
=== Userland content bibliography
|
||||
|
||||
* The Linux Programming Interface by Michael Kerrisk https://www.amazon.co.uk/Linux-Programming-Interface-System-Handbook/dp/1593272200 Lots of open source POSIX examples: https://github.com/cirosantilli/linux-programming-interface-kerrisk
|
||||
|
||||
common.py
@@ -1941,3 +1941,36 @@ class TestCliFunction(LkmcCliFunction):
            self.log_error('A test failed')
            return 1
        return 0

# IO format.

class LkmcList(list):
    '''
    list with a lightweight serialization format for algorithm IO.
    '''
    def __init__(self, *args, **kwargs):
        if 'oneline' in kwargs:
            self.oneline = kwargs['oneline']
            del kwargs['oneline']
        else:
            self.oneline = False
        super().__init__(*args, **kwargs)
    def __str__(self):
        if self.oneline:
            sep = ' '
        else:
            sep = '\n'
        return sep.join([str(item) for item in self])

class LkmcOrderedDict(collections.OrderedDict):
    '''
    dict with a lightweight serialization format for algorithm IO.
    '''
    def __str__(self):
        out = []
        for key in self:
            out.extend([
                str(key),
                str(self[key]) + '\n',
            ])
        return '\n'.join(out)
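As an illustration of the serialization format these helpers define, here is a standalone sketch (a reimplementation for demonstration purposes, not the actual `common.py` code):

```python
import collections

# Minimal standalone sketch of the LkmcList / LkmcOrderedDict serialization
# format: lists join items with newlines (or spaces when oneline=True), and
# dicts emit "key\nvalues\n" blocks separated by blank lines.
class LkmcList(list):
    def __init__(self, *args, **kwargs):
        self.oneline = kwargs.pop('oneline', False)
        super().__init__(*args, **kwargs)
    def __str__(self):
        sep = ' ' if self.oneline else '\n'
        return sep.join(str(item) for item in self)

class LkmcOrderedDict(collections.OrderedDict):
    def __str__(self):
        out = []
        for key in self:
            out.extend([str(key), str(self[key]) + '\n'])
        return '\n'.join(out)

data = LkmcOrderedDict()
data['output'] = LkmcList([0, 1, 2])
data['times'] = LkmcList([10, 20], oneline=True)
print(str(data))
```

This prints each section name on its own line, followed by its values, followed by a blank line, which is the same shape the benchmark tooling below reads back.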
userland/algorithm/set/generate_io (new executable file)
@@ -0,0 +1,30 @@
#!/usr/bin/env python3

import argparse
import random
import sys
import os

sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))))
import common

# Handle CLI arguments.
parser = argparse.ArgumentParser()
parser.add_argument('--min', type=int, default=0)
parser.add_argument('--max', type=int, default=(2**32 - 1))
parser.add_argument('--seed', type=int)
parser.add_argument('--size', type=int, default=1000000)
# Note: argparse's type=bool treats any non-empty string as True,
# so parse truthiness explicitly.
parser.add_argument('--unique', default=True,
        type=lambda s: s.lower() not in ('0', 'false', 'no'),
        help='if True, remove duplicates from the expected output')
args = parser.parse_args()
random.seed(args.seed)

# Generate the random input and write it to tmp.i.
input_data = common.LkmcList()
for i in range(args.size):
    input_data.append(random.randint(args.min, args.max))
with open('tmp.i', 'w') as f:
    f.write(str(input_data) + '\n')

# Write the expected (sorted, optionally deduplicated) output to tmp.e.
if args.unique:
    input_data = common.LkmcList(set(input_data))
input_data.sort()
with open('tmp.e', 'w') as e:
    e.write(str(input_data) + '\n')
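The resulting `tmp.i` / `tmp.e` pair can be sketched in miniature as follows (standalone, without the `common` module; the two file bodies are built as strings rather than written to disk):

```python
import random

# Miniature sketch of what generate_io produces: tmp.i holds the raw random
# input, one integer per line; tmp.e holds the expected output, i.e. the
# same values sorted and (with --unique) deduplicated.
random.seed(42)
size, lo, hi = 8, 0, 9
values = [random.randint(lo, hi) for _ in range(size)]

input_body = '\n'.join(str(v) for v in values) + '\n'       # contents of tmp.i
expected = sorted(set(values))                              # --unique behavior
expected_body = '\n'.join(str(v) for v in expected) + '\n'  # contents of tmp.e

print(input_body)
print(expected_body)
```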
@@ -20,7 +20,7 @@
 int main(int argc, char **argv) {
     typedef uint64_t T;
 #if LKMC_ALGORITHM_SET_STD_PRIORITY_QUEUE
-    std::priority_queue<T> set;
+    std::priority_queue<T, std::vector<T>, std::greater<int>> set;
 #endif
 #if LKMC_ALGORITHM_SET_STD_SET
     std::set<T> set;
@@ -28,9 +28,8 @@ int main(int argc, char **argv) {
 #if LKMC_ALGORITHM_SET_STD_UNORDERED_SET
     std::unordered_set<T> set;
 #endif
-    std::vector<T> randoms;
+    std::vector<T> input;
     size_t i, j = 0, n, granule, base;
-    unsigned int seed;
 #ifndef LKMC_M5OPS_ENABLE
     std::vector<std::chrono::nanoseconds::rep> dts;
     std::vector<decltype(base)> bases;
@@ -38,26 +37,21 @@ int main(int argc, char **argv) {

     // CLI arguments.
     if (argc > 1) {
-        n = std::stoi(argv[1]);
-    } else {
-        n = 10;
-    }
-    if (argc > 2) {
-        granule = std::stoi(argv[2]);
+        granule = std::stoi(argv[1]);
     } else {
         granule = 1;
     }
-    if (argc > 3) {
-        seed = std::stoi(argv[3]);
-    } else {
-        seed = std::random_device()();
-    }
+
+    // Read input from stdin.
+    std::string str;
+    while (std::getline(std::cin, str)) {
+        if (str == "")
+            break;
+        input.push_back(std::stoll(str));
+    }
+    n = input.size();

     // Action.
-    for (i = 0; i < n; ++i) {
-        randoms.push_back(i);
-    }
-    std::shuffle(randoms.begin(), randoms.end(), std::mt19937(seed));
     for (i = 0; i < n / granule; ++i) {
 #ifndef LKMC_M5OPS_ENABLE
         using clk = std::chrono::high_resolution_clock;
@@ -71,9 +65,9 @@ int main(int argc, char **argv) {
         for (j = 0; j < granule; ++j) {
 #endif
 #if LKMC_ALGORITHM_SET_STD_PRIORITY_QUEUE
-            set.emplace(randoms[base + j]);
+            set.emplace(input[base + j]);
 #else
-            set.insert(randoms[base + j]);
+            set.insert(input[base + j]);
 #endif
 #ifdef LKMC_M5OPS_ENABLE
             LKMC_M5OPS_DUMPSTATS;
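The granule-batched timing that the `bases` / `dts` vectors implement can be sketched in Python as follows (a standalone illustration, not project code):

```python
import time

def timed_batches(items, granule=2):
    """Insert items into a set in batches of `granule` elements, recording
    (batch_start_index, elapsed_ns) per batch, mirroring the roles of the
    C++ `bases` and `dts` vectors."""
    s = set()
    results = []
    for base in range(0, (len(items) // granule) * granule, granule):
        t0 = time.monotonic_ns()
        for j in range(granule):
            s.add(items[base + j])
        results.append((base, time.monotonic_ns() - t0))
    return s, results

s, results = timed_batches([4, 5, 6, 2, 1, 3, 0, 7], granule=2)
print(sorted(s))
print([base for base, dt in results])
```

Batching by granule amortizes the timing overhead over several insertions, which matters when a single insertion is too fast to measure reliably.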
@@ -87,8 +81,29 @@ int main(int argc, char **argv) {
     }

+    // Report results.
+    std::cout << "output" << std::endl;
+#if LKMC_ALGORITHM_SET_STD_PRIORITY_QUEUE
+    while (!set.empty()) {
+        std::cout << set.top() << std::endl;
+        set.pop();
+    }
+    //T last_val = set.top();
+    //std::cout << last_val << std::endl;
+    //set.pop();
+    //while (!set.empty()) {
+    //    const auto& val = set.top();
+    //    if (val != last_val)
+    //        std::cout << val << std::endl;
+    //    last_val = val;
+    //    set.pop();
+    //}
+#else
+    for (const auto& item : set) {
+        std::cout << item << std::endl;
+    }
+#endif
+    std::cout << std::endl;
 #ifndef LKMC_M5OPS_ENABLE
     // Output.
     std::cout << "times" << std::endl;
     auto bases_it = bases.begin();
     auto dts_it = dts.begin();
@@ -99,17 +114,5 @@ int main(int argc, char **argv) {
         bases_it++;
         dts_it++;
     }
     std::cout << std::endl;
-    std::cout << "output" << std::endl;
-#if LKMC_ALGORITHM_SET_STD_PRIORITY_QUEUE
-    while (!set.empty()) {
-        std::cout << set.top() << std::endl;
-        set.pop();
-    }
-#else
-    for (const auto& item : set) {
-        std::cout << item << std::endl;
-    }
-#endif
 #endif
 }
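Functionally, the benchmark's new stdin contract can be cross-checked against a tiny Python reference (a sketch of the expected behavior; the `times` section and M5OPS instrumentation are omitted):

```python
def run_set_reference(input_text):
    """Reference for the functional part of the set benchmark: read one
    integer per line until a blank line, insert into a set, then emit the
    'output' section (sorted unique values, matching std::set iteration
    order) terminated by a blank line."""
    values = set()
    for line in input_text.split('\n'):
        if line == '':
            break
        values.add(int(line))
    out_lines = ['output']
    out_lines.extend(str(v) for v in sorted(values))
    out_lines.append('')
    return '\n'.join(out_lines) + '\n'

# Same input as test_data/3.i below.
print(run_set_reference('1\n2\n0\n'))
```

This is the property the `test_data/*.i` / `test_data/*.e` pairs below encode: each `.e` file is the sorted deduplicated version of its `.i` file.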
userland/algorithm/set/parse_output (new executable file)
@@ -0,0 +1,40 @@
#!/usr/bin/env python3

import argparse
import collections
import sys
import os

sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))))
import common

data = common.LkmcOrderedDict()

# Parse the "output" section: skip the header line, then read one
# integer per line until a blank line.
output = common.LkmcList()
next(sys.stdin)
for line in sys.stdin:
    line = line.rstrip()
    if line == '':
        break
    output.append(int(line))
data['output'] = output

# Parse the "times" section the same way; each line is a
# space-separated list of integers.
times = common.LkmcList()
next(sys.stdin)
for line in sys.stdin:
    line = line.rstrip()
    if line == '':
        break
    times.append(common.LkmcList([int(i) for i in line.split(' ')], oneline=True))
data['times'] = times

# Handle CLI arguments.
parser = argparse.ArgumentParser()
parser.add_argument('key', nargs='?')
args = parser.parse_args()
if args.key:
    print(data[args.key])
else:
    print(data)
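The parsing logic can be sketched standalone (operating on a string instead of stdin, without the `common` module):

```python
def parse_sections(text):
    """Parse the benchmark's stdout: named sections ('output', 'times'),
    each a header line followed by value lines and terminated by a blank
    line. Returns {section_name: [value lines]}."""
    data = {}
    lines = iter(text.split('\n'))
    for header in lines:
        if header == '':
            continue  # tolerate trailing blank lines between sections
        body = []
        for line in lines:
            if line == '':
                break
            body.append(line)
        data[header] = body
    return data

parsed = parse_sections('output\n0\n1\n2\n\ntimes\n0 120\n2 95\n\n')
print(parsed['output'])
print(parsed['times'])
```

Unlike the script above, this sketch discovers section names from the stream rather than hard-coding the `output` / `times` order, which is a design choice, not project behavior.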
userland/algorithm/set/test_data/3.e (new file)
@@ -0,0 +1,3 @@
0
1
2

userland/algorithm/set/test_data/3.i (new file)
@@ -0,0 +1,3 @@
1
2
0

userland/algorithm/set/test_data/4.e (new file)
@@ -0,0 +1,4 @@
0
1
2
3

userland/algorithm/set/test_data/4.i (new file)
@@ -0,0 +1,4 @@
1
3
2
0

userland/algorithm/set/test_data/5.e (new file)
@@ -0,0 +1,5 @@
0
1
2
3
4

userland/algorithm/set/test_data/5.i (new file)
@@ -0,0 +1,5 @@
1
4
0
2
3

userland/algorithm/set/test_data/8.e (new file)
@@ -0,0 +1,8 @@
0
1
2
3
4
5
6
7

userland/algorithm/set/test_data/8.i (new file)
@@ -0,0 +1,8 @@
4
5
6
2
1
3
0
7