From 7d9327c6c8e72aa4e2554189dec6fb8402be5d72 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?=
 =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?=
Date: Thu, 28 Nov 2019 00:00:01 +0000
Subject: [PATCH] check gem5 DMIPS for STREAM

---
 README.adoc | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.adoc b/README.adoc
index e57fa6e..80de56d 100644
--- a/README.adoc
+++ b/README.adoc
@@ -14491,7 +14491,7 @@ Dhrystone is very simple:
 The benchmark is single-threaded.
 
-After a quick look at it, Dhrystone in `-O3` is is very likely completely CPU bound, as there are no loops over variable sized arrays, except for some dummy ones that only run once. It just does a bunch of operations on local and global C variables, which are very likely to be inlined and treated fully in registers until the final write back. TODO confirm with some kind of measurement.
+After a quick look at it, Dhrystone in `-O3` is is very likely completely CPU bound, as there are no loops over variable sized arrays, except for some dummy ones that only run once. It just does a bunch of operations on local and global C variables, which are very likely to be inlined and treated fully in registers until the final write back, or to fit entirely in cache. TODO confirm with some kind of measurement.
 
 The benchmark also makes no syscalls except for measuring time and reporting results.
 
 <> has a `dhrystone` package, but because it is so interesting to us, we decided to also build it ourselves, which allows things like static and baremetal compilation more easily.
@@ -19091,7 +19091,7 @@ Summary of manually collected results on <> at LKMC a18f28e263c91362519ef55
 |1.1005150 * 10^7
 |0.2
 
-|a605448f07e6380634b1aa7e9732d111759f69fd + 1
+|605448f07e6380634b1aa7e9732d111759f69fd
 |<> -O3
 |`gem5 --arch aarch64`
 |4 * 10^5
@@ -19099,6 +19099,14 @@ Summary of manually collected results on <> at LKMC a18f28e263c91362519ef55
 |9.2034139 * 10^7
 |1.6
 
+|5d233f2664a78789f9907d27e2a40e86cefad595
+|<> -O3
+|`gem5 --arch aarch64 --userland-args 300000 2`
+|3 * 10^5 * 2
+|64
+|9.9674773 * 10^7
+|1.6
+
 |===
 
 The first step is to determine a number of loops that will run long enough to have meaningful results, but not too long that we will get bored, so about 1 minute.