From fe3e97f70cf5ec8ca2a879e31b96e71772fbb296 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ciro=20Santilli=20=E5=85=AD=E5=9B=9B=E4=BA=8B=E4=BB=B6=20?= =?UTF-8?q?=E6=B3=95=E8=BD=AE=E5=8A=9F?=
Date: Wed, 28 Aug 2019 00:00:02 +0000
Subject: [PATCH] sve: improve didactics

---
 README.adoc | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/README.adoc b/README.adoc
index eefc4c9..a31b524 100644
--- a/README.adoc
+++ b/README.adoc
@@ -15386,17 +15386,22 @@ There are analogous LD3 and LD4 instruction.
 
 ==== ARM SVE
 
+Scalable Vector Extension.
+
 Example: link:userland/arch/aarch64/sve.S[]
 
-Scalable Vector Extension.
+To understand it, the first thing you have to look at is the execution example at Fig 1 of: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf
 
 aarch64 only, newer than <>.
 
 It is called Scalable because it does not specify the vector width! Therefore we don't have to worry about new vector width instructions every few years! Hurray!
 
-The instructions then allow implicitly tracking the loop index without knowing the actual vector length.
+The instructions then allow:
 
-Added to QEMU use mode in 3.0.0.
+* incrementing loop index by the vector length without explicitly hardcoding it
+* when the last loop is reached, extra bytes that are not multiples of the vector length get automatically masked out by the predicate register, and have no effect
+
+Added to QEMU in 3.0.0 and gem5 in 2019 Q3.
 
 TODO announcement date.
Possibly 2017: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf There is also a 2016 mention: https://community.arm.com/tools/hpc/b/hpc/posts/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture
@@ -15411,7 +15416,6 @@ Using SVE normally requires setting the CPACR_EL1.FPEN and ZEN bits, which as as
 ===== SVE bibliography
 
 * https://www.rico.cat/files/ICS18-gem5-sve-tutorial.pdf step by step of a complete code execution examples, the best initial tutorial so far
-* https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf paper with some nice few concrete examples, illustrations and rationale
 * https://static.docs.arm.com/dui0965/c/DUI0965C_scalable_vector_extension_guide.pdf
 * https://developer.arm.com/products/software-development-tools/hpc/documentation/writing-inline-sve-assembly quick inlining guide
 
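
Note on the two behaviours listed in the patch (advancing the loop index by the vector length without hardcoding it, and masking out the leftover elements on the last iteration): they can be sketched in plain C as below. This is only a scalar simulation under stated assumptions, not SVE code and not the contents of userland/arch/aarch64/sve.S. The names `vector_len` and `vla_add` are hypothetical, and the value 4 merely stands in for whatever length the hardware would report at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware vector length, in elements.
 * Real SVE code would obtain this at run time; 4 is an arbitrary
 * assumption made only for this simulation. */
static size_t vector_len(void) {
    return 4;
}

/* Adds b[i] into a[i] for i in [0, n), processing one simulated
 * "vector" of vector_len() lanes per outer iteration. */
static void vla_add(int *a, const int *b, size_t n) {
    size_t vl = vector_len();
    assert(vl > 0); /* a zero length would make the loop spin forever */

    /* The index advances by vl, whatever vl turns out to be. */
    for (size_t i = 0; i < n; i += vl) {
        for (size_t lane = 0; lane < vl; lane++) {
            /* Simulated predicate bit: lanes at or beyond n are
             * inactive, so the tail of the last vector is skipped. */
            int active = (i + lane) < n;
            if (active) {
                a[i + lane] += b[i + lane];
            }
        }
    }
}
```

With n = 6 and a simulated length of 4, the second outer iteration has only lanes 0 and 1 active, so the elements at indices 6 and 7 are never touched. Changing the value returned by `vector_len` leaves the results for indices below n unchanged, which is the point of the vector-length-agnostic style.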