sve: improve didactics

Ciro Santilli 六四事件 法轮功
2019-08-28 00:00:02 +00:00
parent d6d7f15c91
commit fe3e97f70c


@@ -15386,17 +15386,22 @@ There are analogous LD3 and LD4 instruction.
==== ARM SVE
Scalable Vector Extension.
Example: link:userland/arch/aarch64/sve.S[]
Scalable Vector Extension.
To understand it, the first thing you have to look at is the execution example at Fig 1 of: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf
aarch64 only, newer than <<arm-neon>>.
It is called Scalable because it does not specify the vector width! Therefore we don't have to worry about new vector width instructions every few years! Hurray!
The instructions then allow implicitly tracking the loop index without knowing the actual vector length.
The instructions then allow:
Added to QEMU user mode in 3.0.0.
* incrementing loop index by the vector length without explicitly hardcoding it
* on the last loop iteration, the leftover elements that do not fill a whole vector get automatically masked out by the predicate register, and have no effect
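
The two points above can be sketched with a plain C model of a vector-length-agnostic loop. This is not SVE code and it is not taken from link:userland/arch/aarch64/sve.S[]: the names `model_whilelo`, `model_vla_add` and `MODEL_MAX_LANES` are hypothetical, and the `lanes` parameter stands in for the hardware vector length, which real SVE code never hardcodes. The predicate computation is loosely modeled on the `WHILELO` instruction, and the loop step on the `INCW` / `INCD` family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical upper bound on lanes per vector, needed only by this model. */
#define MODEL_MAX_LANES 64

/* Loosely models WHILELO: lane k is active if and only if i + k < n. */
static void model_whilelo(bool *pred, size_t lanes, size_t i, size_t n) {
    for (size_t k = 0; k < lanes; k++)
        pred[k] = (i + k < n);
}

/* Computes z[i] = x[i] + y[i] for all i < n, `lanes` elements per iteration.
 * Returns the number of loop iterations that were needed. */
size_t model_vla_add(const int *x, const int *y, int *z, size_t n, size_t lanes) {
    assert(lanes > 0 && lanes <= MODEL_MAX_LANES);
    bool pred[MODEL_MAX_LANES];
    size_t iterations = 0;
    /* The index advances by the vector length, like INCW / INCD would do. */
    for (size_t i = 0; i < n; i += lanes) {
        model_whilelo(pred, lanes, i, n);
        for (size_t k = 0; k < lanes; k++) {
            /* Inactive lanes neither load nor store, so the tail is safe. */
            if (pred[k])
                z[i + k] = x[i + k] + y[i + k];
        }
        iterations++;
    }
    return iterations;
}
```

Calling `model_vla_add` on 5 elements with `lanes` set to 4 takes two iterations: the first handles 4 elements, and the second handles the single leftover element while the predicate disables the other 3 lanes. The same call with `lanes` set to 2 takes three iterations and produces the same output, which is the sense in which the loop does not depend on the vector length.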
Added to QEMU in 3.0.0 and gem5 in 2019 Q3.
TODO announcement date. Possibly 2017: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf There is also a 2016 mention: https://community.arm.com/tools/hpc/b/hpc/posts/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture
@@ -15411,7 +15416,6 @@ Using SVE normally requires setting the CPACR_EL1.FPEN and ZEN bits, which as as
===== SVE bibliography
* https://www.rico.cat/files/ICS18-gem5-sve-tutorial.pdf step-by-step walkthrough of a complete code execution example, the best initial tutorial so far
* https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf paper with a few nice concrete examples, illustrations and rationale
* https://static.docs.arm.com/dui0965/c/DUI0965C_scalable_vector_extension_guide.pdf
* https://developer.arm.com/products/software-development-tools/hpc/documentation/writing-inline-sve-assembly quick inlining guide