mirror of
https://github.com/cirosantilli/linux-kernel-module-cheat.git
synced 2026-01-25 19:21:35 +01:00
assembly: improve organization of simd add
46
README.adoc
@@ -11690,13 +11690,13 @@ ____
 Much like ADD for non-SIMD, start learning SIMD instructions by looking at the integer and floating point SIMD ADD instructions of each ISA:
 
 * x86
-** link:userland/arch/x86_64/addpd.S[]: `ADDPS`, `ADDPD`
-** link:userland/arch/x86_64/paddq.S[]: `PADDQ`, `PADDL`, `PADDW`, `PADDB`
+** <<x86-addpd-instruction>>
+** <<x86-paddq-instruction>>
 * arm
-** link:userland/arch/arm/vadd.S[]
+** <<arm-vadd-instruction>>
 * aarch64
-** link:userland/arch/aarch64/add_vector.S[]
-** link:userland/arch/aarch64/fadd_vector.S[]
+** <<armv8-aarch64-add-vector-instruction>>
+** <<armv8-aarch64-fadd-vector-instruction>>
 
 Then it is just a huge copy paste of infinite boring details:
 
@@ -12009,6 +12009,20 @@ History:
 * AVX2:2013
 * AVX-512: 2016. 512-bit ZMM registers. Extension of YMM.
 
+===== x86 SSE2
+
+====== x86 addpd instruction
+
+link:userland/arch/x86_64/addpd.S[]: `addps`, `addpd`
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
+====== x86 paddq instruction
+
+link:userland/arch/x86_64/paddq.S[]: `paddq`, `paddl`, `paddw`, `paddb`
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
 === rdtsc
 
 TODO: review this section, make a more controlled userland experiment with <<m5ops>> instrumentation.
@@ -12594,11 +12608,11 @@ Bibliography: https://stackoverflow.com/questions/1875491/nop-for-iphone-binarie
 
 ==== ARM fadd vs vadd
 
-It is very confusing, but `fadds` and `faddd` in Aarch32 are <<gnu-gas-assembler-arm-unified-syntax,pre-UAL>> for `vadd.f32` and `vadd.f64`.
+It is very confusing, but `fadds` and `faddd` in Aarch32 are <<gnu-gas-assembler-arm-unified-syntax,pre-UAL>> for `vadd.f32` and `vadd.f64` which we use in this tutorial: <<arm-vadd-instruction>>
 
 The same goes for most ARMv7 mnemonics: `f*` is old, and `v*` is the newer better syntax.
 
-But then, in ARMv8, they decided to use `fadd` as the main floating point add name, and get rid of `vadd`!
+But then, in ARMv8, they decided to use <<armv8-aarch64-fadd-vector-instruction>> as the main floating point add name, and get rid of `vadd`!
 
 Also keep in mind that fused multiply add is `fmadd`.
 
@@ -12606,6 +12620,24 @@ Examples at: <<simd-assembly>>
 
 ==== ARM SIMD instructions
 
+===== ARM vadd instruction
+
+link:userland/arch/arm/vadd.S[]
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
+===== ARMv8 aarch64 add vector instruction
+
+link:userland/arch/aarch64/add_vector.S[]
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
+===== ARMv8 aarch64 fadd vector instruction
+
+link:userland/arch/aarch64/fadd_vector.S[]
+
+Good first instruction to learn SIMD: <<simd-assembly>>
+
 ===== ARM vcvt instruction
 
 Example: link:userland/arch/arm/vcvt.S[]
@@ -1,4 +1,4 @@
-/* https://github.com/cirosantilli/linux-kernel-module-cheat#simd-assembly
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#armv8-aarch64-add-vector-instruction
 *
 * Add a bunch of integers in one go.
 */
@@ -1,4 +1,4 @@
-/* https://github.com/cirosantilli/linux-kernel-module-cheat#simd-assembly
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#armv8-aarch64-fadd-vector-instruction
 *
 * Add a bunch of floating point numbers in one go.
 */
@@ -1,4 +1,4 @@
-/* https://github.com/cirosantilli/linux-kernel-module-cheat#arm-simd-instruction-assembly */
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#arm-vadd-instruction */
 
 #include "common.h"
 
@@ -1,4 +1,4 @@
-/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-simd
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-addpq-instruction
 *
 * Add a bunch of floating point numbers in one go.
 */
@@ -1,4 +1,4 @@
-/* https://github.com/cirosantilli/linux-kernel-module-cheat#simd-assembly
+/* https://github.com/cirosantilli/linux-kernel-module-cheat#x86-paddq-instruction
 *
 * Add a bunch of integers in one go.
 *