History log of /openssl/crypto/chacha/asm/chacha-armv8-sve.pl (Results 1 – 2 of 2)
Revision Date Author Comments
# bcb52bcc 25-May-2022 Daniel Hu

Optimize chacha20 on aarch64 by SVE2

This patch improves existing chacha20 SVE patch by using SVE2,
which is an optional architecture feature of aarch64, with XAR
instruction that ca

Optimize chacha20 on aarch64 by SVE2

This patch improves existing chacha20 SVE patch by using SVE2,
which is an optional architecture feature of aarch64, with XAR
instruction that can improve the performance of chacha20.

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18522)

show more ...


# b1b2146d 07-Feb-2022 Daniel Hu

Acceleration of chacha20 on aarch64 by SVE

This patch accelerates chacha20 on aarch64 when Scalable Vector Extension
(SVE) is supported by CPU. Tested on modern micro-architecture with

Acceleration of chacha20 on aarch64 by SVE

This patch accelerates chacha20 on aarch64 when Scalable Vector Extension
(SVE) is supported by CPU. Tested on modern micro-architecture with
256-bit SVE, it has the potential to improve performance up to 20%

The solution takes a hybrid approach. SVE will handle multi-blocks that fit
the SVE vector length, with Neon/Scalar to process any tail data

Test result:
With SVE
type 1024 bytes 8192 bytes 16384 bytes
ChaCha20 1596208.13k 1650010.79k 1653151.06k

Without SVE (by Neon/Scalar)
type 1024 bytes 8192 bytes 16384 bytes
chacha20 1355487.91k 1372678.83k 1372662.44k

The assembly code has been reviewed internally by
ARM engineer Fangming.Fang@arm.com

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17916)

show more ...