History log of /openssl/include/crypto/sm4_platform.h (Results 1 – 10 of 10)
Revision Date Author Comments
# 7ed6de99 05-Sep-2024 Tomas Mraz

Copyright year updates


Reviewed-by: Neil Horman <nhorman@openssl.org>
Release: yes


# 6cf42ad3 24-May-2024 Hongren Zheng

riscv: Fix cpuid_obj asm checks for sm4/sm3

Similar to #22881 / #23752

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
(Merged fro

riscv: Fix cpuid_obj asm checks for sm4/sm3

Similar to #22881 / #23752

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
(Merged from https://github.com/openssl/openssl/pull/24486)

show more ...


# 7543bb3a 18-Jan-2023 Christoph Müllner

riscv: SM4: Provide a Zvksed-based implementation

The upcoming RISC-V vector crypto extensions feature
a Zvksed extension, that provides SM4-specific instructions.
This patch provide

riscv: SM4: Provide a Zvksed-based implementation

The upcoming RISC-V vector crypto extensions feature
a Zvksed extension, that provides SM4-specific instructions.
This patch provides an implementation that utilizes this
extension if available.

Tested on QEMU and no regressions observed.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)

show more ...


# 636ee1d0 07-Aug-2023 Evgeny Karpov

* Enable extra Arm64 optimization on Windows for GHASH, RAND and AES

Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https:/

* Enable extra Arm64 optimization on Windows for GHASH, RAND and AES

Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21673)

show more ...


# da1c088f 07-Sep-2023 Matt Caswell

Copyright year updates


Reviewed-by: Richard Levitte <levitte@openssl.org>
Release: yes


# 09cb8718 27-Mar-2023 Tom Cosgrove

SM4 check should be for __aarch64__, not __ARM_MAX_ARCH__ >= 8

(And then __arm__ and __arm tests are redundant)

Fixes #20604

Change-Id: I4308e75b7fbf3be7b46490c3ea4125e2d91

SM4 check should be for __aarch64__, not __ARM_MAX_ARCH__ >= 8

(And then __arm__ and __arm tests are redundant)

Fixes #20604

Change-Id: I4308e75b7fbf3be7b46490c3ea4125e2d91b00b8

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20620)

show more ...


# c007203b 18-Jan-2023 Xu Yizhou

SM4 AESE optimization for ARMv8

Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merge

SM4 AESE optimization for ARMv8

Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19914)

show more ...


# 88c53cf1 31-Oct-2022 Xu Yizhou

Apply SM4 optimization patch to Kunpeng-920

In the ideal scenario, performance can reach up to 2.2X.
But in single block input or CFB/OFB mode, CBC encryption,
performance could drop

Apply SM4 optimization patch to Kunpeng-920

In the ideal scenario, performance can reach up to 2.2X.
But in single block input or CFB/OFB mode, CBC encryption,
performance could drop about 50%.

Perf data on Kunpeng-920 2.6GHz hardware, before and after optimization:

Before:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 75318.96k 79089.62k 79736.15k 79934.12k 80325.44k 80068.61k
SM4-ECB 80211.39k 84998.36k 86472.28k 87024.93k 87144.80k 86862.51k
SM4-GCM 72156.19k 82012.08k 83848.02k 84322.65k 85103.65k 84896.43k
SM4-CBC 77956.13k 80638.81k 81976.17k 81606.31k 82078.91k 81750.70k
SM4-CFB 78078.20k 81054.87k 81841.07k 82396.38k 82203.99k 82236.76k
SM4-OFB 78282.76k 82074.03k 82765.74k 82989.06k 83200.68k 83487.17k

After:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 35678.07k 120687.25k 176632.27k 177192.62k 177586.18k 178295.18k
SM4-ECB 35540.32k 122628.07k 175067.90k 178007.84k 178298.88k 178328.92k
SM4-GCM 34215.75k 116720.50k 170275.16k 171770.88k 172714.21k 172272.30k
SM4-CBC 35645.60k 36544.86k 36515.50k 36732.15k 36618.24k 36629.16k
SM4-CFB 35528.14k 35690.99k 35954.86k 35843.42k 35809.18k 35809.96k
SM4-OFB 35563.55k 35853.56k 35963.05k 36203.52k 36233.85k 36307.82k

Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19547)

show more ...


# 4908787f 14-Feb-2022 Daniel Hu

SM4 optimization for ARM by ASIMD

This patch optimizes SM4 for ARM processor using ASIMD instruction

It will improve performance if both of following conditions are met:
1) Inpu

SM4 optimization for ARM by ASIMD

This patch optimizes SM4 for ARM processor using ASIMD instruction

It will improve performance if both of following conditions are met:
1) Input data equal to or more than 4 blocks
2) Cipher mode allows parallelism, including ECB,CTR,GCM or CBC decryption

This patch implements SM4 SBOX lookup in vector registers, with the
benefit of constant processing time over existing C implementation.

It is only enabled for micro-architecture N1/V1. In the ideal scenario,
performance can reach up to 2.7X

When either of above two conditions is not met, e.g. single block input
or CFB/OFB mode, CBC encryption, performance could drop about 50%.

The assembly code has been reviewed internally by ARM engineer
Fangming.Fang@arm.com

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17951)

show more ...


# 15b7175f 19-Oct-2021 Daniel Hu

SM4 optimization for ARM by HW instruction

This patch implements the SM4 optimization for ARM processor,
using SM4 HW instruction, which is an optional feature of
crypto extension fo

SM4 optimization for ARM by HW instruction

This patch implements the SM4 optimization for ARM processor,
using SM4 HW instruction, which is an optional feature of
crypto extension for aarch64 V8.

Tested on some modern ARM micro-architectures with SM4 support, the
performance uplift can be observed around 8X~40X over existing
C implementation in openssl. Algorithms that can be parallelized
(like CTR, ECB, CBC decryption) are on higher end, with algorithm
like CBC encryption on lower end (due to inter-block dependency)

Perf data on Yitian-710 2.75GHz hardware, before and after optimization:

Before:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 105787.80k 107837.87k 108380.84k 108462.08k 108549.46k 108554.92k
SM4-ECB 111924.58k 118173.76k 119776.00k 120093.70k 120264.02k 120274.94k
SM4-CBC 106428.09k 109190.98k 109674.33k 109774.51k 109827.41k 109827.41k

After (7.4x - 36.6x faster):
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 781979.02k 2432994.28k 3437753.86k 3834177.88k 3963715.58k 3974556.33k
SM4-ECB 937590.69k 2941689.02k 3945751.81k 4328655.87k 4459181.40k 4468692.31k
SM4-CBC 890639.88k 1027746.58k 1050621.78k 1056696.66k 1058613.93k 1058701.31k

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17455)

show more ...