History log of /openssl/crypto/x86_64cpuid.pl (Results 1 – 25 of 55)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: openssl-3.0.0-alpha17, openssl-3.0.0-alpha16, openssl-3.0.0-alpha15, openssl-3.0.0-alpha14
# 3c2bdd7d 08-Apr-2021 Matt Caswell

Update copyright year

Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14801)


Revision tags: OpenSSL_1_1_1k, openssl-3.0.0-alpha13, openssl-3.0.0-alpha12, OpenSSL_1_1_1j, openssl-3.0.0-alpha11, openssl-3.0.0-alpha10
# c781eb1c 08-Dec-2020 Andrey Matyukov

Dual 1024-bit exponentiation optimization for Intel IceLake CPU
with AVX512_IFMA + AVX512_VL instructions, primarily for RSA CRT private key
operations. It uses 256-bit registers to avoid CPU

Dual 1024-bit exponentiation optimization for Intel IceLake CPU
with AVX512_IFMA + AVX512_VL instructions, primarily for RSA CRT private key
operations. It uses 256-bit registers to avoid CPU frequency scaling issues.
The performance speedup for RSA2k signature on ICL is ~2x.

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13750)

show more ...


Revision tags: OpenSSL_1_1_1i, openssl-3.0.0-alpha9, openssl-3.0.0-alpha8, openssl-3.0.0-alpha7, OpenSSL_1_1_1h, openssl-3.0.0-alpha6, openssl-3.0.0-alpha5, openssl-3.0.0-alpha4, openssl-3.0.0-alpha3, openssl-3.0.0-alpha2, openssl-3.0.0-alpha1
# 33388b44 23-Apr-2020 Matt Caswell

Update copyright year

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/11616)


Revision tags: OpenSSL_1_1_1g, OpenSSL_1_1_1f, OpenSSL_1_1_1e
# a21314db 17-Feb-2020 David Benjamin

Also check for errors in x86_64-xlate.pl.

In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
partic

Also check for errors in x86_64-xlate.pl.

In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
particularly need it, but I only noticed x86_64-xlate.pl, so
arm-xlate.pl and ppc-xlate.pl got the change.

That seems to have been fine, so be consistent and also apply the change
to x86_64-xlate.pl. Checking for errors is generally a good idea.

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/10930)

show more ...


# 98ad3fe8 31-Jan-2020 H.J. Lu

x86_64: Add endbranch at function entries for Intel CET

To support Intel CET, all indirect branch targets must start with
endbranch. Here is a patch to add endbranch to function entries

x86_64: Add endbranch at function entries for Intel CET

To support Intel CET, all indirect branch targets must start with
endbranch. Here is a patch to add endbranch to function entries
in x86_64 assembly codes which are indirect branch targets as
discovered by running openssl testsuite on Intel CET machine and
visual inspection.

Verified with

$ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86_64 -fcf-protection
$ make
$ make test

and

$ CC="gcc -mx32 -Wl,-z,cet-report=error" ./Configure shared linux-x32 -fcf-protection
$ make
$ make test # <<< passed with https://github.com/openssl/openssl/pull/10988

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10982)

show more ...


# 32be631c 17-Jan-2020 David Benjamin

Do not silently truncate files on perlasm errors

If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however

Do not silently truncate files on perlasm errors

If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however
far the driver got. This will hopefully fail to build, but better to
check such things.

Handle this by checking for errors when closing STDOUT (which is a pipe
to the xlate driver).

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10883)

show more ...


Revision tags: OpenSSL_1_0_2u
# 8913378a 17-Dec-2019 Bernd Edlinger

Fix unwind info for some trivial functions

While stack unwinding works with gdb here, the
function _Unwind_Backtrace gives up when something outside
.cfi_startproc/.cfi_endproc is fo

Fix unwind info for some trivial functions

While stack unwinding works with gdb here, the
function _Unwind_Backtrace gives up when something outside
.cfi_startproc/.cfi_endproc is found in the call stack, like
OPENSSL_cleanse, OPENSSL_atomic_add, OPENSSL_rdtsc, CRYPTO_memcmp
and other trivial functions which don't save anything in the stack.

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/10635)

show more ...


# 1aa89a7a 12-Sep-2019 Richard Levitte

Unify all assembler file generators

They now generally conform to the following argument sequence:

script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \

Unify all assembler file generators

They now generally conform to the following argument sequence:

script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \
$(PROCESSOR) <output file>

However, in the spirit of being able to use these scripts manually,
they also allow for no argument, or for only the flavour, or for only
the output file. This is done by only using the last argument as
output file if it's a file (it has an extension), and only using the
first argument as flavour if it isn't a file (it doesn't have an
extension).

While we're at it, we make all $xlate calls the same, i.e. the $output
argument is always quoted, and we always die on error when trying to
start $xlate.

There's a perl lesson in this, regarding operator priority...

This will always succeed, even when it fails:

open FOO, "something" || die "ERR: $!";

The reason is that '||' has higher priority than list operators (a
function is essentially a list operator and gobbles up everything
following it that isn't lower priority), and since a non-empty string
is always true, so that ends up being exactly the same as:

open FOO, "something";

This, however, will fail if "something" can't be opened:

open FOO, "something" or die "ERR: $!";

The reason is that 'or' has lower priority that list operators,
i.e. it's performed after the 'open' call.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)

show more ...


Revision tags: OpenSSL_1_0_2t, OpenSSL_1_1_0l, OpenSSL_1_1_1d, OpenSSL_1_1_1c, OpenSSL_1_1_0k, OpenSSL_1_0_2s, OpenSSL_1_0_2r, OpenSSL_1_1_1b
# 0e9725bc 06-Dec-2018 Richard Levitte

Following the license change, modify the boilerplates in crypto/

[skip ci]

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/7827)


Revision tags: OpenSSL_1_0_2q, OpenSSL_1_1_0j, OpenSSL_1_1_1a, OpenSSL_1_1_1, OpenSSL_1_1_1-pre9, OpenSSL_1_0_2p, OpenSSL_1_1_0i, OpenSSL_1_1_1-pre8, OpenSSL_1_1_1-pre7
# 9a708bf9 20-May-2018 Andy Polyakov

{arm64|x86_64}cpuid.pl: add special 16-byte case to OPENSSL_memcmp.

OPENSSL_memcmp is a must in GCM decrypt and general-purpose loop takes
quite a portion of execution time for short inp

{arm64|x86_64}cpuid.pl: add special 16-byte case to OPENSSL_memcmp.

OPENSSL_memcmp is a must in GCM decrypt and general-purpose loop takes
quite a portion of execution time for short inputs, more than GHASH for
few-byte inputs according to profiler. Special 16-byte case takes it off
top five list in profiler output.

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6312)

show more ...


Revision tags: OpenSSL_1_1_1-pre6, OpenSSL_1_1_1-pre5, OpenSSL_1_1_1-pre4, OpenSSL_1_0_2o, OpenSSL_1_1_0h, OpenSSL_1_1_1-pre3
# 082193ef 07-Mar-2018 Bryan Donlan

Fix issues in ia32 RDRAND asm leading to reduced entropy

This patch fixes two issues in the ia32 RDRAND assembly code that result in a
(possibly significant) loss of entropy.

Th

Fix issues in ia32 RDRAND asm leading to reduced entropy

This patch fixes two issues in the ia32 RDRAND assembly code that result in a
(possibly significant) loss of entropy.

The first, less significant, issue is that, by returning success as 0 from
OPENSSL_ia32_rdrand() and OPENSSL_ia32_rdseed(), a subtle bias was introduced.
Specifically, because the assembly routine copied the remaining number of
retries over the result when RDRAND/RDSEED returned 'successful but zero', a
bias towards values 1-8 (primarily 8) was introduced.

The second, more worrying issue was that, due to a mixup in registers, when a
buffer that was not size 0 or 1 mod 8 was passed to OPENSSL_ia32_rdrand_bytes
or OPENSSL_ia32_rdseed_bytes, the last (n mod 8) bytes were all the same value.
This issue impacts only the 64-bit variant of the assembly.

This change fixes both issues by first eliminating the only use of
OPENSSL_ia32_rdrand, replacing it with OPENSSL_ia32_rdrand_bytes, and fixes the
register mixup in OPENSSL_ia32_rdrand_bytes. It also adds a sanity test for
OPENSSL_ia32_rdrand_bytes and OPENSSL_ia32_rdseed_bytes to help catch problems
of this nature in the future.

Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5342)

show more ...


Revision tags: OpenSSL_1_1_1-pre2, OpenSSL_1_1_1-pre1, OpenSSL_1_0_2n
# 79337628 04-Dec-2017 Andy Polyakov

crypto/x86_64cpuid.pl: suppress AVX512F flag on Skylake-X.

It was observed that AVX512 code paths can negatively affect overall
Skylake-X system performance. But we are talking specifica

crypto/x86_64cpuid.pl: suppress AVX512F flag on Skylake-X.

It was observed that AVX512 code paths can negatively affect overall
Skylake-X system performance. But we are talking specifically about
512-bit code, while AVX512VL, 256-bit variant of AVX512F instructions,
is supposed to fly as smooth as AVX2. Which is why it remains unmasked.

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4838)

show more ...


# 88ac224c 22-Nov-2017 Andy Polyakov

crypto/x86_64cpuid.pl: fix AVX512 capability masking.

Originally it was thought that it's possible to use AVX512VL+BW
instructions with XMM and YMM registers without kernel enabling

crypto/x86_64cpuid.pl: fix AVX512 capability masking.

Originally it was thought that it's possible to use AVX512VL+BW
instructions with XMM and YMM registers without kernel enabling
ZMM support, but it turned to be wrong assumption.

Reviewed-by: Rich Salz <rsalz@openssl.org>

show more ...


# d6ee8f3d 05-Nov-2017 Andy Polyakov

OPENSSL_ia32cap: reserve for new extensions.

Reviewed-by: Rich Salz <rsalz@openssl.org>


Revision tags: OpenSSL_1_0_2m, OpenSSL_1_1_0g
# d67e7554 26-Jul-2017 David Benjamin

Fix comment typo.

Reviewed-by: Ben Kaduk <kaduk@mit.edu>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4023)


# d84df594 24-Jul-2017 Andy Polyakov

crypto/x86_64cpuid.pl: fix typo in Knights Landing detection.

Thanks to David Benjamin for spotting this!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.

crypto/x86_64cpuid.pl: fix typo in Knights Landing detection.

Thanks to David Benjamin for spotting this!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4009)

show more ...


# 64d92d74 20-Jul-2017 Andy Polyakov

x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.

"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capabi

x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.

"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

AVX-512 results cover even Skylake-X :-)

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>

show more ...


Revision tags: OpenSSL_1_0_2l, OpenSSL_1_1_0f, OpenSSL-fips-2_0_16
# 1aed5e1a 12-Mar-2017 Andy Polyakov

crypto/x86*cpuid.pl: move extended feature detection.

Exteneded feature flags were not pulled on AMD processors, as result
a number of extensions were effectively masked on Ryzen. Origin

crypto/x86*cpuid.pl: move extended feature detection.

Exteneded feature flags were not pulled on AMD processors, as result
a number of extensions were effectively masked on Ryzen. Original fix
for x86_64cpuid.pl addressed this problem, but messed up processor
vendor detection. This fix moves extended feature detection past
basic feature detection where it belongs. 32-bit counterpart is
harmonized too.

Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>

show more ...


# f8418d87 05-Mar-2017 Andy Polyakov

crypto/x86_64cpuid.pl: move extended feature detection upwards.

Exteneded feature flags were not pulled on AMD processors, as result a
number of extensions were effectively masked on Ryz

crypto/x86_64cpuid.pl: move extended feature detection upwards.

Exteneded feature flags were not pulled on AMD processors, as result a
number of extensions were effectively masked on Ryzen. It should have
been reported for Excavator since it implements AVX2 extension, but
apparently nobody noticed or cared...

Reviewed-by: Rich Salz <rsalz@openssl.org>

show more ...


# 5e32cfb2 25-Feb-2017 Andy Polyakov

crypto/x86_64cpuid.pl: add CFI annotations.

Reviewed-by: Rich Salz <rsalz@openssl.org>


Revision tags: OpenSSL_1_1_0e
# 66bee01c 27-Jan-2017 Andy Polyakov

crypto/x86_64cpuid.pl: detect if kernel preserves %zmm registers.

Reviewed-by: Rich Salz <rsalz@openssl.org>


Revision tags: OpenSSL_1_0_2k, OpenSSL_1_1_0d, OpenSSL-fips-2_0_15, OpenSSL-fips-2_0_14, OpenSSL_1_1_0c, OpenSSL_1_0_2j, OpenSSL_1_1_0b, OpenSSL_1_0_1u, OpenSSL_1_0_2i, OpenSSL_1_1_0a, OpenSSL_1_1_0, OpenSSL_1_1_0-pre6
# 9c940446 10-Jul-2016 Andy Polyakov

crypto/x86[_64]cpuid.pl: add OPENSSL_ia32_rd[rand|seed]_bytes.

Reviewed-by: Richard Levitte <levitte@openssl.org>


Revision tags: OpenSSL-fips-2_0_13
# cfe1d992 28-May-2016 Andy Polyakov

x86_64 assembly pack: tolerate spaces in source directory name.

[as it is now quoting $output is not required, but done just in case]

Reviewed-by: Richard Levitte <levitte@openssl.o

x86_64 assembly pack: tolerate spaces in source directory name.

[as it is now quoting $output is not required, but done just in case]

Reviewed-by: Richard Levitte <levitte@openssl.org>

show more ...


# e33826f0 15-May-2016 Andy Polyakov

Add assembly CRYPTO_memcmp.

GH: #102

Reviewed-by: Richard Levitte <levitte@openssl.org>


Revision tags: OpenSSL_1_0_1t, OpenSSL_1_0_2h
# e0a65194 20-Apr-2016 Rich Salz

Copyright consolidation: perl files

Add copyright to most .pl files
This does NOT cover any .pl file that has other copyright in it.
Most of those are Andy's but some are public doma

Copyright consolidation: perl files

Add copyright to most .pl files
This does NOT cover any .pl file that has other copyright in it.
Most of those are Andy's but some are public domain.
Fix typo's in some existing files.

Reviewed-by: Richard Levitte <levitte@openssl.org>

show more ...


123