[RFC PATCH 0/6] Add AVX2 accelerated implementations for Blowfish, Twofish, Serpent and Camellia

From: Jussi Kivilinna
Date: Sat Apr 13 2013 - 06:46:40 EST


The following series implements four block ciphers - Blowfish, Twofish, Serpent
and Camellia - using the AVX2 instruction set. Work on these AVX2
implementations started over a year ago and has been available at
https://github.com/jkivilin/crypto-avx2

The Serpent and Camellia implementations are directly based on the word-sliced
and byte-sliced AVX implementations and have been extended to use the 256-bit
YMM registers. As such the performance should be better than with the 128-bit
wide AVX implementations. (The Camellia implementation needs some extra
handling for AES-NI, as the AES instructions have remained only 128 bits wide.)
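
One way to picture the extra handling for a 128-bit-only instruction is to
split each 256-bit value into its two 128-bit halves, run the operation on
each half, and recombine. The following is only a scalar C sketch of that
idea, not code from the patches and not necessarily how the assembler does
it; vec256, apply_op128_to_256() and demo_op128() are names made up for this
example, and demo_op128() is a stand-in, not an AES round.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of a 256-bit YMM register as 32 bytes (two 128-bit halves). */
typedef struct { uint8_t b[32]; } vec256;

/* An operation that is only available in 128-bit form. */
typedef void (*op128_fn)(uint8_t half[16]);

/* Hypothetical helper: run a 128-bit-only operation on both halves. */
static void apply_op128_to_256(vec256 *v, op128_fn op)
{
	uint8_t lo[16], hi[16];

	assert(v != NULL && op != NULL);

	memcpy(lo, v->b, 16);       /* extract the low 128 bits */
	memcpy(hi, v->b + 16, 16);  /* extract the high 128 bits */

	op(lo);
	op(hi);

	memcpy(v->b, lo, 16);       /* put both halves back */
	memcpy(v->b + 16, hi, 16);
}

/* Stand-in for a 128-bit-only instruction (assumption for the demo). */
static void demo_op128(uint8_t half[16])
{
	for (int i = 0; i < 16; i++)
		half[i] ^= 0x5a;
}
```

Calling apply_op128_to_256(&v, demo_op128) transforms all 32 bytes of v, but
the operation itself only ever sees 16 bytes at a time.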

The Blowfish and Twofish implementations utilize the new vpgatherdd
instruction to perform vectorized 8x32-bit table look-ups, that is, eight
32-bit look-ups at once. This is different from the previous word-sliced AVX
implementations, where table look-ups have to be performed through general
purpose registers. The AVX2 implementations thus avoid additional moving of
data between the SIMD and general purpose registers and therefore should be
faster.
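
As a rough illustration of what such a gather-based look-up does, here is a
scalar C model of an unmasked eight-lane 32-bit gather, used for a
Blowfish-style S-box look-up (one byte of each 32-bit word indexes a
256-entry table). This is a sketch for explanation only, not code from the
patches; gather8_u32() and sbox_lookup8() are hypothetical names, and the
real implementations do this in a single instruction in assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* a 256-bit YMM register holds eight 32-bit elements */

/*
 * Scalar model of an unmasked 32-bit gather: every lane independently
 * loads table[idx[lane]]. Hypothetical helper, not kernel code.
 */
static void gather8_u32(uint32_t out[LANES], const uint32_t *table,
			size_t table_len, const uint32_t idx[LANES])
{
	for (int lane = 0; lane < LANES; lane++) {
		assert(idx[lane] < table_len);
		out[lane] = table[idx[lane]];
	}
}

/*
 * Blowfish-style use: take one byte of each 32-bit word (selected by
 * 'shift') as the index into a 256-entry S-box, for eight words at once.
 */
static void sbox_lookup8(uint32_t out[LANES], const uint32_t sbox[256],
			 const uint32_t x[LANES], unsigned int shift)
{
	uint32_t idx[LANES];

	for (int lane = 0; lane < LANES; lane++)
		idx[lane] = (x[lane] >> shift) & 0xff;

	gather8_u32(out, sbox, 256, idx);
}
```

In the word-sliced AVX versions, each of these eight indexes would first have
to be moved from a vector register into a general purpose register before
the load; with a gather, the indexes stay in the vector register.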

For obvious reasons, I have not tested these implementations on real hardware.
The kernel tcrypt tests have been run under Bochs, which should contain a
somewhat working AVX2 implementation. But I cannot be sure: even the Intel SDE
emulator that I used for testing these implementations did not quite follow the
specs (a past version of SDE that I initially used allowed the vector registers
given to vgather to be the same, whereas the specs say that an exception should
be raised in that case). Because of this, the first versions of the patchset in
the above repository are broken.
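
The register rule in question can be written down as a small check. This is
only an illustrative C sketch; vgather_regs_valid() is a made-up helper, and
it assumes the 16 YMM registers (ymm0-ymm15) available in 64-bit mode. It
models the documented requirement that the destination, index and mask
registers of a gather must all be different, otherwise the instruction
faults.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the three vector register numbers
 * used by a gather instruction are pairwise distinct. Using the same
 * register for any two of them makes the instruction raise an exception
 * on conforming hardware.
 */
static bool vgather_regs_valid(unsigned int dest, unsigned int index,
			       unsigned int mask)
{
	/* ymm0..ymm15 in 64-bit mode (assumption of this sketch) */
	assert(dest < 16 && index < 16 && mask < 16);

	return dest != index && dest != mask && index != mask;
}
```

For example, vgather_regs_valid(0, 1, 2) is true, while
vgather_regs_valid(1, 1, 2) is false. Code that only ever ran on an emulator
accepting the second form would fail on hardware that follows the specs.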

So since I'm unable to verify that these implementations work on real hardware
and am unable to conduct a real performance evaluation, I'm sending this
patchset as an RFC. Maybe someone can actually test these on real hardware and
maybe give an Acked-by in case these look ok(?). If that is not possible, I'll
do the testing myself when Haswell processors become available where I live.

-Jussi

---

Jussi Kivilinna (6):
crypto: testmgr - extend camellia test-vectors for camellia-aesni/avx2
crypto: tcrypt - add async cipher speed tests for blowfish
crypto: blowfish - add AVX2/x86_64 implementation of blowfish cipher
crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher
crypto: serpent - add AVX2/x86_64 assembler implementation of serpent cipher
crypto: camellia - add AVX2/AES-NI/x86_64 assembler implementation of camellia cipher


arch/x86/crypto/Makefile | 17
arch/x86/crypto/blowfish-avx2-asm_64.S | 449 +++++++++
arch/x86/crypto/blowfish_avx2_glue.c | 585 +++++++++++
arch/x86/crypto/blowfish_glue.c | 32 -
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 1368 ++++++++++++++++++++++++++
arch/x86/crypto/camellia_aesni_avx2_glue.c | 586 +++++++++++
arch/x86/crypto/camellia_aesni_avx_glue.c | 17
arch/x86/crypto/glue_helper-asm-avx2.S | 180 +++
arch/x86/crypto/serpent-avx2-asm_64.S | 800 +++++++++++++++
arch/x86/crypto/serpent_avx2_glue.c | 562 +++++++++++
arch/x86/crypto/serpent_avx_glue.c | 62 +
arch/x86/crypto/twofish-avx2-asm_64.S | 600 +++++++++++
arch/x86/crypto/twofish_avx2_glue.c | 584 +++++++++++
arch/x86/crypto/twofish_avx_glue.c | 14
arch/x86/include/asm/cpufeature.h | 1
arch/x86/include/asm/crypto/blowfish.h | 43 +
arch/x86/include/asm/crypto/camellia.h | 19
arch/x86/include/asm/crypto/serpent-avx.h | 24
arch/x86/include/asm/crypto/twofish.h | 18
crypto/Kconfig | 88 ++
crypto/tcrypt.c | 15
crypto/testmgr.c | 51 +
crypto/testmgr.h | 1100 ++++++++++++++++++++-
23 files changed, 7128 insertions(+), 87 deletions(-)
create mode 100644 arch/x86/crypto/blowfish-avx2-asm_64.S
create mode 100644 arch/x86/crypto/blowfish_avx2_glue.c
create mode 100644 arch/x86/crypto/camellia-aesni-avx2-asm_64.S
create mode 100644 arch/x86/crypto/camellia_aesni_avx2_glue.c
create mode 100644 arch/x86/crypto/glue_helper-asm-avx2.S
create mode 100644 arch/x86/crypto/serpent-avx2-asm_64.S
create mode 100644 arch/x86/crypto/serpent_avx2_glue.c
create mode 100644 arch/x86/crypto/twofish-avx2-asm_64.S
create mode 100644 arch/x86/crypto/twofish_avx2_glue.c
create mode 100644 arch/x86/include/asm/crypto/blowfish.h
