Re: [PATCH 2/3] [CRYPTO] Add optimized SHA-1 implementation for i486+

From: Benjamin Gilbert
Date: Mon Jun 11 2007 - 15:18:19 EST

linux@xxxxxxxxxxx wrote:
/* Majority: (x&y)|(y&z)|(z&x) = (x & z) + ((x ^ z) & y) */
#define F3(x,y,z,dest) \
movl z, TMP; \
andl x, TMP; \
addl TMP, dest; \
movl z, TMP; \
xorl x, TMP; \
andl y, TMP; \
addl TMP, dest

Since y is the most recently computed result (it's rotated in the
previous round), I arranged the code to delay its use as late as possible.

Now you have one more register to play with.
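
For anyone who wants to convince themselves of the identity in the comment above, here is a small C check. It is only a sketch: the function names (maj_ref, maj_add, maj_forms_agree) are made up for illustration and are not from the patch. The additive form works because (x & z) and ((x ^ z) & y) can never both have the same bit set, so the add produces no carries and behaves like an OR.

```c
#include <stdint.h>

/* Textbook majority: a bit is set where at least two of x, y, z have it set. */
static uint32_t maj_ref(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) | (y & z) | (z & x);
}

/* Additive form from the quoted F3 macro (hypothetical C rendering).
 * x & z needs x and z equal to 1; x ^ z needs them to differ, so the
 * two terms are bitwise disjoint and "+" cannot carry. y is used last. */
static uint32_t maj_add(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & z) + ((x ^ z) & y);
}

/* Returns 1 if both forms agree on all 8 combinations of input bits,
 * each combination replicated across the whole 32-bit word. */
static int maj_forms_agree(void)
{
    for (unsigned bits = 0; bits < 8; bits++) {
        uint32_t x = (bits & 1) ? 0xFFFFFFFFu : 0u;
        uint32_t y = (bits & 2) ? 0xFFFFFFFFu : 0u;
        uint32_t z = (bits & 4) ? 0xFFFFFFFFu : 0u;
        if (maj_ref(x, y, z) != maj_add(x, y, z))
            return 0;
    }
    return 1;
}
```

Since the function is purely bitwise and carry-free, checking the 8 single-bit cases covers every 32-bit input.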

Okay, thanks. It doesn't actually give one more register except in the F3 rounds (TMP2 is normally used to hold the magic constants) but it's a good cleanup.

A faster way is to unroll 5 iterations and do:
e += F(b, c, d) + K + rol32(a, 5) + W[i ]; b = rol32(b, 30);
d += F(a, b, c) + K + rol32(e, 5) + W[i+1]; a = rol32(a, 30);
c += F(e, a, b) + K + rol32(d, 5) + W[i+2]; e = rol32(e, 30);
b += F(d, e, a) + K + rol32(c, 5) + W[i+3]; d = rol32(d, 30);
a += F(c, d, e) + K + rol32(b, 5) + W[i+4]; c = rol32(c, 30);
then loop over that 4 times each. This is somewhat larger, but
still reasonably compact; only 20 of the 80 rounds are written out

I got this code from Nettle, originally, and I never looked at the SHA-1 round structure very closely. I'll give that approach a try.
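
As a rough illustration of the five-round unrolling quoted above, here is a plain C sketch of one SHA-1 block transform. This is not the patch's code or Nettle's: the names (FIVE_ROUNDS, sha1_block_unrolled, f_ch, f_par, f_maj) are hypothetical, and the message schedule, round constants and round functions are the standard SHA-1 ones, filled in by me. The point is only that renaming the variables each round removes the register shuffle, and that after five rounds the names line up again, so each group of 20 rounds becomes a loop of 4.

```c
#include <stdint.h>

static uint32_t rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));   /* n is always 1, 5 or 30 here */
}

/* Standard SHA-1 round functions (names are illustrative). */
static uint32_t f_ch(uint32_t x, uint32_t y, uint32_t z)  { return z ^ (x & (y ^ z)); }
static uint32_t f_par(uint32_t x, uint32_t y, uint32_t z) { return x ^ y ^ z; }
static uint32_t f_maj(uint32_t x, uint32_t y, uint32_t z) { return (x & z) + ((x ^ z) & y); }

/* Five rounds with the roles of a..e rotated by renaming instead of by
 * moving data; after the fifth round every variable is back in place. */
#define FIVE_ROUNDS(F, K, i) do { \
    e += F(b, c, d) + (K) + rol32(a, 5) + W[(i)];     b = rol32(b, 30); \
    d += F(a, b, c) + (K) + rol32(e, 5) + W[(i) + 1]; a = rol32(a, 30); \
    c += F(e, a, b) + (K) + rol32(d, 5) + W[(i) + 2]; e = rol32(e, 30); \
    b += F(d, e, a) + (K) + rol32(c, 5) + W[(i) + 3]; d = rol32(d, 30); \
    a += F(c, d, e) + (K) + rol32(b, 5) + W[(i) + 4]; c = rol32(c, 30); \
} while (0)

/* Process one 64-byte block, updating the five-word state h in place. */
static void sha1_block_unrolled(uint32_t h[5], const uint8_t block[64])
{
    uint32_t W[80];
    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    int i;

    /* Load 16 big-endian words, then expand to the full 80-word schedule. */
    for (i = 0; i < 16; i++) {
        const uint8_t *p = block + 4 * i;
        W[i] = (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
               (uint32_t)p[2] << 8 | p[3];
    }
    for (; i < 80; i++)
        W[i] = rol32(W[i - 3] ^ W[i - 8] ^ W[i - 14] ^ W[i - 16], 1);

    /* 4 loops of 4 iterations of 5 rounds: only 20 rounds are written out. */
    for (i = 0; i < 20; i += 5) FIVE_ROUNDS(f_ch,  0x5A827999u, i);
    for (; i < 40; i += 5)      FIVE_ROUNDS(f_par, 0x6ED9EBA1u, i);
    for (; i < 60; i += 5)      FIVE_ROUNDS(f_maj, 0x8F1BBCDCu, i);
    for (; i < 80; i += 5)      FIVE_ROUNDS(f_par, 0xCA62C1D6u, i);

    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}
```

Precomputing all 80 words of W keeps the sketch short; an assembly version would more likely keep a 16-word circular buffer, which is a separate design choice from the round unrolling shown here.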

--Benjamin Gilbert