Re: [RFC] ASLA design, depth of code review and lack thereof

From: viro
Date: Mon Jun 07 2004 - 17:43:37 EST


On Mon, Jun 07, 2004 at 03:14:06PM +0200, Takashi Iwai wrote:
> > and return a pointer to a (8-element) row in array of patterns. Because
> > callers end up truncating the result and then filling a large area with
> > repeated copies. All we get from use of u_int64_t is extra PITA with
> > endianness - memcpy from 1/2/4/8 element array is no less efficient than
> > assignment from u8/u16/u32/u64.
>
> Is it true? If gcc really optmizes well like this, yes, surely we can
> use memcpy for simplicity.

__builtin_memcpy() is definitely smart enough for that: e.g. on x86 (and you
don't get much more register-starved than that)
void b(char *);
void a(char *x, int count)
{
char buf[8];
int i;
b(buf);
for (i = 0; i < count; i++) {
__builtin_memcpy(x, buf, 8);
x += 8;
}
}
will result (with -O2, which is normal for kernel) in
.L6:
movl %ecx, (%ebx)
movl %edx, 4(%ebx)
addl $8, %ebx
decl %eax
jne .L6
as the main loop, which gives you what you would get from use of u64.
And yes, constant-sized memcpy() in the kernel will be expanded to
__builtin_memcpy().

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/