Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 currenttask

From: Linus Torvalds
Date: Thu Nov 29 2007 - 14:10:29 EST

On Thu, 29 Nov 2007, H. Peter Anvin wrote:
> Linus Torvalds wrote:
> >
> > > It is advantageous for user space to use the register the kernel typically
> > > won't, in order to speed up system call entry/exit.
> >
> > but I'm not seeing the reason for that one. Care to comment more? (Yes,
> > there is often a latency from segment reload to use, but the reload latency
> > for system call exit *should* be entirely covered by the cost of doing the
> > system call return itself, no?)
>
> I do seem to recall that some processor implementations can load a NULL
> segment faster than a non-NULL segment. This was significant enough that we
> wanted to use %fs in x86-64 userspace, as opposed to the original ABI which
> used %gs both in userspace and in the kernel.

Ahh, I think you may be right for some CPUs. The zero selector is indeed
potentially faster to load, since it doesn't have to even bother looking
at the GDT/LDT.
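
To make the "doesn't have to look at the GDT/LDT" point concrete, here is a
minimal sketch of how a 16-bit x86 selector splits into its fields. The names
(selector_fields, decode_selector, selector_is_null) are made up for
illustration and are not kernel code. A selector with index 0 and the
table-indicator bit clear (values 0 through 3) is a null selector, so no
descriptor has to be fetched for it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector (illustrative names only). */
struct selector_fields {
    unsigned index; /* bits 15..3: index into the descriptor table */
    unsigned ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Split a raw selector value into index, table indicator and RPL. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1;
    f.rpl = sel & 3;
    return f;
}

/*
 * A null selector has index 0 and TI = 0; the RPL bits may be anything,
 * so the values 0..3 all qualify. Loading one needs no GDT/LDT lookup.
 */
static bool selector_is_null(uint16_t sel)
{
    return (sel & ~3u) == 0;
}
```

For example, selector_is_null(0) is true, while an example non-zero value
such as 0x2b decodes to index 5, TI 0, RPL 3 and would need a GDT lookup.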

That said, I doubt it's very noticeable. I just ran tests on both an old
P4 and on a more modern Core 2 machine, and for both of those the
performance was identical between loading a NULL selector and loading a
non-zero one.

I could well imagine that it makes a difference of a few cycles on other
CPUs. But from my testing, it definitely isn't noticeable, and I think the
maintenance advantage of using the same segment setup would more than make
up for the fact that maybe some odd CPU can see a difference.

Linus