Re: [RFC 00/15] x86_64: Optimize percpu accesses

From: H. Peter Anvin
Date: Thu Jul 10 2008 - 11:49:27 EST


Christoph Lameter wrote:

> Well the zero based results in this becoming always
>
> gs_base + absolute address in per cpu segment

You can do it either way. For RIP-based, you have to worry about the possible range of the RIP register when referencing. Currently, even for "make allyesconfig" the per cpu segment is a lot smaller than the minimum value for CONFIG_PHYSICAL_START (2 MB), so there is no issue, but there is a distinct lack of wiggle room, which can be resolved either by using negative offsets, or by moving the kernel text area up a bit from -2 GB.
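To make the range constraint concrete, here is a rough sketch of the arithmetic. It assumes the kernel text mapping starts at 0xffffffff80000000 (-2 GB) plus the CONFIG_PHYSICAL_START offset, and that per cpu symbols are zero-based offsets. The helper names are made up for illustration and are not kernel code.

```python
# Sketch: can a zero-based per cpu offset be reached from kernel text
# through a signed 32-bit RIP-relative displacement?
# Assumption: kernel text is mapped at -2 GB + CONFIG_PHYSICAL_START.

KERNEL_MAP_BASE = 0xFFFFFFFF80000000   # -2 GB as an unsigned 64-bit value
PHYSICAL_START_MIN = 2 * 1024 * 1024   # 2 MB, the minimum named above


def to_signed64(value):
    """Interpret a value modulo 2**64 as a signed 64-bit integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


def rip_displacement(target, next_rip):
    """Displacement needed so that next_rip + disp == target (mod 2**64)."""
    return to_signed64(target - next_rip)


def reachable(target, next_rip):
    """True if the displacement fits in a signed 32-bit field."""
    return -(1 << 31) <= rip_displacement(target, next_rip) <= (1 << 31) - 1


# Worst case is the lowest text address: it is closest to the -2 GB floor.
lowest_text = KERNEL_MAP_BASE + PHYSICAL_START_MIN

for offset in (0x1000, 0x1FFFFF, 0x200000):
    print(hex(offset), reachable(offset, lowest_text))
```

Under these assumptions an offset is reachable from the lowest text address only while it stays below the text offset itself (2 MB here), which is the lack of wiggle room described above. Negative offsets or a higher text base both widen that window.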

> Why are RIP based references cheaper? The offset to the per cpu segment is certainly more than what can be fit into 16 bits.

Where are you getting 16 bits from?!?! *There are no 16-bit offsets in 64-bit mode, period, full stop.*

RIP-based references are cheaper because the x86-64 architects chose to optimize RIP-based references over absolute references. Therefore RIP-based references are encodable with only a ModR/M byte, whereas absolute references require a SIB byte as well -- longer instruction, possibly a less optimized path through the CPU, and *definitely* something that gets exercised less in the linker.
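As an illustration of the size difference, the sketch below hand-assembles a 32-bit load into %eax in both forms, optionally with the %gs segment override prefix (0x65). The byte values follow the standard 64-bit mode encoding of opcode 8B /r. The function names are hypothetical, and this is not an assembler, just the two encodings side by side.

```python
import struct

GS_PREFIX = b"\x65"  # %gs segment override prefix


def mov_eax_rip_relative(disp32, gs=False):
    """Encode: mov disp32(%rip), %eax.

    ModR/M = 0x05 (mod=00, reg=000 for %eax, rm=101), which in 64-bit
    mode means RIP + disp32. No SIB byte is needed.
    """
    prefix = GS_PREFIX if gs else b""
    return prefix + bytes([0x8B, 0x05]) + struct.pack("<i", disp32)


def mov_eax_absolute(addr32, gs=False):
    """Encode: mov addr32, %eax with a sign-extended 32-bit absolute address.

    ModR/M = 0x04 (rm=100 selects a SIB byte), followed by
    SIB = 0x25 (no index, no base, disp32 only).
    """
    prefix = GS_PREFIX if gs else b""
    return prefix + bytes([0x8B, 0x04, 0x25]) + struct.pack("<i", addr32)


rip_form = mov_eax_rip_relative(0x1000, gs=True)
abs_form = mov_eax_absolute(0x1000, gs=True)
print(len(rip_form), rip_form.hex())
print(len(abs_form), abs_form.hex())
```

The absolute form comes out one byte longer because of the SIB byte; the displacement field is 32 bits in both cases.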

-hpa