Re: [PATCH 01/11] x86: entry_64.S: always allocate complete "struct pt_regs"

From: Denys Vlasenko
Date: Wed Feb 11 2015 - 16:56:01 EST


On 02/11/2015 09:30 PM, Andy Lutomirski wrote:
> On Wed, Jan 14, 2015 at 1:48 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> 64-bit code was using six stack slots less by not saving/restoring
>> registers which are callee-preserved according to C ABI,
>> and not allocating space for them.
>> Only when syscall needed a complete "struct pt_regs",
>> the complete area was allocated and filled in.
>> As an additional twist, on interrupt entry a "slightly less truncated pt_regs"
>> trick is used, to make nested interrupt stacks easier to unwind.
>>
>> This proved to be a source of significant obfuscation and subtle bugs.
>> For example, stub_fork had to pop the return address,
>> extend the struct, save registers, and push return address back. Ugly.
>> ia32_ptregs_common pops return address and "returns" via jmp insn,
>> throwing a wrench into CPU return stack cache.
>>
>> This patch changes code to always allocate a complete "struct pt_regs".
>> The saving of registers is still done lazily.
>>
>> "Partial pt_regs" trick on interrupt stack is retained, but with added comments
>> which explain what we are doing, and why. Existing comments are improved.
>>
>> Macros which manipulate "struct pt_regs" on stack are reworked:
>> ALLOC_PT_GPREGS_ON_STACK allocates the structure.
>> SAVE_C_REGS saves to it those registers which are clobbered by C code.
>> SAVE_EXTRA_REGS saves to it all other registers.
>> Corresponding RESTORE_* and REMOVE_PT_GPREGS_FROM_STACK macros reverse it.
>>
>> ia32_ptregs_common, stub_fork and friends lost their ugly dance with
>> return pointer.
>>
>> LOAD_ARGS32 in ia32entry.S now uses symbolic stack offsets
>> instead of magic numbers.
>>
>> error_entry and save_paranoid now use SAVE_C_REGS + SAVE_EXTRA_REGS
>> instead of having it open-coded yet again.
>>
>> Misleading and slightly wrong comments in "struct pt_regs" are fixed
>> (four instances).
>>
>> Patch was run-tested: 64-bit executables, 32-bit executables,
>> strace works.
>> Timing tests did not show measurable difference in 32-bit
>> and 64-bit syscalls.
>
> I like this. However, can you split it? I think it would be easier
> to review and more bisect-friendly if there were first a patch to do
> all the macro and comment cleanup without any layout changes followed
> by bite-sized layout changes.

Changes in macro names are intentionally made in a single patch,
together with significant changes to their meanings.
Renaming macros makes sure I reviewed all uses - otherwise
(if I missed a macro use somewhere) it would just not compile.

LOAD_ARGS32 change to use R9 etc instead of magic numbers like 16
also can't be split up. The point is, 16 is not equal to R9 (which is 64),
it only becomes equal because LOAD_ARGS32 is invoked with offset=ARGOFFSET,
and ARGOFFSET+16 = R9. The patch sets ARGOFSSET to 0, thus only after this
patch, I can replace magic number with R9. But if I'd do that,
then _this_ patch would still need to replace one magic number (16) -
with another magic number, 64, to not break stuff.
It makes sense, then, to not do the change there twice, in two patches,
and instead just bite the bullet and replace 16 with R9 in this patch.

In a few minutes I'll send you two patches which resulted from splitting
comment changes into a separate patch.

--
vda

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/