Re: [PATCH v2 07/36] x86/entry/64/compat: After SYSENTER, move STI after the NT fixup

From: Andy Lutomirski
Date: Wed Oct 07 2015 - 15:02:47 EST


On Wed, Oct 7, 2015 at 10:39 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> On 10/06/2015 02:47 AM, Andy Lutomirski wrote:
>> We eventually want to make it all the way into C code before
>> enabling interrupts. We need to rework our flags handling slightly
>> to delay enabling interrupts.
>>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> ---
>> arch/x86/entry/entry_64_compat.S | 30 ++++++++++++++++++++++--------
>> 1 file changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
>> index aa76864a8a6b..1432d60a1f4a 100644
>> --- a/arch/x86/entry/entry_64_compat.S
>> +++ b/arch/x86/entry/entry_64_compat.S
>> @@ -58,14 +58,9 @@ ENDPROC(native_usergs_sysret32)
>> * with the int 0x80 path.
>> */
>> ENTRY(entry_SYSENTER_compat)
>> - /*
>> - * Interrupts are off on entry.
>> - * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
>> - * it is too small to ever cause noticeable irq latency.
>> - */
>> + /* Interrupts are off on entry. */
>> SWAPGS_UNSAFE_STACK
>> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>> - ENABLE_INTERRUPTS(CLBR_NONE)
>>
>> /* Zero-extending 32-bit regs, do not remove */
>> movl %ebp, %ebp
>> @@ -76,7 +71,16 @@ ENTRY(entry_SYSENTER_compat)
>> /* Construct struct pt_regs on stack */
>> pushq $__USER32_DS /* pt_regs->ss */
>> pushq %rbp /* pt_regs->sp */
>> - pushfq /* pt_regs->flags */
>> +
>> + /*
>> + * Push flags. This is nasty. First, interrupts are currently
>> + * off, but we need pt_regs->flags to have IF set. Second, even
>> + * if TF was set when SYSENTER started, it's clear by now. We fix
>> + * that later using TIF_SINGLESTEP.
>> + */
>> + pushfq /* pt_regs->flags (except IF = 0) */
>> + orl $X86_EFLAGS_IF, (%rsp) /* Fix saved flags */
>
> The sequence of "push + insn_using_rsp" is a bit slow
> on most CPUs because stack engine (the machinery which makes
> consecutive pushes fast) needs syncronizing with register file.
>
> It may be better to move the ORL insn here:
>
> push, push, push
> cld
> sub $(10*8), %rsp /* pt_regs->r8-11, bp, bx, r12-15 not saved */
> + orl $X86_EFLAGS_IF, EFLAGS(%rsp) /* Fix saved flags to have .IF = 1 */
>
> where we already eat that penalty.
>
>

I'll benchmark this and, if it's a win, I'll tack it on to the end of
the series.

--
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/