Re: [PATCH] x86/arch_prctl/64: restore accidentally removed put_cpu in ARCH_SET_GS

From: Andy Lutomirski
Date: Wed May 11 2016 - 20:00:49 EST


On May 11, 2016 1:35 PM, "Mateusz Guzik" <mguzik@xxxxxxxxxx> wrote:
>
> On Tue, May 10, 2016 at 01:58:24PM -0700, Andy Lutomirski wrote:
> > On Tue, May 10, 2016 at 1:56 PM, Mateusz Guzik <mguzik@xxxxxxxxxx> wrote:
> > > This fixes 731e33e39a5b95ad770 "Remove FSBASE/GSBASE < 4G optimization"
> >
> > Indeed. How did that survive lockdep?
> >
>
> lockdep_sys_exit only checks actual locks.
>
> In the common path, after a syscall handler returns, interrupts are
> disabled unconditionally (without first checking that they are enabled).
> The preemption count is not checked in the fast path at all; it is only
> checked elsewhere, as a side effect of calls to e.g. schedule().
>
> How about a hack along these lines (note I don't claim this is
> committable as it is, but it should work):
>
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index ec138e5..5887bc7 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -303,6 +303,24 @@ static void syscall_slow_exit_work(struct pt_regs *regs, u32 cached_flags)
>  		tracehook_report_syscall_exit(regs, step);
>  }
>
> +#ifdef CONFIG_PROVE_LOCKING
> +/*
> + * Called after syscall handlers return.
> + */
> +__visible void syscall_assert_exit(struct pt_regs *regs)
> +{
> +	if (in_atomic() || irqs_disabled()) {
> +		printk(KERN_ERR "invalid state on exit from syscall %ld: "
> +			"in_atomic(): %d, irqs_disabled(): %d, pid: %d, "
> +			"name: %s\n", regs->orig_ax, in_atomic(),
> +			irqs_disabled(), current->pid, current->comm);
> +	}
> +
> +	if (irqs_disabled())
> +		local_irq_enable();
> +}
> +#endif
> +
>  /*
>   * Called with IRQs on and fully valid regs. Returns with IRQs off in a
>   * state such that we can immediately switch to user mode.
> @@ -314,9 +332,7 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
>
>  	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
>
> -	if (IS_ENABLED(CONFIG_PROVE_LOCKING) &&
> -	    WARN(irqs_disabled(), "syscall %ld left IRQs disabled", regs->orig_ax))
> -		local_irq_enable();
> +	syscall_assert_exit(regs);
>
>  	/*
>  	 * First do one-time work. If these work items are enabled, we
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 9ee0da1..6c5cc23 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -210,6 +210,12 @@ entry_SYSCALL_64_fastpath:
>  	movq %rax, RAX(%rsp)
>  1:
>
> +#ifdef CONFIG_PROVE_LOCKING
> +	/*
> +	 * We want to validate a bunch of stuff, which will clobber registers.
> +	 */
> +	jmp 2f
> +#endif
>  	/*
>  	 * If we get here, then we know that pt_regs is clean for SYSRET64.
>  	 * If we see that no exit work is required (which we are required
> @@ -236,6 +242,7 @@ entry_SYSCALL_64_fastpath:
>  	 */
>  	TRACE_IRQS_ON
>  	ENABLE_INTERRUPTS(CLBR_NONE)
> +2:
>  	SAVE_EXTRA_REGS
>  	movq %rsp, %rdi
>  	call syscall_return_slowpath /* returns with IRQs disabled */

It would be nice to do this in a cross-arch way. Maybe we could
extend lockdep_sys_exit? Ingo, do you think that would be reasonable?
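
To make the idea concrete, here is a rough userspace model of what a
generic exit check might cover. This is only a sketch, not kernel code:
struct exit_state, check_sys_exit() and sys_exit_fixup() are hypothetical
names invented for illustration, and the three fields merely stand in for
the held-lock check that lockdep_sys_exit already does plus the
in_atomic() and irqs_disabled() tests from the hack above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-task state checked on syscall exit. */
struct exit_state {
	int locks_held;      /* models the held-lock check lockdep already does */
	int preempt_count;   /* models what in_atomic() would look at */
	bool irqs_disabled;  /* models irqs_disabled() */
};

/* One bit per kind of leaked state, so several can be reported at once. */
enum exit_violation {
	EXIT_OK            = 0,
	EXIT_LOCKS_HELD    = 1 << 0,
	EXIT_IN_ATOMIC     = 1 << 1,
	EXIT_IRQS_DISABLED = 1 << 2,
};

/* Return a bitmask of everything that is wrong; never modifies the state. */
static unsigned int check_sys_exit(const struct exit_state *st)
{
	unsigned int bad = EXIT_OK;

	assert(st != NULL);

	if (st->locks_held > 0)
		bad |= EXIT_LOCKS_HELD;
	if (st->preempt_count != 0)	/* e.g. get_cpu() without put_cpu() */
		bad |= EXIT_IN_ATOMIC;
	if (st->irqs_disabled)
		bad |= EXIT_IRQS_DISABLED;

	return bad;
}

/*
 * Report any violation and, as in the hack above, re-enable interrupts
 * so the caller can continue. Other leaked state is reported but left
 * untouched in this model.
 */
static unsigned int sys_exit_fixup(struct exit_state *st, long nr)
{
	unsigned int bad = check_sys_exit(st);

	if (bad)
		fprintf(stderr,
			"invalid state on exit from syscall %ld: "
			"locks_held: %d, preempt_count: %d, irqs_disabled: %d\n",
			nr, st->locks_held, st->preempt_count,
			(int)st->irqs_disabled);

	if (bad & EXIT_IRQS_DISABLED)
		st->irqs_disabled = false;

	return bad;
}
```

In this model, the missing put_cpu() corresponds to a state with
preempt_count == 1 and nothing else wrong, so check_sys_exit() would
return EXIT_IN_ATOMIC. A check limited to held locks would not flag it.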

--Andy

>
> --
> Mateusz Guzik