Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually twiddling with cr3

From: Peter Zijlstra
Date: Mon Aug 21 2017 - 06:34:20 EST


On Thu, Aug 17, 2017 at 08:52:38AM -0700, Andy Lutomirski wrote:
> On Thu, Aug 17, 2017 at 3:35 AM, Will Deacon <will.deacon@xxxxxxx> wrote:

> > I'm still concerned that we're treating perf specially here -- are we
> > absolutely sure that nobody else is going to attempt user accesses off the
> > back of an interrupt?
>
> Reasonably sure? If nothing else, an interrupt taken while mmap_sem()
> is held for write that tries to access user memory is asking for
> serious trouble. There are still a few callers of pagefault_disable()
> and copy...inatomic(), though.

I'm not immediately seeing how holding mmap_sem for writing is a
problem.

> > If not, then I'd much prefer a solution that catches
> > anybody doing that with the EFI page table installed, rather than trying
> > to play whack-a-mole like this.
>
> Using a kernel thread solves the problem for real. Anything that
> blindly accesses user memory in kernel thread context is terminally
> broken no matter what.

So perf-callchain doesn't do it 'blindly', it wants either:

- user_mode(regs) true, or
- task_pt_regs() set.

However I'm thinking that if the kernel thread has ->mm == &efi_mm, the
EFI code running could very well have user_mode(regs) being true.

intel_pmu_pebs_fixup() OTOH 'blindly' assumes that the LBR addresses are
accessible. It bails on error though. So while its careful, it does
attempt to access the 'user' mapping directly. Which should also trigger
with the EFI code.

And I'm not seeing anything particularly broken with either. The PEBS
fixup relies on the CPU having just executed the code, and if it could
fetch and execute the code, why shouldn't it be able to fetch and read?
(eXecute implies Read assumed). And like said, it if triggers a fault,
it bails, no worries.

It really doesn't care if the task is a kernel thread or not. Same for
the unwinder, if we get an interrupt register set that points into
'userspace' we try and unwind it.