Re: [PATCH 02/23] x86, kaiser: do not set _PAGE_USER for init_mm page tables

From: Thomas Gleixner
Date: Thu Nov 02 2017 - 07:33:45 EST


On Thu, 2 Nov 2017, Andy Lutomirski wrote:
> On Wed, Nov 1, 2017 at 3:20 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > On Wed, 1 Nov 2017, Linus Torvalds wrote:
> >> On Wed, Nov 1, 2017 at 2:52 PM, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
> >> > On 11/01/2017 02:28 PM, Thomas Gleixner wrote:
> >> >> On Wed, 1 Nov 2017, Andy Lutomirski wrote:
> >> >>> The vsyscall page is _PAGE_USER and lives in init_mm via the fixmap.
> >> >>
> >> >> Groan, forgot about that abomination, but still there is no point in having
> >> >> it marked PAGE_USER in the init_mm at all, kaiser or not.
> >> >
> >> > So shouldn't this patch effectively make the vsyscall page unusable?
> >> > Any idea why that didn't show up in any of the x86 selftests?
> >>
> >> I actually think there may be two issues here:
> >>
> >> - vsyscall isn't even used much - if at all - any more
> >
> > Only legacy user space uses it.
> >
> >> - the vsyscall emulation works fine without _PAGE_USER, since the
> >> whole point is that we take a fault on it and then emulate.
> >>
> >> We do expose the vsyscall page read-only to user space in the
> >> emulation case, but I'm not convinced that's even required.
> >
> > I don't see a reason why it needs to be mapped at all for emulation.
>
> At least a couple years ago, the maintainers of some userspace tracing
> tools complained very loudly at the early versions of the patches.
> There are programs like pin (semi-open-source IIRC) that parse
> instructions, make an instrumented copy, and run it. This means that
> the vsyscall page needs to contain text that is semantically
> equivalent to what calling it actually does.
>
> So yes, read access needs to work. I should add a selftest for this.
>
> This is needed in emulation mode as well as native mode, so removing
> native mode is totally orthogonal.

Fair enough. I enabled function tracing with emulate_vsyscall as the filter
on a couple of machines and so far I have not seen a single hit. I did find
a VM with a really old user space (~2005), though, and that one actually
used it.
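
For reference, the tracing setup mentioned above can be sketched roughly as
follows. This is a minimal sketch, not the exact commands that were run: it
assumes tracefs is mounted at /sys/kernel/tracing (older setups expose it
under /sys/kernel/debug/tracing) and that it runs as root. The helper names
trace_function and read_hits and the tracefs parameter are made up for
illustration.

```python
import os

def trace_function(func_name, tracefs="/sys/kernel/tracing"):
    """Restrict the function tracer to func_name and switch it on."""
    def write(name, value):
        # Each tracefs control file takes a short text value.
        with open(os.path.join(tracefs, name), "w") as f:
            f.write(value + "\n")

    write("set_ftrace_filter", func_name)  # trace only this function
    write("current_tracer", "function")    # select the function tracer
    write("tracing_on", "1")               # make sure the ring buffer records

def read_hits(func_name, tracefs="/sys/kernel/tracing"):
    """Return trace lines mentioning func_name, skipping '#' header lines."""
    with open(os.path.join(tracefs, "trace")) as f:
        return [line.rstrip("\n") for line in f
                if not line.startswith("#") and func_name in line]
```

Calling trace_function("emulate_vsyscall") and later read_hits("emulate_vsyscall")
gives an empty list on a machine where nothing goes through the emulation
path while tracing is active.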

So for the problem at hand, I'd suggest we disable the vsyscall stuff if
CONFIG_KAISER=y and be done with it.
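
Whether the page is still visible to a process on a given kernel can be
checked from user space. The following is a small sketch, assuming the usual
/proc/<pid>/maps line format in which the vsyscall page shows up with the
name [vsyscall]; the function name vsyscall_mapping is hypothetical.

```python
def vsyscall_mapping(maps_text):
    """Return (address_range, perms) of the [vsyscall] entry, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Format: range perms offset dev inode [name]
        if fields and fields[-1] == "[vsyscall]":
            return fields[0], fields[1]
    return None
```

Feeding it the contents of /proc/self/maps returns None when no vsyscall
page is exposed to the process, and the address range plus permission
string otherwise.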

Thanks,

tglx