Re: [PATCH] x86: Use entire page for the per-cpu GDT only if paravirt-enabled

From: Ingo Molnar
Date: Tue Sep 29 2015 - 05:01:29 EST

Next message: Benjamin Tissoires: "Re: [PATCH] HID: multitouch: Fetch feature reports on demand for Win8 devices"
Previous message: Peter Zijlstra: "Re: [v4.2+ regression] fd7a4bed sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks"
In reply to: Denys Vlasenko: "Re: [PATCH] x86: Use entire page for the per-cpu GDT only if paravirt-enabled"
Next in thread: Andy Lutomirski: "Re: [PATCH] x86: Use entire page for the per-cpu GDT only if paravirt-enabled"

* Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:

> On 09/28/2015 09:58 AM, Ingo Molnar wrote:
> >
> > * Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> >
> >> On 09/26/2015 09:50 PM, H. Peter Anvin wrote:
> >>> NAK. We really should map the GDT read-only on all 64 bit systems,
> >>> since we can't hide the address from SGDT. Same with the IDT.
> >>
> >> Sorry, I don't understand your point.
> >
> > So the problem is that right now the SGDT instruction (which is unprivileged)
> > leaks the real address of the kernel image:
> >
> > fomalhaut:~> ./sgdt
> > SGDT: ffff88303fd89000 / 007f
> >
> > that 'ffff88303fd89000' is a kernel address.
>
> Thank you.
> I do know that SGDT and friends are unprivileged on x86
> and thus they allow userspace (and guest kernels in paravirt)
> to learn things they don't need to know.
>
> I don't see how making GDT page-aligned and page-sized
> changes anything in this regard. SGDT will still work,
> and still leak GDT address.

Well, as I tried to explain in the other part of my mail, doing so enables us to
remap the GDT to a less security-sensitive virtual address that does not leak the
kernel's randomized address:

> > Your observation in the changelog and your patch:
> >
> >>>> It is page-sized because of paravirt. [...]
> >
> > ... conflicts with the intention to mark (remap) the primary GDT address read-only
> > on native kernels as well.
> >
> > So what we should do instead is to use the page alignment properly and remap the
> > GDT to a read-only location, and load that one.
>
> If we'd have a small GDT (i.e. what my patch does), we still can remap the
> entire page which contains small GDT, and simply don't care that some other data
> is also visible through that RO page.

That's generally considered fragile: suppose an attacker has a limited information
leak that can read absolute addresses with system privilege, but does not know
the kernel's randomized base offset. With a 'partial page' mapping there could be
function pointers near the GDT, on the same page the GDT happens to be on, that
leak this information.

(The same goes for crypto keys or other critical information, such as canaries
or salts, accidentally ending up nearby.)

Arguably it's a bit tenuous, but when playing remapping games it's generally
considered good to be page aligned and page sized, with zero padding.
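
To put rough numbers on the 'partial page' concern: the SGDT output quoted
earlier shows a limit of 0x007f, i.e. a 128-byte table, so with 4 KiB pages
most of the page would hold something other than the GDT unless it is padded.
The following is only a sketch of that arithmetic in C. It assumes 4 KiB pages,
and the helper names are hypothetical, not kernel APIs:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on a typical x86-64 configuration. */
#define SKETCH_PAGE_SIZE 4096u

/* A descriptor-table limit is "size in bytes minus one". */
static inline uint32_t table_size(uint16_t limit)
{
    return (uint32_t)limit + 1;
}

/* Offset of the table's first byte within its page (0 means page aligned). */
static inline uint32_t page_offset(uint64_t base)
{
    return (uint32_t)(base & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Number of bytes on the page(s) spanned by the table that do not belong
 * to the table itself. If the table is not padded to a full page, these
 * bytes may hold unrelated data that becomes visible through a remapping
 * of the whole page.
 */
static inline uint32_t bytes_shared_with_table(uint64_t base, uint16_t limit)
{
    const uint64_t mask = ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
    uint64_t start = base & mask;
    uint64_t end = (base + table_size(limit) + SKETCH_PAGE_SIZE - 1) & mask;

    return (uint32_t)(end - start) - table_size(limit);
}
```

With the values from the quoted output, bytes_shared_with_table(0xffff88303fd89000,
0x007f) comes out as 3968: the table is page aligned but uses only 128 of the
4096 bytes. A page-sized, zero-padded GDT makes those 3968 bytes padding rather
than potentially interesting data.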

> > This would have a couple of advantages:
> >
> > - This would give kernel address randomization more teeth on x86.
> >
> > - An additional advantage would be that rootkits overwriting the GDT would have
> > a bit more work to do.
> >
> > - A third advantage would be that for NUMA systems we could 'mirror' the GDT into
> > node-local memory and load those. This makes GDT load cache-misses a bit less
> > expensive.
>
> GDT is per-cpu. Isn't per-cpu memory already NUMA-local?

Indeed it is:

fomalhaut:~> for ((cpu=1; cpu<9; cpu++)); do taskset $cpu ./sgdt ; done
SGDT: ffff88103fa09000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa49000 / 007f
SGDT: ffff88103fa29000 / 007f
SGDT: ffff88103fa69000 / 007f

I confused it with the IDT, which is still global.

This also means that the GDT in itself does not leak the kernel's randomized
image address at the moment; it only leaks the layout of the percpu area.
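
The source of the ./sgdt tool used above is not part of this thread. As a rough
sketch only, such a tool could look like the following on x86-64 with GCC or
Clang. The struct and function names are hypothetical. Note that on CPUs with
UMIP enabled the instruction may fault or return a dummy value in user space,
so output like the above is not guaranteed:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * In 64-bit mode SGDT stores a 10-byte pseudo-descriptor:
 * a 2-byte limit followed by an 8-byte base address.
 */
struct gdtr {
    uint16_t limit;
    uint64_t base;
} __attribute__((packed));

/* Format a value in the same "SGDT: base / limit" style as the output above. */
static inline int format_gdtr(char *buf, size_t len, const struct gdtr *g)
{
    return snprintf(buf, len, "SGDT: %016llx / %04x",
                    (unsigned long long)g->base, (unsigned)g->limit);
}

#if defined(__x86_64__)
/* Execute SGDT on the current CPU; only built on x86-64. */
static inline struct gdtr read_gdtr(void)
{
    struct gdtr g = { 0, 0 };

    __asm__ volatile("sgdt %0" : "=m"(g));
    return g;
}
#endif
```

A main() would call read_gdtr(), pass the result to format_gdtr() and print the
buffer. Pinning the process to one CPU at a time, as the taskset loop above
does, then shows one GDT address per CPU.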

So my suggestion would be to:

- make the GDT unconditionally page aligned and page sized, then remap it to a
read-only address unconditionally as well, as we do for the IDT.

- make the IDT per CPU as well, for performance reasons.

Thanks,

Ingo