Re: [PATCH RFC] x86: check for and defend against BIOS memory corruption

From: Jeremy Fitzhardinge
Date: Mon Sep 08 2008 - 13:17:22 EST

Hugh Dickins wrote:
> On Fri, 29 Aug 2008, Jeremy Fitzhardinge wrote:
>> Hugh Dickins wrote:
>>> Is this the right moment for me to mention again that I'm not sure
>>> your reuse of existing pagetables was quite right anyway: NX being
>>> excluded from level2_ident_pgt, but wanted in the direct map?
>> We could add NX. What's the behaviour of setting NX in a
>> non-NX-supporting CPU? I don't think it would trigger a "reserved bit"
>> exception (the other high pte flags don't). Or failing that, we could
>> mask out NX once we've worked out the CPU doesn't support it (at the
>> same time it relocates the pagetables to the kernel's load-time address).
> I've no experience of what happens if NX is set to a non-NX-supporting
> CPU, so can't advise on that at all. But I think you're looking at
> it the wrong way round - or else I am.
> Here's the declaration and comment on level2_ident_pgt in head_64.S:
> NEXT_PAGE(level2_ident_pgt)
> /* Since I easily can, map the first 1G.
> * Don't set NX because code runs from these pages.
> */
> (The "Since I easily can" comment is there because it used to map
> only a subset needed for early kernel startup, not the whole 1G.)
> So it's very intentionally leaving NX out there. I believe the
> level2_ident_pgt page appears twice or more in the pagetable layout,
> used to map two or more areas of virtual address space - once to
> provide the direct map at ffff880000000000 and once to provide
> the kernel image virtual mapping at ffffffff80200000.

Hm, yes, I see what you mean.

> I think we don't want NX on Linux kernel text ;-?

Might be problematic.

> Before your 2.6.27-rc changes, init_memory_mapping subsequently
> replaced the direct map usage by a separately constructed pagetable,
> similar to it but with NX set throughout. After your 2.6.27-rc
> changes, level2_ident_pgt is found there already so left untouched -
> leaving NX out of that first 1G of the direct map forever (when CPA
> splits it up, the smaller pages inherit the lack of NX too).

Well, if we never want the direct map to be non-executable (which I
think would be OK, since all the code is either core kernel or modules
which are mapped elsewhere), then we can set NX on the level4 for the
linear mapping which will make everything below non-executable.

> I'm reluctant to delve in there again at present, and unsure
> what the right fix should be: perhaps the code which checks if
> an entry is already there, should check if it has the desired
> flags?

Well, the specific reason I made these changes was to make sure that
there was never more than one entry mapping any kernel page, so that you
can update the page permissions on a kernel page with just one update.
This is pretty much a requirement for Xen, and very convenient at other
times. Native will use either L2 or L3 mappings for the kernel and
linear space, and Xen uses L1 mappings, so if we break the aliasing at
the L2 level I can still keep the L1s aliased, but it seems simpler to
set NX on the linear mapping's L4.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at