Re: [PATCH] x86: use pgd accessors when cloning a pgd range.

From: Jeremy Fitzhardinge
Date: Wed Oct 27 2010 - 13:52:00 EST


On 10/27/2010 10:42 AM, H. Peter Anvin wrote:
> On 10/27/2010 10:31 AM, Jeremy Fitzhardinge wrote:
>> On 10/27/2010 10:18 AM, H. Peter Anvin wrote:
>>> On 10/27/2010 9:50 AM, Jeremy Fitzhardinge wrote:
>>>>
>>>> This never used to be a problem. Perhaps we can change how
>>>> clone_pgd_range is used at boot time to avoid it in the Xen case
>>>> (since we don't care about the secondary pagetable)?
>>>>
>>>
>>> Xen shouldn't have any users of this, since it's used for low-level
>>> operations like SMP bootstrap, suspend to RAM, reboot and low-level
>>> BIOS functionality.
>>>
>>
>> Right, but it is being called smack in the middle of setup_arch(). It
>> looks like they could be hidden away in
>> native_pagetable_setup_start/done though.
>>
>
> This is what makes me absolutely hate paravirt with a passion...
> "let's hide things away in <obscure place> and make it absolutely
> impossible to either follow the code flow or figure out what the
> intended semantics are supposed to be."

It's not really an obscure place; it's where x86-32 does the rest of its
boot-time pagetable adjustments (like cleaning out the low identity
maps, etc). Having those clone_pgd_range() calls floating around in
setup_arch() is out of place.

> (Let's not even get me started on how ill-defined the semantics of some
> of the paravirt operations are.) In this case, at the most you need a
> single flag of state... or you could even just ignore this low-level
> data structure that you will never use in the first place. Ian's
> message just mentioned "a failure" and never described in any way what
> kind of "failure" it was.

It would be a pagefault from Xen preventing a direct write to the pgd
level of an active pagetable. At the point in setup_arch() where it
does the first clone_pgd_range() we're already running on swapper_pg_dir
and the copy from initial_page_table is outright wrong.
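
To make the failure mode concrete, here is a rough userspace sketch of
the difference between copying pgd entries directly and copying them
through an accessor. It is not kernel code: mock_pgd_t, mock_set_pgd(),
hooked_writes and both clone_range_*() helpers are names made up for
illustration, and the "hook" just counts calls where a paravirt backend
would presumably validate the update or turn it into a hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's pgd_t; purely illustrative. */
typedef struct { unsigned long val; } mock_pgd_t;

/* Number of entry writes that went through the accessor, i.e. the
 * writes a paravirt backend would get a chance to intercept. */
static int hooked_writes;

/* Hypothetical accessor. On native hardware this would amount to a
 * plain store; under a hypervisor it is the interception point. */
static void mock_set_pgd(mock_pgd_t *dst, mock_pgd_t val)
{
    hooked_writes++;
    *dst = val;
}

/* Direct copy: writes the entries without going through the accessor.
 * On a pagetable the hypervisor has made read-only, stores like these
 * are the ones that would fault. */
static void clone_range_memcpy(mock_pgd_t *dst, const mock_pgd_t *src,
                               int count)
{
    assert(dst && src && count >= 0);
    memcpy(dst, src, (size_t)count * sizeof(mock_pgd_t));
}

/* Accessor-based copy: every entry write is visible to the hook. */
static void clone_range_accessor(mock_pgd_t *dst, const mock_pgd_t *src,
                                 int count)
{
    assert(dst && src && count >= 0);
    for (int i = 0; i < count; i++)
        mock_set_pgd(&dst[i], src[i]);
}
```

Both helpers leave dst with the same contents; the only difference is
whether the writes pass through the hook. Note that going through the
accessor only addresses the fault. It does nothing about the copy
itself being wrong at that point in setup_arch().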

As Ian suggests, we could switch Xen to use initial_page_table at boot
then move to swapper_pg_dir in the same way native does.

J