Re: [PATCH 1/2] mm: allow for an alternate set of pages for userspace mappings

From: David Vrabel
Date: Thu Jan 08 2015 - 12:50:34 EST


On 08/01/15 17:20, Johannes Weiner wrote:
> On Thu, Jan 08, 2015 at 03:28:43PM +0000, David Vrabel wrote:
>> Add an optional array of pages to struct vm_area_struct that can be
>> used to find the page backing a VMA. This is useful in cases where the
>> normal mechanisms for finding the page don't work. This array is only
>> inspected if the PTE is special.
>>
>> Splitting a VMA with such an array of pages is trivially done by
>> adjusting vma->pages. The original creator of the VMA must only free
>> the page array once all sub-VMAs are closed (e.g., by ref-counting in
>> vm_ops->open and vm_ops->close).
>>
>> One use case is a Xen PV guest mapping foreign pages into userspace.
>>
>> In a Xen PV guest, the PTEs contain MFNs so get_user_pages() (for
>> example) must do an MFN to PFN (M2P) lookup before it can get the
>> page. For foreign pages (those owned by another guest) the M2P lookup
>> returns the PFN as seen by the foreign guest (which would be
>> completely the wrong page for the local guest).
>>
>> This cannot be fixed up by improving the M2P lookup since one MFN may be
>> mapped onto two or more pages so getting the right page is impossible
>> given just the MFN.
[...]
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -309,6 +309,14 @@ struct vm_area_struct {
>>  #ifdef CONFIG_NUMA
>>  	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
>>  #endif
>> +	/*
>> +	 * Array of pages to override the default vm_normal_page()
>> +	 * result iff the PTE is special.
>> +	 *
>> +	 * The memory for this should be refcounted in vm_ops->open
>> +	 * and vm_ops->close.
>> +	 */
>> +	struct page **pages;
>
> Please make this configuration-dependent; not every Linux user should
> have to pay for a Xen optimization.

If the additional field in struct vm_area_struct is a concern, I would
prefer to use a vm_flag bit and union pages with an existing field.

Perhaps using VM_PFNMAP and reusing vm_file?
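
A rough userspace sketch of the flag-plus-union idea, only to make the layout
concrete. Everything here is a stand-in: struct vma_sketch, VM_PAGES_ARRAY,
SKETCH_PAGE_SHIFT and vma_special_page() are hypothetical names, not existing
kernel definitions, and whether vm_file is really free to be reused for these
mappings is an assumption that would still need checking.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct page { int id; };
struct file { int fd; };

/* Hypothetical flag bit saying "the union below holds a page array". */
#define VM_PAGES_ARRAY    0x1UL
#define SKETCH_PAGE_SHIFT 12

struct vma_sketch {
	unsigned long vm_start;
	unsigned long vm_flags;
	union {
		struct file *vm_file;	/* valid when VM_PAGES_ARRAY is clear */
		struct page **pages;	/* valid when VM_PAGES_ARRAY is set */
	};
};

/*
 * Return the page backing addr from the per-VMA array, or NULL if this
 * VMA does not carry one. The caller is assumed to have already checked
 * that the PTE is special and that addr lies within the VMA.
 */
static struct page *vma_special_page(const struct vma_sketch *vma,
				     unsigned long addr)
{
	if (!(vma->vm_flags & VM_PAGES_ARRAY))
		return NULL;
	return vma->pages[(addr - vma->vm_start) >> SKETCH_PAGE_SHIFT];
}
```

The union keeps the structure the same size as before, at the cost of every
vm_file user having to test the flag first.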

David