Re: [PATCH v2 00/17] x86/ldt: Use a VMA based read only mapping

From: Andy Lutomirski
Date: Thu Dec 14 2017 - 11:36:16 EST


On Thu, Dec 14, 2017 at 4:08 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, Dec 14, 2017 at 01:03:37PM +0100, Thomas Gleixner wrote:
>> On Thu, 14 Dec 2017, Peter Zijlstra wrote:
>> > So here's a second posting of the VMA based LDT implementation; now without
>> > most of the crazy.
>> >
>> > I took out the write fault handler and the magic LAR touching code.
>> >
>> > Additionally there are a bunch of patches that address generic vm issues.
>> >
>> > - gup() access control; specifically, I looked at accessing !_PAGE_USER pages
>> > because these patches rely on not being able to do that.
>> >
>> > - special mappings; A whole bunch of mmap ops don't make sense on special
>> > mappings so disallow them.
>> >
>> > Both things make sense independent of the rest of the series. Similarly, the
>> > patches that kill that ridiculous LDT inherit on exec() are also unquestionably
>> > good.
>> >
>> > So I think at least the first 6 patches are good, irrespective of the
>> > VMA approach.
>> >
>> > On the whole VMA approach, Andy I know you hate it with a passion, but I really
>> > rather like how it ties the LDT to the process that it belongs to and it
>> > reduces the amount of 'special' pages in the whole PTI mapping.
>> >
>> > I'm not the one going to make the decision on this, but I figured I'd at least
>> > post a version without the obvious crap parts of the last one.
>> >
>> > Note: if we were to also disallow munmap() for special mappings (which I
>> > suppose makes perfect sense) then we could further reduce the actual LDT
>> > code (we'd no longer need the sm::close callback and related things).
>>
>> That makes a lot of sense for the other special mapping users like VDSO and
>> kprobes.
>
> Right, and while looking at that I also figured it might make sense to
> unconditionally disallow splitting special mappings.
>
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2698,6 +2698,9 @@ int do_munmap(struct mm_struct *mm, unsi
>  	}
>  	vma = prev ? prev->vm_next : mm->mmap;
>
> +	if (vma_is_special_mapping(vma))
> +		return -EINVAL;
> +
>  	if (unlikely(uf)) {
>  		/*
>  		 * If userfaultfd_unmap_prep returns an error the vmas
> @@ -3223,10 +3226,11 @@ static int special_mapping_fault(struct
>   */
>  static void special_mapping_close(struct vm_area_struct *vma)
>  {
> -	struct vm_special_mapping *sm = vma->vm_private_data;
> +}
>
> -	if (sm->close)
> -		sm->close(sm, vma);
> +static int special_mapping_split(struct vm_area_struct *vma, unsigned long addr)
> +{
> +	return -EINVAL;
>  }
>
>  static const char *special_mapping_name(struct vm_area_struct *vma)
> @@ -3252,6 +3256,7 @@ static const struct vm_operations_struct
>  	.fault = special_mapping_fault,
>  	.mremap = special_mapping_mremap,
>  	.name = special_mapping_name,
> +	.split = special_mapping_split,
>  };
>
>  static const struct vm_operations_struct legacy_special_mapping_vmops = {

Disallowing splitting seems fine. Disallowing munmap might not be.
Certainly CRIU relies on being able to mremap() the VDSO.
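
To make the munmap/mremap connection concrete: when mremap() moves a
mapping, nothing is left at the old address, so the kernel has to tear
down the old range as part of the move. As far as I recall, move_vma()
does that through do_munmap(), which is where the check quoted above
sits. Below is a rough userspace sketch of that behaviour. It moves an
ordinary anonymous page rather than the VDSO, and the helper names
(page_is_mapped, move_page_demo) are made up for illustration; it is not
what CRIU actually does.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page at addr is mapped, 0 if not, -1 on other errors. */
static int page_is_mapped(void *addr, size_t page)
{
	unsigned char vec;

	/* mincore() fails with ENOMEM when the range contains unmapped pages. */
	if (mincore(addr, page, &vec) == 0)
		return 1;
	return errno == ENOMEM ? 0 : -1;
}

/*
 * Move one anonymous page to a fixed address with mremap() and check
 * that the old address is no longer mapped afterwards.
 * Returns 0 if the data moved and the old range is gone, -1 otherwise.
 */
static int move_page_demo(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *old, *target, *moved;

	old = mmap(NULL, page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (old == MAP_FAILED)
		return -1;
	strcpy(old, "payload");

	/* Reserve a destination; MREMAP_FIXED replaces whatever is there. */
	target = mmap(NULL, page, PROT_NONE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (target == MAP_FAILED)
		return -1;

	moved = mremap(old, page, page, MREMAP_MAYMOVE | MREMAP_FIXED, target);
	if (moved == MAP_FAILED)
		return -1;

	/* The contents followed the mapping to the new address ... */
	if (moved != target || strcmp(moved, "payload") != 0)
		return -1;
	/* ... and the old address was unmapped as part of the move. */
	if (page_is_mapped(old, page) != 0)
		return -1;

	munmap(moved, page);
	return 0;
}
```

Calling move_page_demo() from a main() should return 0 on Linux. The
point is only that a move implies an unmap of the source range, so an
unconditional -EINVAL in do_munmap() for special mappings would
presumably also break moving them.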