Re: [PATCH resend RFC 0/9] s390: fixes, cleanups and optimizations for page table walkers

From: Claudio Imbrenda
Date: Tue Sep 14 2021 - 12:50:53 EST


On Thu, 9 Sep 2021 18:22:39 +0200
David Hildenbrand <david@xxxxxxxxxx> wrote:

> Resend because I missed ccing people on the actual patches ...
>
> RFC because the patches are essentially untested and I did not actually
> try to trigger any of the things these patches are supposed to fix. It

this is an interesting series, and the code makes sense, but I would
really like to see some regression tests, and maybe even some
selftests to trigger (at least some of) the issues.

the follow-up question is: how did we manage to go on so long without
noticing these issues? :D

> merely matches my current understanding (and what other code does :) ). I
> did compile-test as far as possible.
>
> After learning more about the wonderful world of page tables and their
> interaction with the mmap_sem and VMAs, I spotted some issues in our
> page table walkers that allow user space to trigger nasty behavior when
> playing dirty tricks with munmap() or mmap() of hugetlb. While some issues
> should be hard to trigger, others are fairly easy because we provide
> conventient interfaces (e.g., KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS).
>
> Future work:
> - Don't use get_locked_pte() when it's not required to actually allocate
> page tables -- similar to how storage keys are now handled. Examples are
> get_pgste() and __gmap_zap.
> - Don't use get_locked_pte() and instead let page fault logic allocate page
> tables when we actually do need page tables -- also, similar to how
> storage keys are now handled. Examples are set_pgste_bits() and
> pgste_perform_essa().
> - Maybe switch to mm/pagewalk.c to avoid custom page table walkers. For
> __gmap_zap() that's very easy.
>
> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
> Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
> Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
> Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> Cc: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
> Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
> Cc: Ulrich Weigand <Ulrich.Weigand@xxxxxxxxxx>
>
> David Hildenbrand (9):
> s390/gmap: validate VMA in __gmap_zap()
> s390/gmap: don't unconditionally call pte_unmap_unlock() in
> __gmap_zap()
> s390/mm: validate VMA in PGSTE manipulation functions
> s390/mm: fix VMA and page table handling code in storage key handling
> functions
> s390/uv: fully validate the VMA before calling follow_page()
> s390/pci_mmio: fully validate the VMA before calling follow_pte()
> s390/mm: no need for pte_alloc_map_lock() if we know the pmd is
> present
> s390/mm: optimize set_guest_storage_key()
> s390/mm: optimize reset_guest_reference_bit()
>
> arch/s390/kernel/uv.c | 2 +-
> arch/s390/mm/gmap.c | 11 +++-
> arch/s390/mm/pgtable.c | 109 +++++++++++++++++++++++++++------------
> arch/s390/pci/pci_mmio.c | 4 +-
> 4 files changed, 89 insertions(+), 37 deletions(-)
>
>
> base-commit: 7d2a07b769330c34b4deabeed939325c77a7ec2f