RE: [PATCH 1/1] iommu/sva: Invalidate KVA range on kernel TLB flush

From: Tian, Kevin
Date: Mon Jul 14 2025 - 20:05:55 EST


> From: Mike Rapoport <rppt@xxxxxxxxxx>
> Sent: Monday, July 14, 2025 10:50 PM
>
> On Mon, Jul 14, 2025 at 03:19:17PM +0200, Uladzislau Rezki wrote:
> > On Mon, Jul 14, 2025 at 01:39:20PM +0100, David Laight wrote:
> > > On Wed, 9 Jul 2025 11:22:34 -0700
> > > Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> > >
> > > > On 7/9/25 11:15, Jacob Pan wrote:
> > > > > >>> Is there a use case where a SVA user can access kernel memory in the
> > > > > >>> first place?
> > > > >> No. It should be fully blocked.
> > > > >>
> > > > > Then I don't understand what is the "vulnerability condition" being
> > > > > addressed here. We are talking about KVA range here.
> > > >
> > > > SVA users can't access kernel memory, but they can compel walks of
> > > > kernel page tables, which the IOMMU caches. The trouble starts if the
> > > > kernel happens to free that page table page and the IOMMU is using the
> > > > cache after the page is freed.
> > > >
> > > > That was covered in the changelog, but I guess it could be made a bit
> > > > more succinct.
>
> But does this really mean that every flush_tlb_kernel_range() should flush
> the IOMMU page tables as well? AFAIU, set_memory flushes TLB even when bits
> in pte change and it seems like an overkill...
>
> > > Is it worth just never freeing the page tables used for vmalloc() memory?
> > > After all they are likely to be reallocated again.
> > >
> > >
> > Do we free? Maybe on some arches? According to my tests(AMD x86-64) i did
> > once upon a time, the PTE entries were not freed after vfree(). It could be
> > expensive if we did it, due to a global "page_table_lock" lock.
> >
> > I see one place though, it is in the vmap_try_huge_pud()
> >
> > 	if (pud_present(*pud) && !pud_free_pmd_page(pud, addr))
> > 		return 0;
> >
> > it is when replace a pud by a huge-page.
>
> There's also a place that replaces a pmd by a smaller huge page, but other
> than that vmalloc does not free page tables.
>

Dave spotted two other places where page tables might be freed:

https://lore.kernel.org/all/62580eab-3e68-4132-981a-84167d130d9f@xxxxxxxxx/