Re: [PATCH] vfio/pci: Handle concurrent vma faults

From: Jason Gunthorpe
Date: Fri Mar 12 2021 - 14:42:49 EST


On Fri, Mar 12, 2021 at 12:16:11PM -0700, Alex Williamson wrote:
> On Wed, 10 Mar 2021 14:40:11 -0400
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Wed, Mar 10, 2021 at 11:34:06AM -0700, Alex Williamson wrote:
> >
> > > > I think after the address_space changes this should try to stick with
> > > > a normal io_remap_pfn_range() done outside the fault handler.
> > >
> > > I assume you're suggesting calling io_remap_pfn_range() when device
> > > memory is enabled,
> >
> > Yes, I think I saw Peter thinking along these lines too
> >
> > Then fault just always causes SIGBUS if it gets called
>
> Trying to use the address_space approach because otherwise we'd just be
> adding back vma list tracking, it looks like we can't call
> io_remap_pfn_range() while holding the address_space i_mmap_rwsem via
> i_mmap_lock_write(), like done in unmap_mapping_range(). lockdep
> identifies a circular lock order issue against fs_reclaim. Minimally we
> also need vma_interval_tree_iter_{first,next} exported in order to use
> vma_interval_tree_foreach(). Suggestions? Thanks,

You are asking how to put the BAR back into every VMA when it is
enabled again after it has been zap'd?

What did the lockdep splat look like? Is it a memory allocation?

Does current_gfp_context()/memalloc_nofs_save()/etc solve it?
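
For reference, the scope API works by setting a per-task flag that
current_gfp_context() later uses to strip __GFP_FS from allocation
masks, so nested allocations cannot recurse into fs reclaim. Below is a
userspace model of that behaviour, not kernel code: every model_* /
MODEL_* name and all bit values are made up for illustration, and
model_task_flags stands in for current->flags. Whether this would
actually silence the splat depends on which allocation lockdep flagged.

```c
/* Userspace model only; bit values are NOT the kernel's. */
#define MODEL_GFP_IO            0x1u
#define MODEL_GFP_FS            0x2u
#define MODEL_GFP_KERNEL        (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_MEMALLOC_NOFS  0x100u

/* Hypothetical stand-in for current->flags. */
static unsigned int model_task_flags;

/* Enter a NOFS scope; return the previous state of the NOFS bit. */
static unsigned int model_memalloc_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave the scope by putting the NOFS bit back to its saved state. */
static void model_memalloc_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Drop the FS bit from a gfp mask while inside a NOFS scope. */
static unsigned int model_current_gfp_context(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

In the real code the save/restore pair would bracket the
io_remap_pfn_range() call made under i_mmap_rwsem, and scopes nest
because restore only puts back what save returned.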

The easiest answer is to continue to use the fault handler and
vmf_insert_page().

But it feels like it would be OK to export enough i_mmap machinery to
enable this. Cleaner than building your own tracking, which would
still have the same ugly mmap_sem inversion issue which was preventing
this last time.

Jason