Re: [PATCH] vfio: remove useless judgement

From: Jason Gunthorpe
Date: Tue Jun 28 2022 - 10:05:06 EST


On Tue, Jun 28, 2022 at 09:54:19AM -0400, Steven Sistare wrote:

> >> As you and I have discussed, the count is also wrong in the direct
> >> exec model, because exec clears mm->locked_vm.
> >
> > Really? Yikes, I thought exec would generate a new mm?
>
> Yes, exec creates a new mm with locked_vm = 0. The old locked_vm count is dropped
> on the floor. The existing dma points to the same task, but task->mm has changed,
> and dma->task->mm->locked_vm is 0. An unmap ioctl drives it negative.

Oh. This is probably a bug: vfio should never use task->mm. The mm
itself should be held using mmgrab() instead.

Otherwise the exec case is broken, as you describe.
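
To make the failure concrete, here is a toy userspace model of the
accounting. It is only a sketch, not kernel code: the toy_* structs and
helpers are hypothetical names invented for illustration, and the plain
"refs" counter only loosely stands in for the reference that mmgrab()
takes on the mm. It assumes the pin is charged to the mm at map time and
compares uncharging through task->mm with uncharging through the mm
pointer saved when the pages were charged.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel objects; all names are hypothetical. */
struct toy_mm   { long locked_vm; int refs; };
struct toy_task { struct toy_mm *mm; };
struct toy_dma  { struct toy_task *task; struct toy_mm *mm; long npages; };

static struct toy_mm *toy_mm_new(void)
{
    struct toy_mm *mm = calloc(1, sizeof(*mm)); /* locked_vm starts at 0 */
    assert(mm != NULL);
    mm->refs = 1;
    return mm;
}

static void toy_mm_put(struct toy_mm *mm)
{
    if (--mm->refs == 0)
        free(mm);
}

/* exec: the task gets a fresh mm and drops its reference on the old one. */
static void toy_exec(struct toy_task *task)
{
    struct toy_mm *old = task->mm;

    task->mm = toy_mm_new();
    toy_mm_put(old);
}

/* map: charge the pages and keep a reference on the mm that was charged. */
static void toy_map(struct toy_dma *dma, struct toy_task *task, long npages)
{
    dma->task = task;
    dma->mm = task->mm;
    dma->mm->refs++;            /* loosely models mmgrab() */
    dma->npages = npages;
    dma->mm->locked_vm += npages;
}

/* Broken unmap: uncharges whatever mm the task points at right now. */
static long toy_unmap_via_task(struct toy_dma *dma)
{
    long after;

    dma->task->mm->locked_vm -= dma->npages;
    after = dma->task->mm->locked_vm;
    toy_mm_put(dma->mm);
    return after;
}

/* Unmap through the held mm: uncharges the mm that was charged at map. */
static long toy_unmap_via_mm(struct toy_dma *dma)
{
    long after;

    dma->mm->locked_vm -= dma->npages;
    after = dma->mm->locked_vm;  /* read before dropping our reference */
    toy_mm_put(dma->mm);
    return after;
}

/* map 16 pages, exec, unmap; returns locked_vm of the mm that was uncharged */
static long toy_scenario(int use_held_mm)
{
    struct toy_task task = { .mm = toy_mm_new() };
    struct toy_dma dma;
    long after;

    toy_map(&dma, &task, 16);
    toy_exec(&task);
    after = use_held_mm ? toy_unmap_via_mm(&dma) : toy_unmap_via_task(&dma);
    toy_mm_put(task.mm);
    return after;
}
```

In this model toy_scenario(0) returns -16, because the post-exec mm
never saw the charge, while toy_scenario(1) returns 0, because the
charge and uncharge hit the same mm.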

> I have prototyped a few possible fixes. One changes vfio to use user->locked_vm.
> Another changes to mm->pinned_vm and preserves it during exec. A third preserves
> mm->locked_vm across exec, but that is not practical, because mm->locked_vm mixes
> vfio pins and mlocks. The mlock component must be cleared during exec, and we don't
> have a separate count for it.

Losing locked_vm on exec/fork is the correct and expected behavior
for the core kernel code. The bug is that vfio drives it negative.

Jason