Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

From: John Hubbard
Date: Wed Aug 21 2019 - 14:24:17 EST


On 8/21/19 11:13 AM, Jason Gunthorpe wrote:
On Wed, Aug 21, 2019 at 11:02:00AM -0700, Ira Weiny wrote:
On Tue, Aug 20, 2019 at 08:55:15AM -0300, Jason Gunthorpe wrote:
On Tue, Aug 20, 2019 at 11:12:10AM +1000, Dave Chinner wrote:
On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote:
On Mon, Aug 19, 2019 at 07:24:09PM +1000, Dave Chinner wrote:

So that leaves just the normal close() syscall exit case, where the
application has full control of the order in which resources are
released. We've already established that we can block in this
context. Blocking in an interruptible state will allow fatal signal
delivery to wake us, and then we fall into the
fatal_signal_pending() case if we get a SIGKILL while blocking.

The major problem with RDMA is that it doesn't always wait on close() for the
MR holding the page pins to be destroyed. This is done to avoid a
deadlock of the form:

   uverbs_destroy_ufile_hw()
      mutex_lock()
       [..]
        mmput()
         exit_mmap()
          remove_vma()
           fput();
            file_operations->release()

I think this is wrong, and I'm pretty sure it's an example of why
the final __fput() call is moved out of line.

Yes, I think so too, all I can say is this *used* to happen, as we
have special code avoiding it, which is the code that is messing up
Ira's lifetime model.

Ira, could you try unraveling the special locking? Maybe that solves
your lifetime issues?

Yes I will try to prove this out... But I'm still not sure this fully solves
the problem.

This only ensures that the process which has the RDMA context (RDMA FD) is safe
with regard to hanging the close for the "data file FD" (the file which has
pinned pages) in that _same_ process. But what about the scenario.

Oh, I didn't think we were talking about that. Making the close of
the datafile fd hang contingent on some other FD's closure is a recipe
for deadlock.

IMHO the pin refcnt is held by the driver char dev FD, that is the
object you need to make it visible against.


If you do that, it might make it a lot simpler to add lease support
to drivers like XDP, which is otherwise looking pretty difficult to
set up with an fd. (It's socket-based, and not immediately clear where
to connect up the fd.)


thanks,
--
John Hubbard
NVIDIA


Why not just have a single table someplace of all the layout leases
with the file they are held on and the FD/socket/etc that is holding
the pin? Make it independent of processes and FDs?

Jason
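
As a rough illustration of the single-table idea above, here is a minimal
userspace C sketch. It is not kernel code and not something proposed in this
thread: the names lease_table, lease_entry, pin_holder_kind, and the integer
IDs standing in for files and holders are all hypothetical, and the fixed-size
array is just the simplest container that shows the lookup in both directions
(by file and by holder) without reference to any process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what kind of object is holding the page pin. */
enum pin_holder_kind { HOLDER_FD, HOLDER_SOCKET };

/* One row: a layout lease on a file plus the object holding the pin. */
struct lease_entry {
    unsigned long file_id;      /* stand-in for the file the lease is on */
    enum pin_holder_kind kind;  /* FD, socket, ... */
    unsigned long holder_id;    /* stand-in for the holder's identity */
    int in_use;                 /* 0 = free slot */
};

#define LEASE_TABLE_SIZE 16

/* The single table; note there is no per-process state in it. */
struct lease_table {
    struct lease_entry slots[LEASE_TABLE_SIZE];
};

/* Record that a holder pins pages of a file. Returns 0, or -1 if full. */
int lease_table_add(struct lease_table *t, unsigned long file_id,
                    enum pin_holder_kind kind, unsigned long holder_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < LEASE_TABLE_SIZE; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i] = (struct lease_entry){
                .file_id = file_id, .kind = kind,
                .holder_id = holder_id, .in_use = 1,
            };
            return 0;
        }
    }
    return -1;
}

/* Drop every row owned by a holder (e.g. when it closes). Returns the count. */
int lease_table_release_holder(struct lease_table *t,
                               enum pin_holder_kind kind,
                               unsigned long holder_id)
{
    int removed = 0;

    assert(t != NULL);
    for (size_t i = 0; i < LEASE_TABLE_SIZE; i++) {
        struct lease_entry *e = &t->slots[i];
        if (e->in_use && e->kind == kind && e->holder_id == holder_id) {
            e->in_use = 0;
            removed++;
        }
    }
    return removed;
}

/* Count the holders currently pinning pages of a given file. */
int lease_table_pins_on_file(const struct lease_table *t,
                             unsigned long file_id)
{
    int count = 0;

    assert(t != NULL);
    for (size_t i = 0; i < LEASE_TABLE_SIZE; i++) {
        if (t->slots[i].in_use && t->slots[i].file_id == file_id)
            count++;
    }
    return count;
}
```

In this sketch a truncate-style path would call lease_table_pins_on_file() to
see who still holds pins, while closing an FD or socket would call
lease_table_release_holder(). A real implementation would need locking and a
scalable container, neither of which is shown here.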