Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

From: Jason Gunthorpe
Date: Wed Dec 19 2018 - 11:56:34 EST


On Wed, Dec 19, 2018 at 12:35:40PM +0100, Jan Kara wrote:
> On Wed 19-12-18 21:28:25, Dave Chinner wrote:
> > On Tue, Dec 18, 2018 at 08:03:29PM -0700, Jason Gunthorpe wrote:
> > > On Wed, Dec 19, 2018 at 10:42:54AM +1100, Dave Chinner wrote:
> > >
> > > > Essentially, what we are talking about is how to handle broken
> > > > hardware. I say we should just brun it with napalm and thermite
> > > > (i.e. taint the kernel with "unsupportable hardware") and force
> > > > wait_for_stable_page() to trigger when there are GUP mappings if
> > > > the underlying storage doesn't already require it.
> > >
> > > If you want to ban O_DIRECT/etc from writing to file backed pages,
> > > then just do it.
> >
> > O_DIRECT IO *isn't the problem*.
>
> That is not true. O_DIRECT IO is a problem. In some aspects it is
> easier than the problem with RDMA but currently O_DIRECT IO can
> crash your machine or corrupt data the same way RDMA can. Just the
> race window is much smaller. So we have to fix the generic GUP
> infrastructure to make O_DIRECT IO work. I agree that fixing RDMA
> will likely require even more work like revokable leases or what
> not.

This is what I've understood, talking to all the experts. Dave? Why do
you think O_DIRECT is actually OK?

I agree the duration issue with RDMA is different, but don't forget,
O_DIRECT goes out to the network too and has potentially very long
timeouts as well.

If O_DIRECT works fine then lets use the same approach in RDMA??

Jason