Re: [GIT PULL] Please pull NFS client bugfixes....

From: Trond Myklebust
Date: Thu Jan 07 2010 - 20:22:43 EST


On Thu, 2010-01-07 at 17:12 -0800, Linus Torvalds wrote:
>
> On Thu, 7 Jan 2010, Trond Myklebust wrote:
> > >
> > > Because it means that you can trivially take page faults before the thing
> > > is validated (think threads).
> >
> > Which would mean that another process/thread already has part of the
> > file mmapped on the same client. I'm not arguing that we have to revalidate
> > in _that_ case.
>
> No, I'm talking about the new mapping. Nothing else.
>
> If the mmap'ing thread releases mmap_sem, and then does the revalidate,
> then you can have
>
>     thread1                   thread2
>     -------                   -------
>
>     mmap
>       map it in
>       release mmap_sem
>                               page-fault the mapping before it got validated
>       ->post_mmap()
>         revalidate outside mmap_sem
>
> See? No "already part of the file mmapped" case at all. The exact mmap
> that you just set up - without the revalidation having happened.
>
> In fact, because of this kind of _fundamental_ race, I don't see why I
> would ever accept any patches that add multiple mmap() down-calls at
> different phases to the filesystem at the VFS layer.
>
> A filesystem that depends on the different phases would be a fundamentally
> buggy filesystem. Right now mmap is "atomic", and you can pre-populate (or
> pre-verify, like NFS does) the mapping in the _knowledge_ that there are
> no page faults that will populate it concurrently. Exactly because we hold
> the mmap_sem for writing.

I don't think anyone has been advocating doing the revalidation _after_
the call to mmap_region(). All I want is to be able to do it as part of
the mmap() syscall. It would be quite OK to add a ->pre_mmap() (which is
what I believe Peter's patches do).

All I want to ensure is that people who use non-posix-lock based
synchronisation can set the 'noac' flag, and be assured that if mmap()
is called _after_ they have grabbed their lock, then the page cache will
be duly revalidated (under the lock), and the fresh data will be made
available.
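
To make the usage pattern concrete, here is a minimal user-space sketch of
"grab the lock, then mmap()". It is only an illustration under stated
assumptions: the lock is a hypothetical mkdir()-based lock directory standing
in for whatever non-posix-lock scheme the application uses, the helper names
(app_lock, app_unlock, read_mapped_under_lock) are made up for this example,
and the file is assumed to live on a mount using 'noac'. Whether the new
mapping actually sees fresh data depends on the client revalidating as part
of the mmap() syscall, which is exactly the point under discussion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical application lock: a directory whose creation either succeeds
 * or fails with EEXIST. It stands in for any non-posix-lock scheme. */
static int app_lock(const char *lockdir)
{
    struct timespec pause = { 0, 1000000 }; /* 1 ms between attempts */

    while (mkdir(lockdir, 0700) != 0) {
        if (errno != EEXIST)
            return -1;
        nanosleep(&pause, NULL);
    }
    return 0;
}

static int app_unlock(const char *lockdir)
{
    return rmdir(lockdir);
}

/* Take the lock, create a fresh mapping of the file, copy at most len - 1
 * bytes into buf (NUL-terminated), then unmap and drop the lock.
 * Returns the number of bytes copied, or -1 on error or empty file. */
static ssize_t read_mapped_under_lock(const char *path, const char *lockdir,
                                      char *buf, size_t len)
{
    ssize_t ret = -1;
    int fd;

    assert(path != NULL && lockdir != NULL && buf != NULL);
    if (len == 0 || app_lock(lockdir) != 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        struct stat st;

        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            size_t maplen = (size_t)st.st_size;
            /* The mmap() call happens after the lock is held; this is the
             * point where the sketch assumes the client revalidates. */
            void *p = mmap(NULL, maplen, PROT_READ, MAP_SHARED, fd, 0);

            if (p != MAP_FAILED) {
                size_t n = maplen < len - 1 ? maplen : len - 1;

                memcpy(buf, p, n);
                buf[n] = '\0';
                ret = (ssize_t)n;
                munmap(p, maplen);
            }
        }
        close(fd);
    }
    app_unlock(lockdir);
    return ret;
}
```

A caller would do something like
read_mapped_under_lock("/mnt/nfs/data", "/mnt/nfs/data.lock", buf, sizeof(buf))
where both paths are placeholders. The mapping is created and torn down
entirely inside the locked region, so no mapping set up before the lock was
taken is reused.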

Trond