Re: [RFC][PATCH 0/6] IO pinning(get_user_pages()) vs fork race fix

From: Nick Piggin
Date: Tue Apr 14 2009 - 05:37:44 EST


On Tuesday 14 April 2009 19:19:10 KOSAKI Motohiro wrote:

> > I don't see how it fixes vmsplice? vmsplice can get_user_pages pages from one
> > process's address space and put them into a pipe, and they are released by
> > another process after consuming the pages I think. So it's fairly hard to hold
> > a lock over this.
>
> I realize my explanation was poor.
>
> First, pipe_to_user(), called via vmsplice_to_user(), uses copy_to_user(), so we
> don't need to care about the receive side.
> Second, get_iovec_page_array(), called via vmsplice_to_pipe(), uses gup(read),
> so we only need to prevent the page from being changed.
>
> I changed reuse_swap_page() in [1/6]. If any process touches the page before
> the receiver has consumed it, that breaks COW and the toucher gets a copied
> page. So nobody can change the original page.
>
> Thus, I think this patch series also fixes the vmsplice issue.
> Am I missing anything?

Ah thanks, I see now. No I don't think you're missing anything.
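
To make the argument above concrete, here is a small userspace toy model of
the rule being described: a write to a page that still has an outstanding
gup-style reference takes a private copy, so the original contents stay
intact for whoever holds the reference. This is only a sketch under stated
assumptions. The names toy_page, toy_mapping, pins and toy_write are all
hypothetical, and it does not reproduce the actual reuse_swap_page() change
in [1/6], which is not shown in this mail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for a page; not a kernel structure. */
struct toy_page {
	int pins;			/* outstanding gup-style references (assumed counter) */
	char data[TOY_PAGE_SIZE];
};

/* Hypothetical stand-in for one process's mapping of that page. */
struct toy_mapping {
	struct toy_page *page;
};

/*
 * Write one byte through the mapping.
 * Returns 1 if the page was reused in place, 0 if COW was broken and the
 * writer now has a private copy, -1 on error.
 */
static int toy_write(struct toy_mapping *m, size_t off, char value)
{
	assert(m && m->page);
	if (off >= TOY_PAGE_SIZE)
		return -1;

	if (m->page->pins > 0) {
		/* Page is pinned: do not reuse it, give the writer a copy. */
		struct toy_page *copy = malloc(sizeof(*copy));

		if (!copy)
			return -1;
		memcpy(copy->data, m->page->data, TOY_PAGE_SIZE);
		copy->pins = 0;
		copy->data[off] = value;
		m->page = copy;		/* original page is left untouched */
		return 0;
	}

	/* No outstanding reference: safe to modify in place. */
	m->page->data[off] = value;
	return 1;
}
```

With pins > 0 the writer's mapping ends up pointing at a new page and the
pinned page keeps its old bytes; with pins == 0 the write lands in place.
The caller owns (and should free) any copy that toy_write() allocates.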


> > Apart from the vmsplice issue (unless I missed a clever fix), I guess
> > this *does* work. I can't see any races... I'd really still like to hear a good
> > reason why my proposed patch is so obviously crap.
> >
> > Reasons proposed so far:
> > "No locking" (I think this is a good thing; no *bugs* have been pointed out)
> > "Too many page flags" (but it only uses 1 anon page flag, only fs pagecache
> > has a flags shortage so we can easily overload a pagecache flag)
> > "Diffstat too large" (seems comparable when you factor in the fixes to callers,
> > but has the advantage of being contained within VM subsystem)
> > "Horrible code" (I still don't see it. Of course the code will be nicer if we
> > don't fix the issue _at all_, but I don't see this is so much worse than having
> > to fix callers.)
>
> Honestly, I don't dislike your patch.
> But I really want this bug fixed. If someone NAKs your patch, I'll look for
> another way.

Yes, I appreciate you looking at alternatives, and you haven't been strongly
arguing against my patch. So this comment was not aimed at you :)


> > FWIW, I have attached my patch again (with simple function-movement hunks
> > moved into another patch so it is easier to see real impact of this patch).
>
> OK. I'll try to test your patch too.

Well I split it out and it requires another patch to move functions around
(eg. zap_pte from fremap.c into memory.c). I just attached it here to
illustrate the core of my fix. If you would like to run any real tests, let
me know and I could send a proper rollup.