Re: [PATCH v2 10/10] kernel: might_fault does not imply might_sleep

From: Peter Zijlstra
Date: Tue May 21 2013 - 08:39:19 EST


On Sun, May 19, 2013 at 12:35:26PM +0300, Michael S. Tsirkin wrote:
> On Thu, May 16, 2013 at 08:40:41PM +0200, Peter Zijlstra wrote:
> > On Thu, May 16, 2013 at 02:16:10PM +0300, Michael S. Tsirkin wrote:
> > > There are several ways to make sure might_fault
> > > calling function does not sleep.
> > > One is to use it on kernel or otherwise locked memory - apparently
> > > nfs/sunrpc does this. As noted by Ingo, this is handled by the
> > > migh_fault() implementation in mm/memory.c but not the one in
> > > linux/kernel.h so in the current code might_fault() schedules
> > > differently depending on CONFIG_PROVE_LOCKING, which is an undesired
> > > semantical side effect.
> > >
> > > Another is to call pagefault_disable: in this case the page fault
> > > handler will go to fixups processing and we get an error instead of
> > > sleeping, so the might_sleep annotation is a false positive.
> > > vhost driver wants to do this now in order to reuse socket ops
> > > under a spinlock (and fall back on slower thread handler
> > > on error).
> >
> > Are you using the assumption that spin_lock() implies preempt_disable() implies
> > pagefault_disable()? Note that this assumption isn't valid for -rt where the
> > spinlock becomes preemptible but we'll not disable pagefaults.
>
> No, I was not assuming that. What I'm trying to say is that a caller
> that does something like this under a spinlock:
> preempt_disable
> pagefault_disable
> error = copy_to_user
> pagefault_enable
> preempt_enable_no_resched
>
> is not doing anything wrong and should not get a warning,
> as long as error is handled correctly later.
> Right?

Aside from the no_resched() thing which Steven already explained and my
previous email asking why you need the preempt_disable() at all, that
should indeed work.

The reason I was asking was that I wasn't sure you weren't doing:

spin_lock(&my_lock);
error = copy_to_user();
spin_unlock(&my_lock);

and expecting the copy_to_user() to always take the exception table
route. This works on mainline (since spin_lock implies a preempt disable
and preempt_disable is the same as pagefault_disable). However as should
be clear by now, it doesn't quite work that way for -rt.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/