Re: [RFC PATCH] cifs: Fix possible deadlock with cifs and work queues

From: Jeff Layton
Date: Fri Mar 21 2014 - 08:59:45 EST


On Fri, 21 Mar 2014 12:32:12 +0400
Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:

> 2014-03-21 6:23 GMT+04:00 Steven Rostedt <rostedt@xxxxxxxxxxx>:
> > On Thu, 20 Mar 2014 17:02:39 -0400
> > Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> >> Eventually the server should just allow the read to complete even if
> >> the client doesn't respond to the oplock break. It has to since clients
> >> can suddenly drop off the net while holding an oplock. That should
> >> allow everything to unwedge eventually (though it may take a while).
> >>
> >> If that's not happening then I'd be curious as to why...
> >
> > The problem is that the data is being filled in the page and the reader
> > is waiting for the page lock to be released. The kworker for the reader
> > will issue the complete() and unlock the page to wake up the reader.
> >
> > But because the other workqueue callback calls down_read(), and there
> > can be a down_write() waiting for the reader to finish, this
> > down_read() will block on the lock as well (rwsems are fair locks).
> > This blocks the other workqueue callback from issuing the complete and
> > page_unlock() that will wake up the reader that is holding the rwsem
> > with down_read().
> >
> > DEADLOCK.
>
> Thank you for reporting and clarifying the issue!
>
> Read and write codepaths both obtain lock_sem for read and then wait
> for cifsiod_wq to complete and release lock_sem. They don't do any
> lock_sem operations inside their work task queued to cifsiod_wq. But
> oplock code can obtain/release lock_sem in its work task. So, that's
> why I agree with Jeff and suggest to move the oplock code to a
> different work queue (cifsioopd_wq?) but leave read and write
> codepaths use cifsiod_wq.
>

Yes, thanks for the clarification. I missed the fact that a 3rd task
was blocked on a down_write. I think that fix will help prevent the
full-blown deadlock, but it'll still be susceptible to long stalls in
some situations.

In Steven's example the work on CPU0 is blocked on a SMB_READ. That
read may not be completing because the server has issued an oplock
break to the client and is waiting on the response. That oplock break
response is blocked because it's blocked on the down_write.

In that situation, what breaks the deadlock is that the server
eventually gives up waiting on the oplock break response, but that can
take on the order of a minute.

So yeah, I think we should add a dedicated workqueue for oplock breaks
as an interim fix, but I also think we need to consider revamping the
lock pushing code such that oplock breaks are not subject to being
blocked like that.

--
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/