Re: [PATCH v5 13/14] locks: skip deadlock detection on FL_FILE_PVT locks

From: J. Bruce Fields
Date: Tue Jan 14 2014 - 16:27:05 EST


On Tue, Jan 14, 2014 at 01:24:23PM -0800, Andy Lutomirski wrote:
> On Tue, Jan 14, 2014 at 1:19 PM, Frank Filz <ffilzlnx@xxxxxxxxxxxxxx> wrote:
> >> On Tue, Jan 14, 2014 at 12:29:17PM -0800, Andy Lutomirski wrote:
> >> > [cc: drh, who I suspect is responsible for the most widespread
> >> > userspace software that uses this stuff]
> >> >
> >> > On Tue, Jan 14, 2014 at 11:27 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >> > > On Thu, Jan 09, 2014 at 04:58:59PM -0800, Andy Lutomirski wrote:
> >> > >> On Thu, Jan 9, 2014 at 4:49 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >> > >> > On Thu, 09 Jan 2014 12:25:25 -0800 Andy Lutomirski
> >> > >> > <luto@xxxxxxxxxxxxxx> wrote:
> >> > >> >> When I think of deadlocks caused by r/w locks (which these are),
> >> > >> >> I think of two kinds. First is what the current code tries to
> >> > >> >> detect: two processes that are each waiting for each other. I
> >> > >> >> don't know whether POSIX enshrines the idea of detecting that,
> >> > >> >> but I wouldn't be surprised, considering how awful the old POSIX
> >> > >> >> locks are.
> >> > > ...
> >> > >> >> The sensible kind of detectable deadlock involves just one lock,
> >> > >> >> and it happens when two processes both hold read locks and try
> >> > >> >> to upgrade to write locks. This should be efficiently
> >> > >> >> detectable and makes upgrading locks safe(r).
> >> > >
> >> > > This also involves two processes waiting on each other, and the
> >> > > current code should detect either case equally well.
> >> > >
> >> > > ...
> >> > >> For this kind of deadlock detection, nothing global is needed --
> >> > >> I'm only talking about detecting deadlocks due to two tasks
> >> > >> upgrading locks on the same file (with overlapping ranges) at the
> >> > >> same time.
> >> > >>
> >> > >> This is actually useful for SQL-like things. Imagine this scenario:
> >> > >>
> >> > >> Program 1:
> >> > >>
> >> > >> Open a file
> >> > >> BEGIN;
> >> > >> SELECT whatever; -- acquires a read lock
> >> > >>
> >> > >> Program 2:
> >> > >>
> >> > >> Open the same file
> >> > >> BEGIN;
> >> > >> SELECT whatever; -- acquires a read lock
> >> > >>
> >> > >> Program 1:
> >> > >> UPDATE something; -- upgrades to write
> >> > >>
> >> > >> Now program 1 is waiting for program 2 to release its lock. But if
> >> > >> program 2 tries to UPDATE, then it deadlocks. A friendly MySQL
> >> > >> implementation (which, sadly, does not include sqlite) will abort
> >> > >> the transaction instead.
> >> > >
> >> > > And then I suppose you'd need to get an exclusive lock when you
> >> > > retry, to guarantee forward progress in the face of multiple
> >> > > processes retrying at once.
> >> >
> >> > I don't think so -- as long as deadlock detection is 100% reliable and
> >> > if you have writer priority,
> >>
> >> We don't have writer priority. Depending on how it worked I'm not
> >> convinced it would help. E.g. consider the above but with 3 processes:
> >>
> >> processes 1, 2, and 3 each get a whole-file read lock.
> >>
> >> process 1 requests a write lock, blocks because it conflicts
> >> with read locks held by 2 and 3.
> >>
> >> process 2 requests a write lock, gets -EDEADLK, unlocks and
> >> requests a new read lock. That request succeeds because there
> >> is no conflicting lock. (Note the lock manager had no
> >> opportunity to upgrade 1's lock here thanks to the conflict with
> >> 3's lock.)
> >
> > As I understand write lock priority, process 2 requesting a new read lock
> > would block: once there is a write lock waiter, no further read locks would
> > be granted that would conflict with that waiting write lock.
>
> ...which reminds me -- if anyone implements writer priority, please
> make it optional (either w/ a writer-priority-ignoring read lock or a
> non-priority-granting write lock). I have an application for which
> writer priority would be really annoying.

Is it something you could describe briefly?

--b.

>
> Even better: Have read-lock-and-wait-for-pending-writers
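
The three-process scenario quoted above can be traced with a small toy model.
This is only a sketch for following the argument, not the kernel's lock
manager: the names FileLock, Deadlock, read_lock, upgrade, unlock and
three_process_scenario are all hypothetical, the lock covers the whole file,
and "deadlock" is reduced to the single-lock case discussed in the thread
(a second reader asks to upgrade while another reader is already waiting to
upgrade). The writer_priority flag models the behaviour Frank describes, where
a waiting writer stops new read locks from being granted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


class Deadlock(Exception):
    """Stands in for the -EDEADLK result mentioned in the thread."""


@dataclass
class FileLock:
    """Toy whole-file reader/writer lock (hypothetical, not kernel code)."""
    writer_priority: bool = False
    readers: Set[int] = field(default_factory=set)
    writer: Optional[int] = None
    waiting_upgraders: List[int] = field(default_factory=list)

    def read_lock(self, pid: int) -> bool:
        """Return True if granted, False if the caller would block."""
        if self.writer is not None:
            return False
        # With writer priority, a pending writer holds off new readers.
        if self.writer_priority and self.waiting_upgraders:
            return False
        self.readers.add(pid)
        return True

    def upgrade(self, pid: int) -> bool:
        """Try to turn pid's read lock into a write lock.

        Returns True if granted, False if the caller would block, and
        raises Deadlock if another current reader is already waiting to
        upgrade (each would be waiting for the other to unlock).
        """
        assert pid in self.readers
        others = self.readers - {pid}
        if not others:
            self.readers.clear()
            self.writer = pid
            if pid in self.waiting_upgraders:
                self.waiting_upgraders.remove(pid)
            return True
        if any(w in others for w in self.waiting_upgraders):
            raise Deadlock(f"pid {pid} and a waiting upgrader block each other")
        if pid not in self.waiting_upgraders:
            self.waiting_upgraders.append(pid)
        return False

    def unlock(self, pid: int) -> None:
        """Drop whatever pid holds, then grant a pending upgrade if possible."""
        self.readers.discard(pid)
        if self.writer == pid:
            self.writer = None
        if pid in self.waiting_upgraders:
            self.waiting_upgraders.remove(pid)
        for w in list(self.waiting_upgraders):
            if self.readers == {w}:  # the waiter is now the only reader
                self.readers.clear()
                self.writer = w
                self.waiting_upgraders.remove(w)


def three_process_scenario(writer_priority: bool) -> bool:
    """Replay the 3-process example; return whether 2's new read lock is granted."""
    lock = FileLock(writer_priority=writer_priority)
    for pid in (1, 2, 3):
        assert lock.read_lock(pid)
    assert lock.upgrade(1) is False  # blocks on the read locks held by 2 and 3
    try:
        lock.upgrade(2)
    except Deadlock:
        lock.unlock(2)  # 3 still holds a read lock, so 1 cannot be upgraded yet
    return lock.read_lock(2)
```

In this model three_process_scenario(False) returns True: process 2 gets its
read lock back immediately and process 1 is still waiting, which is the lack
of forward progress described above. three_process_scenario(True) returns
False, matching the reading that a waiting writer would make process 2 block.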