Re: Killing POSIX deadlock detection

From: Trond Myklebust
Date: Sun Jun 06 2004 - 15:53:51 EST


På su , 06/06/2004 klokka 16:09, skreiv Eric W. Biederman:
> Trond Myklebust <trond.myklebust@xxxxxxxxxx> writes:
>
> > På su , 06/06/2004 klokka 09:27, skreiv Matthew Wilcox:
> > \
> > > > T1 locks file F1 -> lock (P1, F1)
> > > > P2 locks file F2 -> lock (P2, F2)
> > > > P2 locks file F1 -> blocks against (P1, F1)
> > > > T1 locks file F2 -> blocks against (P2, F2)
> > >
> > > Less contrived example -- T2 locks file F2. We report deadlock here too,
> > > even though T1 is about to unlock file F1.
>
> There is a fairly sane linux specific definition here. We should
> track these things not by pid or tid, but by struct files_struct.

RTFC... Look carefully in fs/locks.c at stuff like posix_same_owner().
We currently use both the tgid and the struct files_struct (although
there are a few notable bugs where we only check the one or the
other)...

That is, however, a definition which breaks the SUS standards, and it
therefore ends up introducing pathologies such as the steal_locks crap.
struct files_struct is NOT a sane basis for tracking locks.

> > Yes: As Chuck points out, that is a fairly nasty change of the userland
> > API.
>
> ???? Failing to detect a deadlock is not a change in the API.
> It is simply a change in behavior.

It is a change in functionality from one where potential deadlocks are
detected and reported as errors to one where deadlocks are suddenly
possible. Are you saying that functionality is not a part of the API?


> Perhaps what we should do is simply not attempt to detect deadlocks
> involving threaded processes.

So how do you define (and detect) a threaded process?

Trond
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/