Re: MM deadlock [was: Re: arca-vm-8...]

Peter T. Breuer (ptb@it.uc3m.es)
Tue, 12 Jan 1999 00:59:18 +0100 (MET)


"A month of sundays ago Stephen C. Tweedie wrote:"
> > Is this only matter of nbd? If so, maybe the best solution is to start
> > claiming: "don't swap over nbd, don't mount localhost drives read
> > write". [It is bad, but it is probably better than polluting rest of
> > kernel with nbd workarounds...]
>
> No. Any other process which gets in the way of our IO and which blocks
> for memory allocation can cause the deadlock. That might be another
> process doing a file IO, locking a buffer and then allocating memory
> inside the scsi layers, for example. It is not limited to nbd, but
> nbd's networking use will probably make it particularly bad.

Very interesting. I am still seeking clues for why my backport of
Pavels nbd to 2.0.* locks up sometimes under 2.0.36, with an
"impossible" condition, but runs perfectly under 2.0.25 - even in a
kernel in which the nbd device is treated as pluggable. What I see
reminds me strongly of what you describe (still, I'm OK - I use 2.0.25
:).

> Linus, I've also realised that making semaphores recursive does not fix
> the inode deadlock. It only eliminates the single process case. We can
> still have two separate processes each writing to a separate mmaped()
> file deadlock. If each process starts a msync() on its own file and in

like ckraid --fix on /dev/nd0 and /dev/nd1, maybe :-). Or dd to a file
that's mirrored across the two.

> the process of that tries to sync one of the other process's pages via
> try_to_free_page, we get the deadlock back.

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/