Re: A unresponsive file system can hang all I/O in the system onlinux-2.6.23-rc6 (dirty_thresh problem?)

From: Peter Zijlstra
Date: Fri Sep 28 2007 - 09:35:59 EST

On Fri, 2007-09-28 at 07:28 -0600, Jonathan Corbet wrote:
> Andrew wrote:
> > It's unrelated to the actual value of dirty_thresh: if the machine fills up
> > with dirty (or unstable) NFS pages then eventually new writers will block
> > until that condition clears.
> >
> > 2.4 doesn't have this problem at low levels of dirty data because 2.4
> > VFS/MM doesn't account for NFS pages at all.
> Is it really NFS-related? I was trying to back up my 2.6.23-rc8 system
> to an external USB drive the other day when something flaked and the
> drive fell off the bus. That, too, was sufficient to wedge the entire
> system, even though the only thing which needed the dead drive was one
> rsync process. It's kind of a bummer to have to hit the reset button
> after the failure of (what should be) a non-critical piece of hardware.
> Not that I have a fix to propose...:)

the per bdi work in -mm should make the system not drop dead.

Still, would a remove,re-insert of the usb media end up with the same
bdi? That is, would it recognise as the same and resume the transfer.

Anyway, it would be grand (and dangerous) if we could provide for a
button that would just kill off all outstanding pages against a dead

Attachment: signature.asc
Description: This is a digitally signed message part