Re: Random freeze (Re: mmotm 2008-11-19-02-19 uploaded)

From: Valdis . Kletnieks
Date: Fri Nov 21 2008 - 00:23:38 EST


On Thu, 20 Nov 2008 15:20:54 PST, Andrew Morton said:

> The traditional cause of the above trace is that someone mucked up the
> block/driver/irq-routing layer and we lost an IO completion.

Yes, that would explain all the symptoms and tracebacks - everybody comes
to a screeching halt the next time they try to go to disk, while the actual
disk drive is showing zero activity.

> It's also of course possible (but less common) that someone mucked up
> the VFS. It would be interesting to revert
> do_mpage_readpage-dont-submit-lots-of-small-bios-on-boundary.patch.

I'm seeing an MTBF of about 2-3 hours when actually applying an I/O load to the
system. I'll try reverting that patch, and if it survives an entire day or
two, it will be pretty strong circumstantial evidence that this patch is the culprit...
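
A rough sanity check on that reasoning, as a sketch only: it assumes the
freezes are memoryless (exponentially distributed time to failure) at the
observed MTBF, which is an assumption and not something measured here. The
function name survival_probability is made up for illustration. Under that
model, the chance of a still-buggy kernel surviving t hours of load is
exp(-t / MTBF):

```python
import math


def survival_probability(hours_under_load: float, mtbf_hours: float) -> float:
    """Chance of seeing no freeze for `hours_under_load`, assuming
    exponentially distributed failures with mean `mtbf_hours`.
    (Hypothetical helper; the exponential model is an assumption.)"""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if hours_under_load < 0:
        raise ValueError("hours_under_load must be non-negative")
    return math.exp(-hours_under_load / mtbf_hours)


# Observed MTBF range from the report: about 2-3 hours under I/O load.
for mtbf in (2.0, 3.0):
    for hours in (24.0, 48.0):
        p = survival_probability(hours, mtbf)
        print(f"MTBF {mtbf:.0f}h, {hours:.0f}h of load: P(no freeze) = {p:.2e}")
```

Even at the optimistic end (MTBF of 3 hours), a full day without a freeze
would be well under a one-in-a-thousand event if the bug were still present.
The caveat is that the hours only count while the I/O load is actually being
applied.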
