Re: 2.6.1 IO lockup on SMP systems

From: Andrew Morton
Date: Sat Feb 21 2004 - 14:32:49 EST


"Sergey S. Kostyliov" <rathamahata@xxxxxxx> wrote:
>
> Hello Andrew,
>
> On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> > "Sergey S. Kostyliov" <rathamahata@xxxxxxx> wrote:
> > >
> > > I have experienced lockups on three of my servers with 2.6.1. It doesn't
> > > look like a deadlock: the box is still pingable, and all TCP ports that were
> > > in the listen state before a lockup remain in the listen state, but I can't
> > > get any data from these ports. According to sar(1), the systems were not
> > > overloaded right before a lockup, and there are no entries in any of the
> > > user services' logs for almost 10 hours after the lockup.
> >
> > Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
> > backtrace on a locked machine with sysrq-T or `echo t >
> > /proc/sysrq-trigger'. Then send us the resulting trace.
>
> I've just reproduced this lockup with 2.6.3.
>
> >
> > You may need a serial console to be able to capture all the output.
> >
> > Also, it would be useful to know what sort of load the machines are under,
> > and what filesystems are in use.
>
> The machine is an HTTP server. The main applications are:
> 1) apache 1.3 which serves php pages (mod_php):
> 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> 54 requests currently being processed, 19 idle servers
> 2) mysql:
> Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
>
> This is an I/O-bound machine in general. All filesystems are reiserfs.
>
> Here is the sysrq-T output obtained from a locked box via serial console:

OK, so everything is stuck trying to allocate memory. Perhaps you ran out
of swap space, or some process has gone berserk allocating memory.

How much memory does the machine have, and how much swap space?
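
One way to answer that is to read the totals straight from /proc/meminfo.
The following is only a minimal sketch: report_mem is a hypothetical helper
name, and it assumes the usual MemTotal/MemFree/SwapTotal/SwapFree fields
are present.

```shell
#!/bin/sh
# Print RAM and swap totals and free amounts as reported by /proc/meminfo.
# report_mem is a hypothetical helper name, not part of any existing tool.
report_mem() {
    # $1: meminfo-style file to read (assumed default: /proc/meminfo)
    awk '/^(MemTotal|MemFree|SwapTotal|SwapFree):/ { print $1, $2, $3 }' \
        "${1:-/proc/meminfo}"
}

report_mem
```

The output of `free' gives the same information in a different layout.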

I suggest that you run a `vmstat 30' trace on a terminal somewhere and see
what it reports just before the hangs. Also capture the sysrq-M output after
it has hung.

It would be useful to monitor the contents of /proc/vmstat also.

And perhaps keep top running in `sort by memory usage' mode.
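
The /proc/vmstat monitoring could be done with something like the sketch
below, which appends a timestamped copy of the file to a log at a fixed
interval. The helper names (snapshot_vmstat, log_vmstat), the 30-second
default, and the log path in the usage note are all assumptions made for
illustration. Bear in mind that once the box is stuck allocating memory,
writes to a local file may stall as well, so it is safer to run this in a
remote session or on the serial console and keep the output off the
affected machine.

```shell
#!/bin/sh
# Periodically append a timestamped copy of /proc/vmstat to a log file.
# snapshot_vmstat and log_vmstat are hypothetical helper names.

snapshot_vmstat() {
    # $1: log file to append to
    # $2: source file (assumed default: /proc/vmstat)
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        cat "${2:-/proc/vmstat}"
    } >> "$1"
}

log_vmstat() {
    # $1: log file
    # $2: interval in seconds (assumed default: 30, to match `vmstat 30')
    # $3: number of snapshots to take; 0 means run until interrupted
    # $4: source file, passed through to snapshot_vmstat
    log=$1
    interval=${2:-30}
    count=${3:-0}
    src=${4:-/proc/vmstat}
    n=0
    while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
        snapshot_vmstat "$log" "$src"
        n=$((n + 1))
        sleep "$interval"
    done
}
```

For example, `log_vmstat /var/tmp/vmstat.log 30' would take a snapshot every
30 seconds until interrupted; the last few snapshots before a hang can then
be compared against the `vmstat 30' trace.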