Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP

From: Jesper Juhl
Date: Thu Nov 23 2006 - 05:27:50 EST


On 22/11/06, Stephen Hemminger <shemminger@xxxxxxxx> wrote:
On Wed, 22 Nov 2006 13:58:11 +0100
"Jesper Juhl" <jesper.juhl@xxxxxxxxx> wrote:

> On 22/11/06, Jesper Juhl <jesper.juhl@xxxxxxxxx> wrote:
> > On 22/11/06, David Chinner <dgc@xxxxxxx> wrote:
> > > On Tue, Nov 21, 2006 at 11:02:23PM +0100, Jesper Juhl wrote:
> > > > On 21/11/06, David Chatterton <chatz@xxxxxxxxxxxxxxxxx> wrote:
> ...
> > > > >Thanks for traces, I've captured this information.
> > > > >
> > > > You are welcome. If you want/need more traces then I've got ~2.1G
> > > > worth of traces that you can have :)
> > >
> > > Well, we don't need that many, but it would be nice to have a
> > > set of unique traces that lead to overflows - could you process
> > > them in some way just to extract just the unique XFS traces that
> > > occur?
> > >
> > I'll try to extract a copy of each unique trace that involves xfs,
> > sometime tomorrow or the day after, and then send you the result.
> >
>
> Attached are two files. The one named stack_overflows.txt.gz contains
> one instance of each unique stack overflow + trace that I've got. The
> other file named kernel_BUG.txt.gz contains a few BUG() messages that
> were also in the logs.
>

You have a kind of worst case scenario there:
XFS + Block layer
TCP receive/transmit
VLAN

It is hard to know who to blame, there is no information about stack
level at each call. Since it doesn't show up for filesystems other than
XFS, I would pick on that. Perhaps the following:

Well, there's a very good explanation for that. The server has nothing
but XFS file systems (well, /boot is ext3, but that doesn't get used
for anything but the kernel image and System.map file).


--- 2.6.19-rc6.orig/arch/i386/Kconfig.debug 2006-11-22 11:59:32.000000000 -0800
+++ 2.6.19-rc6/arch/i386/Kconfig.debug 2006-11-22 12:00:28.000000000 -0800
@@ -58,7 +58,7 @@

config 4KSTACKS
bool "Use 4Kb for kernel stacks instead of 8Kb"
- depends on DEBUG_KERNEL
+ depends on DEBUG_KERNEL && !XFS_FS
help
If you say Y here the kernel will use a 4Kb stacksize for the
kernel stack attached to each process/thread. This facilitates

Using 8K stacks works arround the problem, so for now the above patch
may well make sense. But, it would be better to get XFS fixed rather
than start adding dependencies for 4KSTACKS - it might be troublesome
getting rid of it again.

--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/