Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1

From: Paul Jackson
Date: Thu Jul 29 2004 - 01:45:20 EST


I just hit what might be the same oops.

I had not upgraded my working kernel for a month, and just now, when I
upgraded to 2.6.8-rc2-mm1, running sn2_defconfig on a small SN2 system,
it fails to boot every time, ending with an Oops that starts out with:

======================================================
Freeing unused kernel memory: 320kB freed
Unable to handle kernel NULL pointer dereference (address 0000000000000008)
swapper[0]: Oops 8813272891392 [1]
Modules linked in:

Pid: 0, CPU 0, comm: swapper
psr : 0000101008022018 ifs : 8000000000000e20 ip : [<a0000001000bd710>] Not tainted
ip is at find_busiest_group+0xb0/0x640
======================================================

I added a conditional printk_ratelimit'ed print at the top of
find_busiest_group() whenever group is NULL, just before the first
dereference of group in the line:

local_group = cpu_isset(this_cpu, group->cpumask);

That print fires about 20,480 times in each 5-second suppression window.

But it does boot if I also add code to break out of the "do { ... } while
(group != sd->groups)" loop whenever group goes NULL.
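
For illustration, the shape of that workaround can be modeled in userspace
roughly as below. This is only a sketch under stated assumptions: the struct
layouts, the single-word cpumask, and the walk_groups() helper are simplified
stand-ins invented for the example, not the kernel's real definitions and not
the actual debugging patch. In the kernel the NULL branch would also carry the
rate-limited print.

```c
/*
 * Userspace model only. These structs are simplified, hypothetical
 * stand-ins for the scheduler's sched_domain / sched_group; the real
 * kernel types have more fields and use cpumask_t, not a single word.
 */
#include <assert.h>
#include <stddef.h>

struct sched_group {
    struct sched_group *next;   /* expected to form a circular list */
    unsigned long cpumask;      /* one bit per CPU (simplified) */
};

struct sched_domain {
    struct sched_group *groups; /* head of the circular group list */
};

/*
 * Walk the groups with the same "do { ... } while (group != sd->groups)"
 * shape, but break out as soon as group is NULL instead of dereferencing
 * it. Returns the number of groups visited. *hit_null is set to 1 if the
 * walk ended on a NULL group rather than wrapping back to the list head;
 * *local_seen is set to 1 if any visited group contains this_cpu.
 */
static int walk_groups(const struct sched_domain *sd, int this_cpu,
                       int *local_seen, int *hit_null)
{
    const struct sched_group *group;
    int visited = 0;

    assert(sd != NULL);
    group = sd->groups;
    *local_seen = 0;
    *hit_null = 0;

    do {
        if (group == NULL) {
            /* The kernel hack would emit its rate-limited print here. */
            *hit_null = 1;
            break;
        }
        /* Stand-in for: local_group = cpu_isset(this_cpu, group->cpumask); */
        if (group->cpumask & (1UL << this_cpu))
            *local_seen = 1;
        visited++;
        group = group->next;
    } while (group != sd->groups);

    return visited;
}
```

With a properly circular two-group list the walk visits both groups and wraps
normally; if the second group's next pointer is NULL, the walk still visits
both groups but then stops on the NULL check instead of faulting. Note that
this only hides the symptom: it says nothing about why the group list is
broken in the first place.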

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.650.933.1373