Re: 2.6.0-test7 DEBUG_PAGEALLOC oops

From: Manfred Spraul
Date: Sat Oct 11 2003 - 07:08:05 EST


Mike Galbraith wrote:


eax: 00000000 ebx: c7802f98 ecx: c0301390 edx: c030138c
esi: c0349ffe edi: 017e0008 ebp: c0349da6 esp: c0349d96
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)

The esp value is sane, the stack is at 0xc0348000, and the fault is at 0xc034a000: just past the end of the stack.

I'm blind. The esp value is the culprit:
It's not 32-bit aligned. Someone misaligned the stack, and thus
if(stack_ptr & (THREAD_SIZE-1))
didn't notice the end of the stack.
The generated assembly of store_stackinfo is correct:
1d2: f7 c6 ff 1f 00 00 test $0x1fff,%esi
Check sptr against THREAD_SIZE-1.
1d8: 74 21 je 1fb <store_stackinfo+0x6f>
1da: 8b 3e mov (%esi),%edi
And load *sptr.
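
To make the failure mode concrete, here is a rough user-space sketch of the walk, not the kernel code itself. It assumes an 8 KiB stack (taken from the 0x1fff mask in the disassembly) and a pointer that advances one 32-bit word per step; the names walk_until_boundary and load_crosses_boundary are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 KiB kernel stack, matching the 0x1fff mask above. */
#define THREAD_SIZE 0x2000u

/*
 * Advance a 32-bit stack pointer one word at a time and stop when it
 * reaches a THREAD_SIZE boundary. Returns the address at which the walk
 * stops, or 0 if the boundary was not detected within max_words steps.
 */
static uint32_t walk_until_boundary(uint32_t sptr, unsigned max_words)
{
    assert(max_words > 0);
    for (unsigned i = 0; i < max_words; i++) {
        if ((sptr & (THREAD_SIZE - 1)) == 0)
            return sptr;   /* end of stack detected */
        sptr += 4;         /* the real code would read *sptr here */
    }
    return 0;              /* boundary never detected */
}

/* True if a 4-byte load at sptr reaches past the end of its stack. */
static int load_crosses_boundary(uint32_t sptr)
{
    return (sptr & (THREAD_SIZE - 1)) + 4 > THREAD_SIZE;
}
```

Starting from an aligned value such as 0xc0349d94, the walk stops at 0xc034a000. Starting from the esp in the oops, 0xc0349d96, the pointer is always 2 modulo 4, so the mask test never sees zero; the pointer reaches 0xc0349ffe (the esi value above) and the 4-byte load at that address reaches into the next page.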


It looks like store_stackinfo accesses memory behind the end of the stack.


Yeah, I'm trying to figure out why. The below (if dang mailer actually inlines it) kludge allows me to boot, so I suppose I need to ponder addr wrt _stext and _etext.

Wrong direction: Right now it crashes because it runs over the end of the stack.
With your patch applied, the allocated object is too small to hold all entries on the stack, and thus store_stackinfo aborts before it runs into the next page.

I'd increase kstack_depth_to_print to 140. Do not increase it too much, otherwise it will oops due to the misaligned stack.
Then check the EBP values: They are pushed after the return address. The return addresses are listed in the Call Trace section.
Example:
0xc01316aa8 pushes 0xc0349dd6 -> odd.
0xc0131b6c pushes 0xc0349de6 -> odd.
0xc0131b3e pushes 0xc0349e02 -> odd.

Proper values for EBP are multiples of 4. Once you find where the stack got misaligned, disassemble the offending function (or send me the .o file).
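
If the stack dump gets long, the check can be done mechanically. A minimal sketch, assuming you copy the saved EBP values out of the dump into an array by hand; first_misaligned is a hypothetical helper, not something in the kernel tree.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the first saved EBP value that is not a multiple
 * of 4, or -1 if every value is properly aligned.
 */
static int first_misaligned(const uint32_t *ebp, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ebp[i] & 3u)
            return (int)i;
    }
    return -1;
}
```

Feeding it the three values from the example (0xc0349dd6, 0xc0349de6, 0xc0349e02) reports index 0, since all of them are 2 modulo 4. Listing the frames from the outermost caller inward should make the first reported index point near the frame where the alignment was lost.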


--
Manfred
