Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten

From: Vegard Nossum
Date: Fri Jul 18 2008 - 03:04:22 EST

On Fri, Jul 18, 2008 at 4:03 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>> On Thu, Jul 17, 2008 at 11:42 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
>> >
>> > A regression to v2.6.26:
>> >
>> > I started getting this skb-head corruption message today, on a T60
>> > laptop with e1000:
>> >
>> > PM: Removing info for No Bus:vcs11
>> > device: 'vcs11': device_create_release
>> > =============================================================================
>> > BUG skbuff_head_cache: Poison overwritten
>> > -----------------------------------------------------------------------------
>> >
>> > INFO: 0xf658ae9c-0xf658ae9c. First byte 0x6a instead of 0x6b
>> 1. Notice the range. It's just a single byte.
>> 2. Notice the value. It's just a ++.
> It's supposed to be 0x6b, this would be a "--"

You're right! Oops. In my defence, I wrote that at 2 AM last night ;-)

> Also it (more likely IMHO) could be clearing a flag with the value 0x01.

It could be. But as I said in a later e-mail, the corrupted field is
likely sk_buff->truesize, which is not a flags variable. It _is_,
however, a counter, and one that is frequently -= and atomic_sub()ed.

That field is also an int, not a byte as I suggested above. That is
fine, though: a "--" on an int can of course legitimately change just
the lowest byte of that int.
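
To make the two readings concrete, here is a minimal userspace sketch
(not kernel code). It assumes only that freed objects are filled with
the 0x6b poison byte, as the report above shows; the helper names are
made up for illustration. It fills an int with poison, applies "--",
and checks that exactly one byte changes, to 0x6a. It also shows that
clearing a 0x01 flag in a poisoned byte gives the same 0x6a, so the
report by itself cannot tell the two cases apart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b	/* poison byte seen in the report above */

/* Fill an int with the poison byte, then apply "--" to it. */
static int poisoned_int_after_decrement(void)
{
	int field;

	memset(&field, POISON_FREE, sizeof(field));
	field--;		/* no borrow: lowest byte goes 0x6b -> 0x6a */
	return field;
}

/* Count the bytes of 'field' that no longer hold the poison value. */
static size_t count_non_poison_bytes(int field)
{
	unsigned char bytes[sizeof(field)];
	size_t i, n = 0;

	memcpy(bytes, &field, sizeof(field));
	for (i = 0; i < sizeof(field); i++)
		if (bytes[i] != POISON_FREE)
			n++;
	return n;
}

/* Return the first non-poison byte, or POISON_FREE if none differs. */
static unsigned char first_non_poison_byte(int field)
{
	unsigned char bytes[sizeof(field)];
	size_t i;

	memcpy(bytes, &field, sizeof(field));
	for (i = 0; i < sizeof(field); i++)
		if (bytes[i] != POISON_FREE)
			return bytes[i];
	return POISON_FREE;
}

/* Clearing a 0x01 flag in a poisoned byte gives the same result. */
static unsigned char poison_with_flag_cleared(void)
{
	return (unsigned char)(POISON_FREE & ~0x01);
}

/* Aborts via assert() if any of the observations does not hold. */
void poison_selfcheck(void)
{
	int field = poisoned_int_after_decrement();

	assert(count_non_poison_bytes(field) == 1);
	assert(first_non_poison_byte(field) == 0x6a);
	assert(poison_with_flag_cleared() == 0x6a);
	assert((0x6a ^ POISON_FREE) == 0x01);	/* a single bit differs */
}
```

Calling poison_selfcheck() from any main() runs the checks. The byte
count comes out as one on both little- and big-endian machines; only
the address of the changed byte within the int differs.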

But... it could also be some random corruption coming from elsewhere.
Maybe even bad RAM (it's just a single bit anyway). But that's less


"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
