Re: [TIP] BUG kmalloc-4096: Poison overwritten (ath5k_rx_skb_alloc)

From: Bob Copeland
Date: Sat Mar 07 2009 - 22:09:49 EST


On Fri, Mar 06, 2009 at 09:42:49AM +0000, Sitsofe Wheeler wrote:
> > parallel with iwlist wlan0 scan (as root, so scans are actually
> > performed), in parallel with iperf or ping. I didn't personally have
> > luck with that workload, though.

So I looked through this log for a few hours today. Sorry to say
that I don't have any answers, but here's a summary of what I saw:

- it didn't seem like there were any obvious race conditions at play;
that is, I didn't see other ath5k_XXX functions being pre-empted
by ath5k_intr, followed by the softirq.

- there were a few errors prior to catching the poison. I don't think
the trace contains enough info to say whether they were phy errors,
unsupported jumbo type errors, or whatever. Anyway there didn't seem
to be any obvious causal pattern, nothing like an error on the
40th previous buffer or a cascading series of errors. Of course,
the error could have happened much earlier compared to when the skbuff
in the freelist got reused.

At this point, I guess the best way forward is a special debug patch
that logs when an skb gets allocated, when we pass it up the stack, and
what is in the descriptors.
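
To make the idea concrete, here is a minimal userspace sketch of such
a history log. It is not driver code: every name in it (rx_trace,
rx_trace_add, rx_trace_last, the two status words) is hypothetical.
It keeps the last RX_TRACE_LEN events so that, when the poison check
fires, the most recent event for the affected skb can be looked up
even if it happened many buffers earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_TRACE_LEN 64 /* power of two, so the index can be masked */

/* Hypothetical event types: allocation and hand-off up the stack. */
enum rx_event { RX_EV_NONE = 0, RX_EV_ALLOC, RX_EV_PASS_UP };

/* One log entry: which skb, which descriptor, and what it contained. */
struct rx_trace_rec {
	enum rx_event ev;
	uintptr_t skb;      /* address of the skb, used as an identifier */
	uint32_t daddr;     /* bus address of the descriptor */
	uint32_t status[2]; /* assumed: two descriptor status words */
};

struct rx_trace {
	struct rx_trace_rec rec[RX_TRACE_LEN];
	size_t next;        /* total number of records written so far */
};

/* Append a record, overwriting the oldest one once the log is full. */
static void rx_trace_add(struct rx_trace *t, enum rx_event ev,
			 uintptr_t skb, uint32_t daddr,
			 uint32_t st0, uint32_t st1)
{
	struct rx_trace_rec *r;

	assert(t != NULL);
	r = &t->rec[t->next++ & (RX_TRACE_LEN - 1)];
	r->ev = ev;
	r->skb = skb;
	r->daddr = daddr;
	r->status[0] = st0;
	r->status[1] = st1;
}

/* Return the newest record for skb, or NULL if none is still logged. */
static const struct rx_trace_rec *rx_trace_last(const struct rx_trace *t,
						uintptr_t skb)
{
	size_t n, k;

	assert(t != NULL);
	n = t->next < RX_TRACE_LEN ? t->next : RX_TRACE_LEN;
	for (k = 1; k <= n; k++) {
		const struct rx_trace_rec *r =
			&t->rec[(t->next - k) & (RX_TRACE_LEN - 1)];
		if (r->skb == skb)
			return r;
	}
	return NULL;
}
```

A zero-initialized struct rx_trace is an empty log. In a real patch
the two call sites would presumably be the allocation path and the
point where the frame is handed to the stack, with the lookup done
from wherever the corruption is detected.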

Jiri, I really think we should implement that better check for the
self linked descriptor using the rxdp register. bf_last is no longer a
valid marker for the self-linked descriptor at the end of the loop since
we re-add the just-processed descriptor every time through the loop
(or am I missing something?)... If you want I'll cook up a patch for
that too.
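
For illustration only, a minimal userspace model of that check. The
structures and names (rx_ring, ring_readd, ring_process) are
hypothetical simplifications rather than the driver's own, and the
rxdp argument stands in for whatever value the hardware's RX
descriptor pointer register would return. The loop stops at the
buffer whose descriptor address matches rxdp instead of comparing
against a tail pointer captured before the loop, because each re-add
moves the self-linked tail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING 4

/* Hypothetical, simplified stand-ins for the RX buffer/descriptor. */
struct rx_buf {
	uint32_t daddr;   /* bus address of this buffer's descriptor */
	uint32_t ds_link; /* bus address of the next descriptor */
	bool done;        /* set once the "hardware" completed the frame */
};

struct rx_ring {
	struct rx_buf bufs[RING];
	size_t head;      /* next buffer to process */
	size_t tail;      /* current self-linked buffer */
};

/* Chain the buffers in order; the last one links to itself. */
static void ring_init(struct rx_ring *r)
{
	size_t i;

	assert(r != NULL);
	for (i = 0; i < RING; i++) {
		r->bufs[i].daddr = 0x1000 + 0x10 * (uint32_t)i;
		r->bufs[i].done = false;
	}
	for (i = 0; i + 1 < RING; i++)
		r->bufs[i].ds_link = r->bufs[i + 1].daddr;
	r->bufs[RING - 1].ds_link = r->bufs[RING - 1].daddr;
	r->head = 0;
	r->tail = RING - 1;
}

/* Re-add a processed buffer: it becomes the new self-linked tail. */
static void ring_readd(struct rx_ring *r, size_t i)
{
	struct rx_buf *bf = &r->bufs[i];

	bf->done = false;
	bf->ds_link = bf->daddr;
	r->bufs[r->tail].ds_link = bf->daddr;
	r->tail = i;
}

/* Process completed buffers, stopping at the one rxdp points to. */
static size_t ring_process(struct rx_ring *r, uint32_t rxdp)
{
	size_t handled = 0;

	while (handled < RING) {
		size_t i = r->head;
		struct rx_buf *bf = &r->bufs[i];

		if (bf->daddr == rxdp) /* hardware may still be using it */
			break;
		if (!bf->done)
			break;
		r->head = (i + 1) % RING;
		ring_readd(r, i);
		handled++;
	}
	return handled;
}
```

With buffers 0 to 2 marked done and rxdp set to buffer 2's address,
ring_process() handles buffers 0 and 1 and stops at 2. By then the
tail has moved from index 3 to index 1, which is the reason a marker
taken before the loop goes stale.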

--
Bob Copeland %% www.bobcopeland.com
