Re: [PATCH v3 1/3] mm: kmemleak: Make the tool tolerant to struct scan_area allocation failures

From: Catalin Marinas
Date: Mon Oct 07 2019 - 05:57:03 EST


On Sat, Oct 05, 2019 at 01:08:43PM +1000, Alexey Kardashevskiy wrote:
> On 03/10/2019 18:41, Catalin Marinas wrote:
> > On Thu, Oct 03, 2019 at 04:13:07PM +1000, Alexey Kardashevskiy wrote:
> >> On 13/08/2019 02:06, Catalin Marinas wrote:
> >>> Object scan areas are an optimisation aimed to decrease the false
> >>> positives and slightly improve the scanning time of large objects known
> >>> to only have a few specific pointers. If a struct scan_area fails to
> >>> allocate, kmemleak can still function normally by scanning the full
> >>> object.
> >>>
> >>> Introduce an OBJECT_FULL_SCAN flag and mark objects as such when
> >>> scan_area allocation fails.
> >>
> >> I came across this one while bisecting sudden drop in throughput of a
> >> 100Gbit Mellanox CX4 ethernet card in a PPC POWER9 system, the speed
> >> dropped from 100Gbit to about 40Gbit. Bisect pointed at dba82d943177,
> >> this are the relevant config options:
> >>
> >> [fstn1-p1 kernel]$ grep KMEMLEAK ~/pbuild/kernel-le-4g/.config
> >> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> >> CONFIG_DEBUG_KMEMLEAK=y
> >> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=16000
> >> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
> >> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
> >> CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
> >
> > The throughput drop is probably caused caused by kmemleak slowing down
> > all memory allocations (including skb). So that's expected. You may get
> > similar drop with other debug options like lock proving, kasan.
>
> I was not precise. I meant that before dba82d943177 kmemleak would
> work but would not slow network down (at least 100Gbit) and now it
> does which is downgrade so I was wondering if kmemleak just got so
> much better to justify this change or there is a bug somewhere, so
> which one is it? Or "LOG_SIZE=400" never really worked?

I suspect LOG_SIZE=400 never worked on your system (you can check the
log for any kmemleak disabled messages).

Thanks for testing the patch.

--
Catalin