Re: linux-next boot error: WARNING in kmem_cache_free

From: Eric Biggers
Date: Sat Jun 27 2020 - 19:10:26 EST


[+Cc linux-mm; +Bcc linux-fsdevel]

On Mon, Jun 22, 2020 at 03:28:09AM -0400, Qian Cai wrote:
>
>
> > On Jun 22, 2020, at 2:42 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > There is a reason, it's still important for us.
> > But also it's not our strategy to deal with bugs by not testing
> > configurations and closing eyes on bugs, right? If it's an official
> > config in the kernel, it needs to be tested. If SLAB is in the state
> > that we don't care about any bugs in it, then we need to drop it. It
> > will automatically remove it from all testing systems out there. Or at
> > least make it "depends on BROKEN" to slowly phase it out during
> > several releases.
>
> Do you mind sharing what's your use cases with CONFIG_SLAB? The only thing prevents it from being purged early is that it might perform better with a certain type of networking workloads where syzbot should have nothing to gain from it.
>
> I am more of thinking about the testing coverage that we could use for syzbot to test SLUB instead of SLAB. Also, I have no objection for syzbot to test SLAB, but then from my experience, you are probably on your own to debug further with those testing failures. Until you are able to figure out the buggy patch or patchset introduced the regression, I am afraid not many people would be able to spend much time on SLAB. The developers are pretty much already half-hearted on it by only fixing SLAB here and there without runtime testing it.
>

This bug also got reported 2 days later by the kernel test robot
(https://lore.kernel.org/lkml/20200623090213.GW5535@shao2-debian/).
Then it was fixed by commit 437edcaafbe3, so telling syzbot:

#syz fix: mm, slab/slub: improve error reporting and overhead of cache_from_obj()-fix

If CONFIG_SLAB is no longer useful and supported then it needs to be removed
from the kernel. Otherwise, it needs to be tested just like all other options.

- Eric