Re: [linus:master] [hugetlb] 7118fc2906: kernel_BUG_at_lib/list_debug.c

From: Mike Kravetz
Date: Tue Jan 17 2023 - 15:01:20 EST


On 01/17/23 15:10, kernel test robot wrote:
>
> +Vlastimil Babka, Hyeonggon Yoo, Feng Tang and Fengwei Yin
>
> Hi, Mike Kravetz,
>
> we reported
> "[linus:master] [mm, slub] 0af8489b02: kernel_BUG_at_include/linux/mm.h" [1]
>
> Vlastimil, Hyeonggon, Feng and Fengwei gave us a lot of great guidance based on
> it, and, particularly, after enabling the config options below per Vlastimil's suggestion
> CONFIG_DEBUG_PAGEALLOC
> CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT
> CONFIG_SLUB_DEBUG
> CONFIG_SLUB_DEBUG_ON
> and running more tests, we realized that "0af8489b02" is not the real culprit.
>
> A new bisection was triggered, and it finally pointed to this commit, "7118fc2906".
>
> Although the reports are for different issues
> ("kernel_BUG_at_include/linux/mm.h" for 0af8489b02 vs.
> "kernel_BUG_at_lib/list_debug.c" for this commit),
> Feng and Fengwei helped to further confirm that they are similar.
> They will supply a more technical analysis later.
>
> Please note that the issues do not always reproduce
> (~10% on this commit or 0af8489b02)

Nice work!

From other replies in this thread, it does not appear the actual code change
made by this commit is the root cause. Rather, the change is triggering some
other bug ... perhaps in the compiler?

I will start looking at this. However, I suspect others have more skill and
experience in this type of debugging.
--
Mike Kravetz