Re: ODEBUG: Out of memory. ODEBUG disabled

From: Qian Cai
Date: Mon Nov 12 2018 - 23:34:00 EST




> On Nov 10, 2018, at 9:11 AM, Qian Cai <cai@xxxxxx> wrote:
>
> On 11/10/18 at 8:59 AM, Waiman Long wrote:
>
>> On 11/09/2018 08:45 PM, Qian Cai wrote:
>>>> Sent: Friday, November 09, 2018 at 5:08 PM
>>>> From: "Waiman Long" <longman@xxxxxxxxxx>
>>>> To: "Qian Cai" <cai@xxxxxx>, "Yang Shi" <yang.shi@xxxxxxxxxxxxxxxxx>
>>>> Cc: "open list" <linux-kernel@xxxxxxxxxxxxxxx>, "Thomas Gleixner" <tglx@xxxxxxxxxxxxx>, "Arnd Bergmann" <arnd@xxxxxxxx>, "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>, "Zhong Jiang" <zhongjiang@xxxxxxxxxx>
>>>> Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
>>>>
>>>> On 11/09/2018 04:51 PM, Qian Cai wrote:
>>>>>> On Nov 9, 2018, at 4:42 PM, Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/9/18 1:36 PM, Qian Cai wrote:
>>>>>>> It is a bit annoying that on this aarch64 server with 64 CPUs,
>>>>>>> booting the latest mainline (3541833fd1f2) always causes object
>>>>>>> debugging to run out of memory.
>>>>>> Could you please paste the detailed failure log?
>>>>> I assume you mean dmesg.
>>>>>
>>>>> Here is the dmesg for 64 CPUs,
>>>>> https://paste.ubuntu.com/p/BnhvXXhn7k/
>>>>>>> I have to boot the kernel with only 16 CPUs instead (nr_cpus=16)
>>>>>>> to make it work. Is it expected that object debugging is not going
>>>>>>> to work with large machines?
>>>>>> I don't think so. I suppose it works well with a large number of CPUs on x86.
>>>>> Here is the one with nr_cpus workaround,
>>>>> https://paste.ubuntu.com/p/qMpd2CCPSV/
>>>> The debugobjects code has a set of 1024 statically allocated debug
>>>> objects that can be used in early boot before the slab memory allocator
>>>> is initialized. Apparently, the system may have used up all the
>>>> statically allocated objects. Try doubling ODEBUG_POOL_SIZE to see if it
>>>> helps.
>>> Great, you are right. Doubling the size makes it work. Does it make sense
>>> to have a kconfig option instead?
>>
>> First, I think you need to figure out what your system needed to use up
>> so many debug objects in early boot. If there is a legitimate reason for
>> this behavior, we can talk about having a kconfig option to increase that.
> Is anybody else not getting the ODEBUG OOM with more than 64 CPUs? As
> mentioned, restricting it to 16 CPUs works fine. How can I figure out why
> the system uses so many debug objects?
On another aarch64 server with 256 CPUs, even doubling ODEBUG_POOL_SIZE
to 2048 still results in "ODEBUG: Out of memory. ODEBUG
disabled".
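
To make the failure mode concrete, here is a toy userspace model of the
fixed-size static pool Waiman describes above. It is only a sketch and not
the code in lib/debugobjects.c: the toy_* names are hypothetical, and the
idea that early-boot usage grows with the CPU count at a fixed rate per CPU
is an assumption made purely for illustration, since the real cause has not
been identified yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the 1024 statically allocated objects mentioned in the thread. */
#define POOL_SIZE 1024

struct toy_obj {
	int in_use;
};

struct toy_pool {
	struct toy_obj objs[POOL_SIZE];
	size_t next_free;	/* index of the next unused static object */
	bool enabled;		/* cleared once the pool runs dry */
};

static void toy_pool_init(struct toy_pool *p)
{
	p->next_free = 0;
	p->enabled = true;
}

/* Hand out one static object; disable tracking when none are left. */
static struct toy_obj *toy_alloc(struct toy_pool *p)
{
	struct toy_obj *o;

	if (!p->enabled)
		return NULL;
	if (p->next_free >= POOL_SIZE) {
		/* Toy equivalent of "Out of memory. ODEBUG disabled". */
		p->enabled = false;
		return NULL;
	}
	o = &p->objs[p->next_free++];
	o->in_use = 1;
	return o;
}

/*
 * Hypothetical early boot: every CPU tracks objs_per_cpu objects before
 * any dynamic allocator is available. Returns true if the pool sufficed.
 */
static bool toy_early_boot(struct toy_pool *p, int ncpus, int objs_per_cpu)
{
	assert(ncpus > 0 && objs_per_cpu >= 0);

	for (int cpu = 0; cpu < ncpus; cpu++)
		for (int i = 0; i < objs_per_cpu; i++)
			if (!toy_alloc(p))
				return false;
	return true;
}
```

With a made-up figure of 20 objects per CPU, 16 CPUs need 320 objects and
fit in the pool, while 64 CPUs need 1280 and exhaust it. Any usage that
scales with the CPU count would behave the same way, which is why a fixed
pool size may not hold up on larger machines.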