Re: [linux-next-20130422] Bug in SLAB?

From: Glauber Costa
Date: Mon Apr 29 2013 - 10:44:04 EST


On 04/29/2013 02:12 PM, Pekka Enberg wrote:
> On Mon, Apr 29, 2013 at 5:40 AM, Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>> Tetsuo Handa wrote:
>>> Also, kmalloc_index() in include/linux/slab.h can return 0 to 26.
>>>
>>> If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true and
>>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>>> happens), kmalloc_caches[26] is beyond the array, for kmalloc_caches[26]
>>> allows 0 to 25.
>>>
>>> If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true and
>>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>>> happens), kmalloc_caches[26] is beyond the array, for
>>> kmalloc_caches[MAX_ORDER + PAGE_SHIFT] allows 0 to MAX_ORDER + PAGE_SHIFT - 1.
>>>
>>> Would you recheck that the array size is correct?
>>>
>>
>> I confirmed (on x86_32) that
>>
>> volatile unsigned int size = 8 * 1024 * 1024;
>> kmalloc(size, GFP_KERNEL);
>>
>> causes no warning at compile time and returns NULL at runtime. But
>>
>> unsigned int size = 8 * 1024 * 1024;
>> kmalloc(size, GFP_KERNEL);
>>
>> causes compile time warning
>>
>> include/linux/slab_def.h:136: warning: array subscript is above array bounds
>>
>> and runtime bug.
>>
>> BUG: unable to handle kernel NULL pointer dereference at 00000058
>> IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0
>>
>> I confirmed (on x86_32) that
>>
>> kmalloc(64 * 1024 * 1024, GFP_KERNEL);
>>
>> causes compile time warning
>>
>> include/linux/slab_def.h:136: warning: array subscript is above array bounds
>>
>> and runtime bug.
>>
>> Kernel BUG at c10b9c5b [verbose debug info unavailable]
>> invalid opcode: 0000 [#1] SMP
>>
>> Also,
>>
>> volatile unsigned int size = 64 * 1024 * 1024;
>> kmalloc(size, GFP_KERNEL);
>>
>> causes no warning at compile time but runtime bug.
>>
>> Kernel BUG at c10b9c5b [verbose debug info unavailable]
>> invalid opcode: 0000 [#1] SMP
>>
>> There are kernel modules which expect kmalloc() to return NULL rather than
>> oops when the requested size is too large.
>
> Christoph, Glauber, it seems like commit e3366016 ("slab: Use common
> kmalloc_index/kmalloc_size functions") is causing some problems here.
> Can you please take a look?
>
> Pekka
>
I believe this is because the code now always assumes that the cache is
found when a constant size is passed. Before this patch, we had a "found"
statement that was mistakenly removed.
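
To make the failure mode concrete, here is a minimal user-space sketch of
the idea, not kernel code and not the patch itself. All model_* names and
the MODEL_* constants are hypothetical and chosen only for illustration
(the 4 MiB limit is an assumption for this model). The point is that the
lookup has to reject an oversized request before it indexes the cache
array, so the caller gets NULL rather than an out-of-bounds slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model constants; not taken from the kernel headers. */
#define MODEL_SHIFT_LOW   5                        /* smallest cache: 32 bytes */
#define MODEL_SHIFT_HIGH  22                       /* largest cache: 4 MiB */
#define MODEL_MAX_SIZE    ((size_t)1 << MODEL_SHIFT_HIGH)

struct model_cache {
    size_t object_size;
};

/* Valid indices are 0..MODEL_SHIFT_HIGH; anything above is out of bounds. */
static struct model_cache model_caches[MODEL_SHIFT_HIGH + 1];

/* Fill in the power-of-two caches the model supports. */
static void model_init(void)
{
    for (int i = MODEL_SHIFT_LOW; i <= MODEL_SHIFT_HIGH; i++)
        model_caches[i].object_size = (size_t)1 << i;
}

/* Index of the smallest power-of-two cache that can hold size bytes. */
static int model_index(size_t size)
{
    /* Callers must have range-checked size already. */
    assert(size > 0 && size <= MODEL_MAX_SIZE);

    int i = MODEL_SHIFT_LOW;
    while (((size_t)1 << i) < size)
        i++;
    return i;
}

/*
 * Look up the cache for a request. The range check plays the role of
 * the "not found" path: an oversized request yields NULL instead of an
 * index past the end of model_caches[]. Treating size 0 as NULL is a
 * simplification made for this model only.
 */
static struct model_cache *model_lookup(size_t size)
{
    if (size == 0 || size > MODEL_MAX_SIZE)
        return NULL;
    return &model_caches[model_index(size)];
}
```

In this model, after model_init(), a 64 MiB or 8 MiB request makes
model_lookup() return NULL, while a 100-byte request resolves to the
128-byte cache.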

If I am right, the following (untested) patch should solve the problem.