Re: [PATCH] lib/genalloc.c: Avoid de-referencing NULL pool

From: Andrew Morton
Date: Mon Jun 10 2019 - 17:56:09 EST


On Fri, 7 Jun 2019 16:43:31 -0700 Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:

> With architectures allowing the kernel to be placed almost arbitrarily
> in memory (e.g.: ARM64), it is possible to have the kernel resides at
> physical addresses above 4GB, resulting in neither the default CMA area,
> nor the atomic pool from successfully allocating. This does not prevent
> specific peripherals from working though, one example is XHCI, which
> still operates correctly.
>
> Trouble comes when the XHCI driver gets suspended and resumed, since we
> can now trigger the following NPD:
>
> ...
>
> [ 13.327884] f8c0: 0000000000000030 ffffffffffffffff
> [ 13.332835] [<ffffff80083c0df8>] addr_in_gen_pool+0x4/0x48
> [ 13.338398] [<ffffff80086004d0>] xhci_mem_cleanup+0xc8/0x51c
> [ 13.344137] [<ffffff80085f9250>] xhci_resume+0x308/0x65c
> [ 13.349524] [<ffffff80085e3de8>] xhci_brcm_resume+0x84/0x8c
> [ 13.355174] [<ffffff80084ad040>] platform_pm_resume+0x3c/0x64
> [ 13.360997] [<ffffff80084b91b4>] dpm_run_callback+0x5c/0x15c
> [ 13.366732] [<ffffff80084b96bc>] device_resume+0xc0/0x190
> [ 13.372205] [<ffffff80084baa70>] dpm_resume+0x144/0x2cc
> [ 13.377504] [<ffffff80084bafbc>] dpm_resume_end+0x20/0x34
> [ 13.382980] [<ffffff80080e0d88>] suspend_devices_and_enter+0x104/0x704
> [ 13.389585] [<ffffff80080e16a8>] pm_suspend+0x320/0x53c
> [ 13.394881] [<ffffff80080dfd08>] state_store+0xbc/0xe0
> [ 13.400094] [<ffffff80083a89d4>] kobj_attr_store+0x14/0x24
> [ 13.405655] [<ffffff800822a614>] sysfs_kf_write+0x60/0x70
> [ 13.411128] [<ffffff80082295d4>] kernfs_fop_write+0x130/0x194
> [ 13.416954] [<ffffff80081b5d10>] __vfs_write+0x60/0x150
> [ 13.422254] [<ffffff80081b6b20>] vfs_write+0xc8/0x164
> [ 13.427376] [<ffffff80081b7dd8>] SyS_write+0x70/0xc8
> [ 13.432412] [<ffffff8008083180>] el0_svc_naked+0x34/0x38
> [ 13.437800] Code: 92800173 97f6fb9e 17fffff5 d1000442 (f8408c03)
> [ 13.444033] ---[ end trace 2effe12f909ce205 ]---
>
> The call path leading to this problem is xhci_mem_cleanup() ->
> dma_free_coherent() -> dma_free_from_pool() -> addr_in_gen_pool. If the
> atomic_pool is NULL, we can't possibly have the address in the atomic
> pool anyway, so guard against that.
>

Arguably the caller shouldn't be pasing in a NULL pointer. Perhaps we
couild do this as a convenience thing if addr_in_gen_pool(NULL) makes
some sort of semantic sense, but I'm having trouble convincing myself
that it does.

So I'm somewhat inclined to think that going oops was the appropriate
response to this input...

> --- a/lib/genalloc.c
> +++ b/lib/genalloc.c
> @@ -439,6 +439,9 @@ bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
> unsigned long end = start + size - 1;
> struct gen_pool_chunk *chunk;
>
> + if (unlikely(!pool))
> + return found;

I think it would be clearer to use "return false" here, so the reader
doesn't have to go find and out what value `found' has.

> rcu_read_lock();
> list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk) {
> if (start >= chunk->start_addr && start <= chunk->end_addr) {