Re: [PATCH] perf/core: don't WARN for impossible rb sizes

From: Julien Thierry
Date: Fri Jan 11 2019 - 04:06:13 EST


Hi Mark,

On 10/01/2019 14:27, Mark Rutland wrote:
> The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
> large its ringbuffer mmap should be. This can be configured to arbitrary
> values, which can be larger than the maximum possible allocation from
> kmalloc.
>
> When this is configured to a suitably large value (e.g. thanks to the
> perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
> __alloc_pages_nodemask():
>
> [ 337.316688] WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316694] Modules linked in:
> [ 337.316704] CPU: 2 PID: 5666 Comm: perf Not tainted 5.0.0-rc1 #2669
> [ 337.316708] Hardware name: ARM Juno development board (r0) (DT)
> [ 337.316714] pstate: 20000005 (nzCv daif -PAN -UAO)
> [ 337.316720] pc : __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316728] lr : alloc_pages_current+0x80/0xe8
> [ 337.316732] sp : ffff000016eeb9e0
> [ 337.316736] x29: ffff000016eeb9e0 x28: 0000000000080001
> [ 337.316744] x27: 0000000000000000 x26: ffff0000111e21f0
> [ 337.316751] x25: 0000000000000001 x24: 0000000000000000
> [ 337.316757] x23: 0000000000080001 x22: 0000000000000000
> [ 337.316762] x21: 0000000000000000 x20: 000000000000000b
> [ 337.316768] x19: 000000000060c0c0 x18: 0000000000000000
> [ 337.316773] x17: 0000000000000000 x16: 0000000000000000
> [ 337.316779] x15: 0000000000000000 x14: 0000000000000000
> [ 337.316784] x13: 0000000000000000 x12: 0000000000000000
> [ 337.316789] x11: 0000000000100000 x10: 0000000000000000
> [ 337.316795] x9 : 0000000010044400 x8 : 0000000080001000
> [ 337.316800] x7 : 0000000000000000 x6 : ffff800975584700
> [ 337.316806] x5 : 0000000000000000 x4 : ffff0000111cd6c8
> [ 337.316811] x3 : 0000000000000000 x2 : 0000000000000000
> [ 337.316816] x1 : 000000000000000b x0 : 000000000060c0c0
> [ 337.316822] Call trace:
> [ 337.316828] __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316834] alloc_pages_current+0x80/0xe8
> [ 337.316841] kmalloc_order+0x14/0x30
> [ 337.316848] __kmalloc+0x1dc/0x240
> [ 337.316854] rb_alloc+0x3c/0x170
> [ 337.316860] perf_mmap+0x3bc/0x470
> [ 337.316867] mmap_region+0x374/0x4f8
> [ 337.316873] do_mmap+0x300/0x430
> [ 337.316878] vm_mmap_pgoff+0xe4/0x110
> [ 337.316884] ksys_mmap_pgoff+0xc0/0x230
> [ 337.316892] __arm64_sys_mmap+0x28/0x38
> [ 337.316899] el0_svc_common+0xb4/0x118
> [ 337.316905] el0_svc_handler+0x2c/0x80
> [ 337.316910] el0_svc+0x8/0xc
> [ 337.316915] ---[ end trace fa29167e20ef0c62 ]---
>
> Let's avoid this by checking that the requested allocation is possible
> before calling kzalloc.
>
> Reported-by: Julien Thierry <julien.thierry@xxxxxxx>
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> ---
> kernel/events/ring_buffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 4a9937076331..309ef5a64af5 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -734,6 +734,9 @@ struct ring_buffer *rb_alloc(int nr_pages, long watermark, int cpu, int flags)
>  	size = sizeof(struct ring_buffer);
>  	size += nr_pages * sizeof(void *);
>
> +	if (order_base_2(size) >= MAX_ORDER)
> +		goto fail;
> +

I see that in kernel/events/ring_buffer.c there are two versions of
rb_alloc() (depending on whether CONFIG_PERF_USE_VMALLOC is defined or not).

Since the warning comes from the kzalloc() call, I'd think we need to
add this check to both implementations of rb_alloc().
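
To make the suggestion concrete, here is a rough userspace sketch of the
check factored into a helper that any rb_alloc() variant could call before
its kzalloc(). Everything prefixed with sketch_ is hypothetical and not in
the tree; the header size, page shift and max order are passed in rather
than taken from kernel headers. One assumption to flag: since the order
here is computed from a size in bytes, the sketch compares it against
page shift plus max order, whereas the hunk above compares against
MAX_ORDER directly.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's order_base_2(): smallest order
 * such that (1UL << order) >= n. Saturates at the top bit for huge n.
 */
static inline unsigned int sketch_order_base_2(unsigned long n)
{
	unsigned int order = 0;

	while (order < sizeof(n) * 8 - 1 && (1UL << order) < n)
		order++;
	return order;
}

/*
 * Mirrors the size computation in the quoted hunk: a header followed by
 * one pointer per data page. header_size stands in for
 * sizeof(struct ring_buffer), which is not reproduced here.
 */
static inline size_t sketch_rb_size(unsigned long nr_pages, size_t header_size)
{
	return header_size + nr_pages * sizeof(void *);
}

/*
 * Returns true if an allocation of 'size' bytes stays below the limit.
 * Assumption of this sketch: the limit is expressed in bytes, i.e.
 * 2^(page_shift + max_order), because 'size' is a byte count.
 */
static inline bool sketch_rb_size_allocatable(size_t size,
					      unsigned int page_shift,
					      unsigned int max_order)
{
	return sketch_order_base_2(size) < page_shift + max_order;
}
```

With 4K pages and a max order of 11 (both assumed here for illustration),
a 4 MiB request passes and anything larger is rejected, so the caller
could goto fail before reaching the allocator.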


With that change (or if for some reason the other rb_alloc() version
doesn't need the check):

Reviewed-by: Julien Thierry <julien.thierry@xxxxxxx>

Thanks,

--
Julien Thierry