Re: [6.2.0-rc7] BUG: KASAN: slab-out-of-bounds in hop_cmp+0x26/0x110

From: Yury Norov
Date: Thu Feb 16 2023 - 11:34:09 EST


On Wed, Feb 15, 2023 at 05:51:39PM -0800, Kees Cook wrote:
> On February 15, 2023 4:31:32 PM PST, Yury Norov <yury.norov@xxxxxxxxx> wrote:
> >+ Kees Cook <keescook@xxxxxxxxxxxx>
> >+ Miguel Ojeda <ojeda@xxxxxxxxxx>
> >+ Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> >
> >On Wed, Feb 15, 2023 at 09:24:52AM +0100, Bruno Goncalves wrote:
> >> On Tue, 14 Feb 2023 at 15:32, Yury Norov <yury.norov@xxxxxxxxx> wrote:
> >> >
> >> > On Tue, Feb 14, 2023 at 02:23:06PM +0100, Bruno Goncalves wrote:
> >> > > Hello,
> >> > >
> >> > > recently when testing kernel with debug options set from net-next [1]
> >> > > and bpf-next [2] the following call trace happens:
> >> > >
> >> > Hi Bruno,
> >> >
> >> > Thanks for report.
> >> >
> >> > This looks weird, because the hop_cmp() spent for 3 month in -next till
> >> > now. Anyways, can you please share your NUMA configuration so I'll try
> >> > to reproduce the bug locally? What 'numactl -H' outputs?
> >> >
> >>
> >> Here is the output:
> >>
> >> numactl -H
> >> available: 4 nodes (0-3)
> >> node 0 cpus: 0 1 2 3 4 5 6 7 32 33 34 35 36 37 38 39
> >> node 0 size: 32063 MB
> >> node 0 free: 31610 MB
> >> node 1 cpus: 8 9 10 11 12 13 14 15 40 41 42 43 44 45 46 47
> >> node 1 size: 32248 MB
> >> node 1 free: 31909 MB
> >> node 2 cpus: 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55
> >> node 2 size: 32248 MB
> >> node 2 free: 31551 MB
> >> node 3 cpus: 24 25 26 27 28 29 30 31 56 57 58 59 60 61 62 63
> >> node 3 size: 32239 MB
> >> node 3 free: 31468 MB
> >> node distances:
> >> node 0 1 2 3
> >> 0: 10 21 31 21
> >> 1: 21 10 21 31
> >> 2: 31 21 10 21
> >> 3: 21 31 21 10
> >>
> >> Bruno
> >
> >So, I was able to reproduce it, and it seems like a compiler issue.
> >
> >The problem is that hop_cmp() calculates pointer to a previous hop
> >object unconditionally at the beginning of the function:
> >
> > struct cpumask **prev_hop = *((struct cpumask ***)b - 1);
> >
> >Obviously, for the first hop, there's no such thing like a previous
> >one, and later in the code 'prev_hop' is used conditionally on that:
> >
> > k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
> >
> >To me the code above looks like it instructs the compiler to dereference
> >'b - 1' only if b != k->masks, i.e. when b is not the first hop. But GCC
> >does that unconditionally, which looks wrong.
> >
> >If I defer dereferencing manually like in the snippet below, the kasan
> >warning goes away.
> >
> >diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> >index 48838a05c008..5f297f81c574 100644
> >--- a/kernel/sched/topology.c
> >+++ b/kernel/sched/topology.c
> >@@ -2081,14 +2081,14 @@ struct __cmp_key {
> >
> > static int hop_cmp(const void *a, const void *b)
> > {
> >- struct cpumask **prev_hop = *((struct cpumask ***)b - 1);
> > struct cpumask **cur_hop = *(struct cpumask ***)b;
> > struct __cmp_key *k = (struct __cmp_key *)a;
> >
> > if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
> > return 1;
> >
> >- k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
> >+ k->w = (b == k->masks) ? 0 :
> >+ cpumask_weight_and(k->cpus, (*((struct cpumask ***)b - 1))[k->node]);
> > if (k->w <= k->cpu)
> > return 0;
> >
> >I don't understand why GCC doesn't optimize out unneeded dereferencing.
> >It does that even if I replace ternary operator with if-else construction.
> >To me it looks like a compiler bug.
> >
> >However, I acknowledge that I'm not a great expert in C standard, so
> >it's quite possible that there may be some rule that prevents from
> >doing such optimizations, even for non-volatile variables.
> >
> >Adding compiler people. Guys, could you please clarify on that?
> >If it's my fault, I'll submit fix shortly.
>
> My understanding is that without getting inlined, the compiler cannot evaluate "b == k->masks" at compile time (if it can at all). So since the dereference is part of variable initialization, it's not executed later: it's executed at function entry.
>
> Regardless, this whole function looks kind of hard to read. Why not fully expand it with the if/else logic and put any needed variables into the respective clauses? Then humans can read it and the compiler will optimize it down just as efficiently.

Yes, changing it to if-else would be a simplest solution.

Thanks,
Yury