Re: [RFC][PATCH 00/16] sched: Core scheduling

From: Aubrey Li
Date: Thu Mar 14 2019 - 01:32:32 EST


On Thu, Mar 14, 2019 at 8:35 AM Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
> >>
> >> One more NULL pointer dereference:
> >>
> >> Mar 12 02:24:46 aubrey-ivb kernel: [ 201.916741] core sched enabled
> >> [ 201.950203] BUG: unable to handle kernel NULL pointer dereference
> >> at 0000000000000008
> >> [ 201.950254] ------------[ cut here ]------------
> >> [ 201.959045] #PF error: [normal kernel read fault]
> >> [ 201.964272] !se->on_rq
> >> [ 201.964287] WARNING: CPU: 22 PID: 2965 at kernel/sched/fair.c:6849
> >> set_next_buddy+0x52/0x70
> >
> Shouldn't the for_each_sched_entity(se) skip the code block for !se case
> have avoided null pointer access of se?
>
> Since
> #define for_each_sched_entity(se) \
> for (; se; se = se->parent)
>
> Scratching my head a bit here on how your changes would have made
> a difference.
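
For reference, the macro quoted above can be checked in isolation with a
small userspace sketch. This is only an illustration under assumptions: the
struct below is a hypothetical, stripped-down stand-in for the kernel's
struct sched_entity, and count_entities() and demo() are made-up helpers,
not kernel functions. It shows only that the loop header tests se before the
body runs, so a NULL se on entry skips the body. It says nothing about
dereferences inside the body or in other code paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct sched_entity. */
struct sched_entity {
    struct sched_entity *parent;
    int on_rq;
};

/* Same shape as the macro quoted in this thread. */
#define for_each_sched_entity(se) \
    for (; se; se = se->parent)

/* Count the entities visited while walking up from se (hypothetical helper). */
static int count_entities(struct sched_entity *se)
{
    int n = 0;

    for_each_sched_entity(se)
        n++;

    return n;
}

/* Usage example: a NULL start never enters the loop body. */
static void demo(void)
{
    struct sched_entity root = { NULL, 1 };
    struct sched_entity child = { &root, 1 };

    assert(count_entities(NULL) == 0);
    assert(count_entities(&root) == 1);
    assert(count_entities(&child) == 2);
}
```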

This NULL pointer dereference is not reproducible, which made me think the
change worked...

>
> In your original log, I wonder if the !se->on_rq warning on CPU 22 is mixed with the actual OOPs?
> Saw also in your original log rb_insert_color. Wonder if that
> was actually the source of the Oops?

I have no way to figure this out, since I only saw it once. The lockup
occurs more frequently.

Thanks,
-Aubrey