Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group

From: Phil Auld
Date: Thu Aug 15 2019 - 09:39:33 EST


On Fri, Aug 09, 2019 at 06:43:09PM +0100 Valentin Schneider wrote:
> On 09/08/2019 14:33, Phil Auld wrote:
> > On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> >> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> >>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> >>
> >> ISTR there were more issues; but it sure is good to start picking them
> >> off.
> >>
> >
> > Following up on this I hit another in rt.c which looks like:
> >
> > [ 156.348854] Call Trace:
> > [ 156.351301] <IRQ>
> > [ 156.353322] sched_rt_period_timer+0x124/0x350
> > [ 156.357766] ? sched_rt_rq_enqueue+0x90/0x90
> > [ 156.362037] __hrtimer_run_queues+0xfb/0x270
> > [ 156.366303] hrtimer_interrupt+0x122/0x270
> > [ 156.370403] smp_apic_timer_interrupt+0x6a/0x140
> > [ 156.375022] apic_timer_interrupt+0xf/0x20
> > [ 156.379119] </IRQ>
> >
> > It looks like the same issue of not using the rq_lock* wrappers and
> > hence not using the pinning. From looking at the code there is at
> > least one potential hit in deadline.c in the push_dl_task path with
> > find_lock_later_rq but I have not hit that in practice.
> >
> > This commit, which introduced the warning, seems to imply that the use
> > of the rq_lock* wrappers is required, at least for any sections that will
> > call update_rq_clock:
> >
> > commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> > Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Date: Mon Oct 3 16:53:49 2016 +0200
> >
> > sched/core: Add WARNING for multiple update_rq_clock() calls
> >
> > Now that we have no missing calls, add a warning to find multiple
> > calls.
> >
> > By having only a single update_rq_clock() call per rq-lock section,
> > the section appears 'atomic' wrt time.
> >
> >
> > Is that the case? Otherwise we have these false positives.
> >
>
> Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
> update_rq_clock() that isn't preceded by that function will still have
> RQCF_UPDATED set the second time it's executed and will trigger the warn.
>
> Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
> disabled, I don't see why we wouldn't want to convert these callsites.
>

The one above is easy enough. After that I hit one related to the double_rq_lock
paths. Now I see why that was not cleaned up already. That's going to be a
bit messier and will require some study.

I'll post this trivial anyway.

Cheers,
Phil

--