Re: [RFC PATCH 47/86] rcu: select PREEMPT_RCU if PREEMPT

From: Paul E. McKenney
Date: Mon Dec 04 2023 - 20:03:59 EST


On Tue, Nov 28, 2023 at 10:30:53AM -0800, Ankur Arora wrote:
>
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
>
> > Paul!
> >
> > On Tue, Nov 21 2023 at 07:19, Paul E. McKenney wrote:
> >> On Tue, Nov 21, 2023 at 10:00:59AM -0500, Steven Rostedt wrote:
> >>> Right now, the use of cond_resched() is basically a whack-a-mole game where
> >>> we need to whack all the mole loops with the cond_resched() hammer. As
> >>> Thomas said, this is backwards. It makes more sense to just not preempt in
> >>> areas that can cause pain (like holding a mutex or in an RCU critical
> >>> section), but still have the general kernel be fully preemptable.
> >>
> >> Which is quite true, but that whack-a-mole game can be ended without
> >> getting rid of build-time selection of the preemption model. Also,
> >> that whack-a-mole game can be ended without eliminating all calls to
> >> cond_resched().
> >
> > Which calls to cond_resched() should not be eliminated?
> >
> > They all suck and keeping some of them is just counterproductive as
> > again people will sprinkle them all over the place for the very wrong
> > reasons.
>
> And, as Thomas alludes to here, cond_resched() is not always cost free.
> Needing to call cond_resched() forces us to restructure hot paths in
> ways that result in worse performance or more complex code.
>
> One example is clear_huge_page(), where removing the need to call
> cond_resched() every once in a while allows the processor to optimize
> differently.
>
>   *Milan*       mm/clear_huge_page   x86/clear_huge_page    change
>                       (GB/s)               (GB/s)
>
>   pg-sz=2MB           14.55                19.29            +32.5%
>   pg-sz=1GB           19.34                49.60           +156.4%
>
> (See https://lore.kernel.org/all/20230830184958.2333078-1-ankur.a.arora@xxxxxxxxxx/)
>
> And, that's one of the simpler examples from mm. We do this kind of arbitrary
> batching all over the place.
>
> Or see the filemap_read() example that Linus gives here:
> https://lore.kernel.org/lkml/CAHk-=whpYjm_AizQij6XEfTd7xvGjrVCx5gzHcHm=2Xijt+Kyg@xxxxxxxxxxxxxx/#t

I already agree that some cond_resched() calls can cause difficulties.
But that is not the same as proving that they *all* should be removed.

Thanx, Paul
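
For readers following the clear_huge_page() point quoted above, here is a
minimal userspace sketch of the batching pattern under discussion. It is not
the kernel code: maybe_yield() is a hypothetical stand-in for cond_resched()
that only counts calls, and clear_region_batched(), clear_region_whole(), and
the chunk size are illustrative names and values, not anything from the
patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for cond_resched(): it only counts calls here. */
static unsigned long yield_calls;

static void maybe_yield(void)
{
	yield_calls++;
}

/*
 * Batched variant: clear at most `chunk` bytes at a time and offer a
 * voluntary preemption point after each chunk. The region is never
 * presented to the clearing primitive as one large extent.
 */
static void clear_region_batched(unsigned char *buf, size_t len, size_t chunk)
{
	assert(chunk > 0);

	for (size_t off = 0; off < len; off += chunk) {
		size_t n = (len - off < chunk) ? len - off : chunk;

		memset(buf + off, 0, n);
		maybe_yield();
	}
}

/*
 * Unbatched variant: hand the whole extent to the clearing primitive in
 * one call, with no explicit preemption points in the loop body.
 */
static void clear_region_whole(unsigned char *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Both variants leave the buffer zeroed. The difference is only that the
batched one has to split the work so that it can reach a preemption point
between chunks, which is the restructuring cost Ankur describes.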