Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

From: Michal Hocko
Date: Mon Aug 17 2020 - 04:48:03 EST


On Sat 15-08-20 01:14:53, Thomas Gleixner wrote:
[...]
> For normal operations a couple of pages which can be preallocated are
> enough. What you are concerned about is the case where you run out of
> pointer storage space.
>
> There are two reasons why that can happen:
>
> 1) RCU call flooding
> 2) RCU not being able to run and mop up the backlog
>
> #1 is observable by looking at the remaining storage space and the RCU
> call frequency
>
> #2 is uninteresting because it's caused by RCU being stalled / delayed
> e.g. by a runaway of some sort or a plain RCU usage bug.
>
> Allocating more memory in that case does not solve or improve anything.
>
> So the interesting case is #1. Which means we need to look at the
> potential sources of the flooding:
>
> 1) User space via syscalls, e.g. open/close
> 2) Kernel thread
> 3) Softirq
> 4) Device interrupt
> 5) System interrupts, deep atomic context, NMI ...
>
> #1 trivial fix is to force switching to a high prio thread or a soft
> interrupt which does the allocation
>
> #2 Similar to #1 unless that thread loops with interrupts, softirqs or
> preemption disabled. If that's the case then running out of RCU
> storage space is the least of your worries.
>
> #3 Similar to #2. The obvious candidates (e.g. NET) for monopolizing a
> CPU have loop limits in place already. If there is a bug which fails
> to care about the limit, why would RCU care and allocate more memory?
>
> #4 Similar to #3. If the interrupt handler loops forever or if the
> interrupt is a runaway which prevents task/softirq processing then
> RCU free performance is the least of your worries.
>
> #5 Clearly a bug and making RCU accommodate that is beyond silly.
>
> So if call_rcu() detects that the remaining storage space for pointers
> goes below the critical point or if it observes high frequency calls
> then it simply should force a soft interrupt which does the allocation.
>
> Allocating from softirq context obviously without holding the raw lock
> which is used inside call_rcu() is safe on all configurations.
>
> If call_rcu() is forced to use the fallback for a few calls until this
> happens then that's not the end of the world. It is not going to be a
> problem ever for the most obvious issue #1, user space madness, because
> that case cannot delay the softirq processing unless there is a kernel
> bug which makes again RCU free performance irrelevant.

Yes, this makes perfect sense to me! I really do not think we want to
optimize for userspace abuse in a way that allows complete pcp allocator
memory depletion (or a control in a worse case).
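
For illustration only, the scheme described in the quoted text could be
modelled in userspace roughly as below. This is a sketch under assumptions,
not kernel code: every name (ptr_pool, pool_queue, pool_run_softirq,
POOL_CAPACITY, LOW_WATERMARK) is hypothetical, the "softirq" is just a flag
checked by a plain function, and the refill simply resets a counter instead
of allocating pages. The queueing side never allocates; it only consumes a
slot or takes the fallback, and requests a refill once the remaining storage
drops to the watermark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values, chosen only to keep the model small. */
#define POOL_CAPACITY 8 /* pointer slots provided by the preallocated pages */
#define LOW_WATERMARK 2 /* "critical point" at which a refill is requested */

struct ptr_pool {
	size_t free_slots;     /* remaining pointer storage */
	bool refill_pending;   /* stands in for a raised soft interrupt */
	size_t fallback_calls; /* calls that had to use the fallback path */
};

static void pool_init(struct ptr_pool *p)
{
	p->free_slots = POOL_CAPACITY;
	p->refill_pending = false;
	p->fallback_calls = 0;
}

/*
 * Models the call_rcu() side. It never allocates: it either consumes a
 * preallocated slot or counts a fallback, and it requests a refill when
 * the remaining storage is at or below the watermark.
 * Returns true if the pointer was stored in the pool.
 */
static bool pool_queue(struct ptr_pool *p)
{
	bool stored = false;

	if (p->free_slots > 0) {
		p->free_slots--;
		stored = true;
	} else {
		p->fallback_calls++;
	}

	if (p->free_slots <= LOW_WATERMARK)
		p->refill_pending = true;

	return stored;
}

/*
 * Models the softirq side. In the real proposal this would run without
 * the raw lock used inside call_rcu() and could therefore allocate; here
 * the allocation is reduced to resetting the slot counter.
 */
static void pool_run_softirq(struct ptr_pool *p)
{
	if (!p->refill_pending)
		return;

	p->free_slots = POOL_CAPACITY;
	p->refill_pending = false;
	assert(p->free_slots <= POOL_CAPACITY);
}
```

In this model a burst of pool_queue() calls between two pool_run_softirq()
invocations can still hit the fallback a few times, which matches the "not
the end of the world" case above: the fallback count stays bounded as long
as the refill side gets to run.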

Thanks!
--
Michal Hocko
SUSE Labs