Re: [PATCH, RFC] v4 scalable classic RCU implementation

From: Paul E. McKenney
Date: Tue Sep 16 2008 - 14:23:21 EST


On Tue, Sep 16, 2008 at 07:48:00PM +0200, Manfred Spraul wrote:
> Paul E. McKenney wrote:
>>
>>> That means an O(NR_CPUS) loop with disabled local interrupts :-(
>>> Is that correct?
>>>
>>
>> With the definition of "O()" being the worst-case execution time, yes.
>> But this worst case could only happen when the system was mostly idle,
>> in which case the added overhead should not be too horribly bad.
>
> No: "was mostly running cpu_idle()". A cpu_idle() cpu could execute lots of
> irqs and softirqs.
> So the worst case would be a system with 1 cpu/node reserved for irq
> handling.
> The "idle" cpu would be always in no_hz mode, even though it might be 100%
> busy handling irqs.
> The remaining cpus might be 100% busy handling user space.
>
> And every quiescent state will end up in that O(NR_CPUS) loop.

Good point!

Indeed, if you had a 1024-CPU box acting as (say) a router/hub using
the Linux-kernel protocol stacks with no user-mode processing, then
you could indeed have the system mostly busy with no user-space code
running, and thus no quiescent states.

However, last I checked, almost all 1024-CPU boxes run HPC workloads
mostly in user mode, so this scenario would not occur. However, again,
if it does come up, I would add an additional level of state machine
to the force_quiescent_state() family of functions, so that the scan
would be done incrementally, perhaps by arranging for each group of CPUs
to be scanned by CPUs within that group.
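
To make the incremental idea concrete, here is a minimal user-space
sketch, not the code in this patch. All names (sketch_scan,
sketch_scan_step(), SKETCH_NR_CPUS, SKETCH_BATCH) are hypothetical, and
the per-CPU flag array merely stands in for whatever state the
force_quiescent_state() family of functions would really examine. Each
call looks at a bounded number of CPUs and remembers where to resume, so
no single invocation is O(NR_CPUS):

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 1024	/* hypothetical system size */
#define SKETCH_BATCH   64	/* hypothetical per-call scan budget */

/* Hypothetical scan state; in a real kernel this would need locking. */
struct sketch_scan {
	bool qs_seen[SKETCH_NR_CPUS];	/* true: CPU already quiescent */
	size_t next_cpu;		/* where the next call resumes */
	size_t holdouts;		/* non-quiescent CPUs seen so far */
};

/*
 * Examine at most SKETCH_BATCH CPUs per call.
 * Returns -1 while the sweep is still in progress; once every CPU has
 * been examined, returns the number of holdouts found and resets the
 * state so that the next call starts a fresh sweep.
 */
static long sketch_scan_step(struct sketch_scan *s)
{
	size_t end = s->next_cpu + SKETCH_BATCH;
	size_t cpu;
	long result;

	if (end > SKETCH_NR_CPUS)
		end = SKETCH_NR_CPUS;

	for (cpu = s->next_cpu; cpu < end; cpu++)
		if (!s->qs_seen[cpu])
			s->holdouts++;

	s->next_cpu = end;
	if (s->next_cpu < SKETCH_NR_CPUS)
		return -1;		/* more CPUs left to scan */

	result = (long)s->holdouts;
	s->next_cpu = 0;
	s->holdouts = 0;
	return result;
}
```

With these assumed values, a full sweep of 1024 CPUs takes 16 calls of
at most 64 CPUs each, so the time spent with interrupts disabled per
call is bounded by the batch size rather than by the CPU count. The
price is that detecting the end of a grace period takes longer.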

But again, I don't want to take that step until I see someone actually
needing it. Maybe the Vyatta guys will be there sooner than I think,
but...

Thanx, Paul