Re: [GIT PULL] Introduce housekeeping subsystem v4

From: Frederic Weisbecker
Date: Fri Oct 20 2017 - 10:29:08 EST


2017-10-20 10:17 UTC+02:00, Ingo Molnar <mingo@xxxxxxxxxx>:
> I mean code like:
>
> triton:~/tip> git grep on_each_cpu mm
> mm/page_alloc.c: * cpu to drain that CPU pcps and on_each_cpu_mask
> mm/slab.c: on_each_cpu(do_drain, cachep, 1);
> mm/slub.c: on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1, GFP_ATOMIC);
> mm/vmstat.c: err = schedule_on_each_cpu(refresh_vm_stats);
>
> is something we want to execute on 'housekeeping CPUs' as well, to not
> disturb the isolated CPUs, right?

I see. Indeed, that's the kind of thing we also want to confine to
housekeeping CPUs whenever possible, but these cases require special treatment
that needs to be handled by the subsystem in charge. For example, vmstat has
the vmstat_shepherd mechanism, which drives those timers adaptively, on demand,
to make sure that userspace isn't interrupted. The others will likely need
similar treatment.

For now I only see vmstat having such a feature, and it acts transparently.
There is also the LRU flush (IIRC), which needs to be called, for example,
before returning to userspace in order to avoid IPIs. Such things may indeed
need special treatment. With the current patchset it could be a housekeeping
flag.

>
> I.e. right now most (or all) of your patchset could be done using the
> 'global_time_*()' (or so) naming - I just wanted to mention that work
> related to global timeline is not the only jobs that housekeeping CPUs
> will have to eventually execute.

I'd rather talk about CPU affinity than global time. For example, timers,
watchdog, and idle load balancing are about periodic events, but
workqueues (not the deferred ones), domain isolation (isolcpus), and NAPI
polling are perhaps more event driven.

But indeed global timeline based periodic events are not all of what
housekeeping needs to care about.

>
>> > I don't know to what extent it makes sense to formalize and unify these
>> > facilities: it's certain that the (former) housekeeping CPU mask should
>> > be shared by these two facilities: the CPU executing global time
>> > callbacks periodically should be one of the CPUs that execute
>> > double-async CPU callbacks.
>> >
>> > But by separating all this functionality into these two categories, it's
>> > already much easier to me to argue about which bit does what and why.
>>
>> Note that some housekeeping concepts may not fall into any of these
>> categories. For example domain isolation.
>
> Could you describe domain isolation?

This is the isolcpus thing, which excludes a set of CPUs from the whole
domain tree. That one doesn't fall into any category we talked about.

In fact, CPU affinity is the only high-level concept I found that gathers
all these housekeeping elements.

Perhaps I should use "cpu_isolation" instead of "housekeeping" naming.

Thanks.