Re: [PATCH V2] arch_topology: Clear LLC sibling when cacheinfo teardown

From: Song Shuai
Date: Thu Mar 16 2023 - 06:31:17 EST


Sudeep Holla <sudeep.holla@xxxxxxx> 于2023年3月16日周四 09:29写道:
>
> On Tue, Mar 14, 2023 at 03:53:45PM +0800, Song Shuai wrote:
> > The teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE now only invokes
> > free_cache_attributes() to clear share_cpu_map of cacheinfo list.
> > At the same time, clearing cpu_topology[].llc_sibling is
> > called quite late at the teardown code in hotplug STARTING section.
> >
> > To avoid the incorrect LLC sibling masks generated, move its clearing
> > right after free_cache_attributes().
> >
>
> Technically in terms of flow/timing this is correct. However I would like
> to know if you are seeing any issues without this change ?
>
> Technically, if a cpu is hotplugged out, the cacheinfo is reset first
> and then the topology. Until the cpu is removes, the LLC info in the
> topology is still valid. Also I am not sure if anything gets scheduled
> and this LLC info is utilised once the teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE
> has started.

There is no visible issue in the entire offline process(eg: echo 0 > online).

However, when I hotplugged out the CPU into the state before CACHEINFO_ONLINE on
my kernel with the CONFIG_CPU_HOTPLUG_STATE_CONTROL configured,
the share_cpu_map had been updated but llc_sibling had not, which
would result in a trivial issue:

At the end of stepped hotplugging out, the cpuset_hotplug_work would
be flushed and then sched domain would be rebuilt
where the **cpu_coregroup_mask** in sched_domain_topology got
incorrect llc_sibling,
but the result of rebuilding was correct due to the protection of
cpu_active_mask.

The stepped hotplugging may not be used in the production environment,
but the issue existed.
Even in the entire offline process, it's possible that a future user
gets wrong the llc_sibling when accessing it concurrently or right
after the teardown of CACHEINFO_ONLINE.

>
> So I am trying to understand if we really need this change. Please let me
> know if I am missing anything here.
>
> --
> Regards,
> Sudeep



--
Thanks,
Song