Re: [PATCH 1/5] cacheinfo: Set cache 'id' based on DT data
From: James Morse
Date: Fri Jun 27 2025 - 12:42:02 EST
Hi Jonathan, Rob,
On 23/06/2025 15:18, Rob Herring wrote:
> On Tue, Jun 17, 2025 at 11:03 AM Jonathan Cameron
> <Jonathan.Cameron@xxxxxxxxxx> wrote:
>> On Fri, 13 Jun 2025 13:03:52 +0000
>> James Morse <james.morse@xxxxxxx> wrote:
>>> From: Rob Herring <robh@xxxxxxxxxx>
>>>
>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>> cache 'id'. This will provide a stable id value for a given system. As
>>> we need to check all possible CPUs, we can't use the shared_cpu_map
>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
>>> we have to walk all CPU nodes and then walk cache levels.
>>
>> Is it ok for these to match for different levels? I've no idea for
>> these use cases.
> Yes. The 'id' is per level, not globally unique.
Documented here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/resctrl.rst#n482
The values are unique per-level, but they are also sparse, so they may happen to be
unique system-wide. That is what happens on arm64 ACPI platforms, because the PPTT
value is exposed directly.
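As a rough illustration of the scheme (toy data and invented names, nothing here is
kernel API): the id for a cache is the minimum h/w id over the CPUs that share it,
and assignment is abandoned entirely if any CPU h/w id needs more than 32 bits.

```c
#include <stdint.h>

/* Toy model: each CPU has a h/w id and points at the cache it shares.
 * The cache 'id' is the minimum h/w id of all CPUs sharing that cache,
 * which is stable across boots because it depends on neither logical
 * CPU numbering nor which CPUs happen to be online. */
struct toy_cpu {
	uint64_t hwid;
	int cache; /* index of the cache this CPU shares */
};

static uint32_t toy_cache_id(const struct toy_cpu *cpus, int ncpus, int cache)
{
	uint32_t min_id = ~0u;

	for (int i = 0; i < ncpus; i++) {
		/* A h/w id wider than 32 bits: give up on ids entirely. */
		if (cpus[i].hwid >> 32)
			return ~0u;
		if (cpus[i].cache == cache && (uint32_t)cpus[i].hwid < min_id)
			min_id = (uint32_t)cpus[i].hwid;
	}
	return min_id;
}
```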
>>> The cache_id exposed to user-space has historically been 32 bits, and
>>> it is too late to change that. Give up on assigning cache ids if a CPU
>>> h/w id wider than 32 bits is found.
>> Mainly a couple of questions for Rob on the fun of scoped cleanup being
>> used for some of the iterators in a similar fashion to already
>> done for looping over child nodes etc.
This is mostly over my head!
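For what it's worth, the kernel's __free(device_node) is built on the compiler-level
cleanup attribute (GCC/Clang), which can be demonstrated outside the kernel; the names
below (free_buf, find_first_even) are invented for the demo:

```c
#include <stdlib.h>

/* The compiler runs the named function when the variable goes out of
 * scope, so every early return still releases the resource - this is
 * what lets the scoped iterators drop the explicit of_node_put(). */
static int put_count = 0;

static void free_buf(char **p)
{
	if (*p) {
		free(*p);
		put_count++;
	}
}

#define __free_buf __attribute__((cleanup(free_buf)))

static int find_first_even(const int *vals, int n)
{
	for (int i = 0; i < n; i++) {
		/* Released at the end of every iteration and on return. */
		char *scratch __free_buf = malloc(16);

		if (vals[i] % 2 == 0)
			return vals[i]; /* early return: cleanup still fires */
	}
	return -1;
}
```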
>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>> index cf0d455209d7..9888d87840a2 100644
>>> --- a/drivers/base/cacheinfo.c
>>> +++ b/drivers/base/cacheinfo.c
>>> @@ -183,6 +184,37 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>> return of_property_read_bool(np, "cache-unified");
>>> }
>>>
>>> +static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
>>> +{
>>> + struct device_node *cpu;
>>> + u32 min_id = ~0;
>>> +
>>> + for_each_of_cpu_node(cpu) {
>>
>> Rob, is it worth a scoped variant of this one? I've come across
>> this a few times recently and it irritates me, but I didn't feel
>> there were necessarily enough cases to bother. With one more,
>> maybe it is time to do it (maybe 10+ from a quick look).
>
> My question on all of these (though more so for drivers), is why are
> we parsing CPU nodes again? Ideally, we'd parse the CPU and cache
> nodes only once and the kernel would provide the necessary info.
>
> Take drivers/clk/mvebu/ap-cpu-clk.c for example. The code there really
> just needs to know whether there are 2 or 4 possible CPUs, or what the
> max physical CPU id is (in case CPU #1 is not present).
>
>>> + struct device_node *cache_node __free(device_node) = of_find_next_cache_node(cpu);
>>> + u64 id = of_get_cpu_hwid(cpu, 0);
>>> +
>>> + if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
>>> + of_node_put(cpu);
>>> + return;
>>> + }
>>> + while (1) {
>>
>> for_each_of_cache_node_scoped() perhaps? With the find already defined, this would
>> end up something like the following, modeled on for_each_child_of_node_scoped().
>
> That seems like an invitation for someone to parse the cache nodes
> themselves rather than use cacheinfo. Plus, there are multiple ways we
> could iterate over cache nodes. Is it just ones associated with a CPU
> or all cache nodes or all cache nodes at a level?
>
> That being said, I do find the current loop a bit odd: it sets a
> 'prev' pointer which is then never explicitly used. We still have
> to worry about refcounting, but it's handled in a less obvious way.
>
>> #define for_each_of_cache_node_scoped(cpu, cache) \
>> for (struct device_node *cache __free(device_node) = \
>> of_find_next_cache_node(cpu); cache != NULL; \
>> cache = of_find_next_cache_node(cache))
>>
>> for_each_of_cpu_node_scoped(cpu) {
>>         u64 id = of_get_cpu_hwid(cpu, 0);
>>
>>         if (FIELD_GET(GENMASK_ULL(63, 32), id))
>>                 return;
>>
>>         for_each_of_cache_node_scoped(cpu, cache_node) {
>>                 if (cache_node == np) {
>>                         min_id = min(min_id, id);
>>                         break;
>>                 }
>>         }
>> }
>>
>>> + if (!cache_node)
>>> + break;
>>> + if (cache_node == np && id < min_id) {
>>
>> Why do you carry on if id >= min_id? Id isn't changing. For that
>> matter why not do this check before iterating the caches at all?
>
> You're right, no need. There's no need to handle the id in the loop at
> all, we just need to match the cache node. So perhaps just a helper:
>
> static bool match_cache_node(struct device_node *cpu,
>                              const struct device_node *cache_node)
> {
>         for (struct device_node *cache __free(device_node) =
>                      of_find_next_cache_node(cpu); cache != NULL;
>              cache = of_find_next_cache_node(cache)) {
>                 if (cache == cache_node)
>                         return true;
>         }
>
>         return false;
> }
>
> And then the cpu loop would have:
>
> if (match_cache_node(cpu, cache_node))
>         min_id = min(min_id, id);
Done,
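For the archive, with Rob's helper the respin ends up roughly like this (a sketch of
the direction, not the exact code that will be posted; the CACHE_ID attribute flag
is assumed from include/linux/cacheinfo.h):

```c
static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
{
	struct device_node *cpu;
	u32 min_id = ~0;

	for_each_of_cpu_node(cpu) {
		u64 id = of_get_cpu_hwid(cpu, 0);

		/* Give up if any CPU h/w id doesn't fit in 32 bits. */
		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
			of_node_put(cpu);
			return;
		}
		if (match_cache_node(cpu, np))
			min_id = min(min_id, (u32)id);
	}

	if (min_id != ~0) {
		this_leaf->id = min_id;
		this_leaf->attributes |= CACHE_ID;
	}
}
```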
Thanks!
James