RE: [patch V3 27/40] x86/cpu: Provide a sane leaf 0xb/0x1f parser

From: Thomas Gleixner
Date: Fri Sep 01 2023 - 03:45:35 EST


Len!

On Fri, Sep 01 2023 at 03:09, Len Brown wrote:
>> Conceptually _all_ levels exist, but the ones which occupy zero bits
>> have no meaning. Neither have the unknown levels if they should
>> surface at some point.
>>
>> So as they _all_ exist the logical consequence is that even those
>> which occupy zero bits have an ID.
>>
>> Code which is interested in information which depends on the
>> enumeration of the level must obviously do:
>>
>>     if (level_exists(X))
>>             analyse_level(X);
>>
>> Whether you express that via an invalid level ID or via an explicit
>> check for the level is an implementation detail.
>
> Thank you for acknowledging that a level with a shift-width of 0 does
> not exist, and thus an id for that level has no meaning.

Even if the level is enumerated then there is no implicit meaning
attached per se. It's only relevant when there is a documented
relationship between the enumeration and secondary information attached
to it. Making implicit general assumptions about the meaning of an
enumeration is just not possible.

> One could argue that except for package_id and core_id, which always
> exist, maintainable code would *always* check that a level exists
> before doing *anything* with its level_id. Color me skeptical of an
> implementation that does otherwise...

We have that today, no?

> So what are you proposing with the statement that "conceptually _all_
> levels exist"?

We need a consistent view on the topology and the only consistent view
is mathematical. Which means that a shift 0 element obviously has size
one because of size = 1 << SHIFT.

As a consequence these non-enumerated levels have an ID too, which in
turn makes the view on the topology consistent and independent of the
actually enumerated levels.
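
To illustrate, here is a minimal user-space sketch of that view. It is
not the code from the patch; the level list, struct topo_layout and the
helper names are made up for this mail, and it assumes the APIC ID is a
plain concatenation of per-level bit fields:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical level list, lowest level first. */
enum topo_level {
	TOPO_SMT, TOPO_CORE, TOPO_MODULE, TOPO_TILE, TOPO_DIE, TOPO_PKG,
	TOPO_MAX,
};

/* Hypothetical layout: number of APIC ID bits each level occupies. */
struct topo_layout {
	/* 0 means "not enumerated"; width[TOPO_PKG] is unused here. */
	unsigned int width[TOPO_MAX];
};

/* Enumeration check, separate from the purely mathematical ID. */
static inline int level_enumerated(const struct topo_layout *t,
				   enum topo_level l)
{
	return t->width[l] != 0;
}

/* size = 1 << SHIFT, so a zero-bit level has size one. */
static inline unsigned int level_size(const struct topo_layout *t,
				      enum topo_level l)
{
	assert(t->width[l] < 32);
	return 1U << t->width[l];
}

/* Global ID of level @l: drop all APIC ID bits of the levels below it. */
static inline uint32_t level_global_id(const struct topo_layout *t,
				       uint32_t apicid, enum topo_level l)
{
	unsigned int shift = 0;

	for (int i = 0; i < (int)l; i++)
		shift += t->width[i];

	assert(shift < 32);
	return apicid >> shift;
}
```

With one SMT bit, four core bits and nothing else enumerated, module,
tile and die all end up with size one and the same global ID as the
package. Every level has a well defined ID, and whether that ID means
anything is answered by level_enumerated(), not by the ID itself.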

>> The problem of the current implementation is not that the die ID is
>> automatically assigned. The problem is at the usage sites which
>> blindly assume that there must be a meaning. That's a completely
>> different issue and has absolutely nothing to do with purely
>> mathematically deduced ID information at any given level.
>
> I agree that the code that exports the die_id attributes in topology
> sysfs should not do so when the die_id is meaningless.

The problem is not the fact that die_id is exposed. The problem is that
the meta information which allows deducing its meaning is not exposed
along with it. The fact that the exposure was half thought out makes it
slightly harder to correct that mistake, but I'm not yet convinced that
non-exposure is the correct answer in general.

> Ps. It is a safe bet that new levels will "surface at some point".
> For example, DieGrp surfaced this summer w/o any prior consultation
> with the Linux team. But even if they did consult us and gave us the
> ideal 1-year before-hardware advance notice, and even if we
> miraculously added support in 0 time, we would still be 2-years late
> to prescriptively recognize this new level -- as our enterprise
> customers routinely run 3-year-old kernels.

That's a strawman as the enterprise people backport the world and some
more. So if there is timely upstream support then it will turn up in the
frankenkernels in time too. Arguably we could even backport the new
magic level ID to stable kernels as well as we do with other important
hardware related minimal addons.

> This is why it is mandatory that our code be resilient to the
> insertion of additional future levels. I think it can be -- as long
> as we continue to use globally unique id's for all levels (IIRC, only
> core_id is not globally unique today) and do _nothing_ with levels
> that have a 0 shift-width.

Die ID is relative too for no real good reason. Inside the kernel core
ID is not really required to be relative either.

Implementation wise it's just wrong to store this information in
cpu_info instead of doing a runtime evaluation of the topology
information, which allows choosing between global and relative IDs
depending on the requirements of the particular usage site.
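
Roughly what I have in mind, as a sketch only. struct level_bits, enum
id_scope and topo_eval_id() are invented names for illustration and do
not exist in the kernel; the sketch assumes the per-level bit positions
have been parsed once and the APIC ID is at hand:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: which flavour of ID the usage site wants. */
enum id_scope { ID_GLOBAL, ID_RELATIVE };

/* Hypothetical description of one level's position in the APIC ID. */
struct level_bits {
	unsigned int below;	/* APIC ID bits belonging to lower levels */
	unsigned int width;	/* bits of this level, 0 if not enumerated */
};

/* Evaluate the ID at the usage site instead of storing it per CPU. */
static inline uint32_t topo_eval_id(uint32_t apicid, struct level_bits lb,
				    enum id_scope scope)
{
	uint32_t id;

	assert(lb.below < 32 && lb.width < 32);

	/* Global: unique system wide, upper levels stay part of the ID. */
	id = apicid >> lb.below;

	/* Relative: only this level's own bits, unique within the parent. */
	if (scope == ID_RELATIVE)
		id &= (1U << lb.width) - 1;

	return id;
}
```

A usage site which needs a system wide unique core ID asks for
ID_GLOBAL, one which wants the core number within its parent asks for
ID_RELATIVE. A level with zero width naturally evaluates to relative
ID 0, and neither variant has to be stored up front.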

The primary usage of these IDs is for initialization and everything
which needs this for hotpath usage converts it into a use case specific
cached representation anyway because accessing per cpu variables in a
hotpath is suboptimal at best.

Thanks,

tglx