Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

From: Moger, Babu
Date: Mon Mar 20 2023 - 15:53:44 EST


Hi James,

On 3/20/23 12:10, James Morse wrote:
> Hi Babu,
>
> On 02/03/2023 20:24, Babu Moger wrote:
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>
> On x86. Any other architecture is going to have a hard time supporting this.
>
>
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>
> By comparing it with what? Unless user-space can see into the hardware, resctrl is the
> only gateway to this stuff. What difference does the allocated value here make?
>
> Could you elaborate on what issues this can help debug?

A while ago, we had an issue with event reporting for one of the RMIDs.
There were numerous active RMIDs on the system. As kernel developers, we
couldn't pinpoint which RMID was reporting wrong information. That
information was important for the hardware guys to debug further. We had
to patch the kernel to print it for debugging. This is one of the cases.

>
>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>
> Er. Please don't expose this to user-space!
> MPAM has no equivalent value to RMID, so whatever this is for, can't work on MPAM.
>
>
> Where I have needed this value for MPAM is to pass the closid/rmid to another kernel
> interface. Because the user-space interface needs to be architecture agnostic, I proposed
> it as a u64 called 'id' that each architecture can encode/decode as appropriate. [0]
>
> To prevent user-space trying to base anything on the raw closid/rmid values, I went as far
> as obfuscating them with a random value picked at boot, to ensure scripts always read the
> current value when passing the control/monitor group.
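
For reference, the opaque 'id' idea described above could be sketched
roughly as follows. This is only an illustration under stated assumptions,
not the code from the commit referenced as [0]: the packing (closid in the
upper 32 bits, rmid in the lower 32 bits), the XOR obfuscation, and the
names id_key, group_id_encode() and group_id_decode() are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical obfuscation key. In a kernel this would be filled from the
 * random number generator once at boot; a constant is used here only so
 * that the sketch is self-contained.
 */
static uint64_t id_key = 0x9e3779b97f4a7c15ULL;

/* Pack closid/rmid into one u64 and hide the raw values behind the key. */
static uint64_t group_id_encode(uint32_t closid, uint32_t rmid)
{
	uint64_t raw = ((uint64_t)closid << 32) | rmid;

	return raw ^ id_key;
}

/* Undo the obfuscation and split the value back into closid/rmid. */
static void group_id_decode(uint64_t id, uint32_t *closid, uint32_t *rmid)
{
	uint64_t raw = id ^ id_key;

	*closid = (uint32_t)(raw >> 32);
	*rmid = (uint32_t)raw;
}
```

With a per-boot key, the value user-space reads for a group changes across
reboots, so a script cannot hard-code it and has to read the current value
before handing it to another interface.
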
>
>
> I'm curious what the raw value is useful for.

It is mostly for debugging when there are issues.

I think we need a way to print generic as well as architecture-specific
details.

Thanks
Babu
>
>
> Thanks,
>
> James
>
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.2&id=d568cf2ba58b7c4970ce41a8d4d6224e285a177e