Re: [PATCH MPAM 6.16.rc1] arm_mpam: Enforce to recalculate mbw_min on configuration

From: Gavin Shan
Date: Mon Jul 21 2025 - 00:59:23 EST


Hi James,

On 7/14/25 8:15 PM, James Morse wrote:
On 26/06/2025 10:52, Gavin Shan wrote:
mpam_feat_mbw_max is exposed to user by resctrlfs, but mpam_feat_mbw_min
should be recalculated based on the maximal memory bandwidth in every
configuration, which has been missed unfortunately. When a new group is
created, the default and maximal memory bandwidth percentage (100%) is
configured, the configuration instance (struct mpam_config) on the stack
is updated, including the minimal and maximal memory bandwidth. It's
copied to the domain's configuration instance. On next time when user
tries to configure by writing 'schemata', the minimal memory bandwidth
isn't never recalculated because mpam_feat_mbw_min has been seen in the
configuration, which inherits from the domain's instance.

For example, the value of register MPAMCFG_MBW_MIN is never changed no
matter what configuration is set.

Thanks for catching this. I think this was introduced by the patch that picked
the component copy of the configuration, modified it - and passed it back into
the driver. That had the additional side effect of changes going missing as the
mpam_update_config() had the same config twice...

I fixed that shortly after rc1, with that fixed I don't think this can happen -
resctrl never sets the 'min', meaning mpam_extent_config() will always regenerate
the value.

This shouldn't happen with the latest version of the tree.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpam/snapshot/v6.16-rc4


The issue still exists in v6.16-rc4 and v6.16-rc5. Here are the dumped registers
on the varied MBW configurations, and MPAMCFG_MBW_MIN doesn't alter.

# uname -r
6.16.0-rc5-gavin
# mount -tresctrl none /sys/fs/resctrl
# mkdir -p /sys/fs/resctrl/all
# mkdir -p /sys/fs/resctrl/test
# cd /sys/fs/resctrl/test

# echo "MB:1=100" > schemata
# cat /proc/dump
MPAMCFG_PART_SEL 00000002
MAPMF_MBW_IDR 00000c07
MPAMCFG_MBW_MAX 0000ffff
MPAMCFG_MBW_MIN 0000f000

# echo "MB:1=2" > schemata
# cat /proc/dump
MPAMCFG_PART_SEL 00000002
MAPMF_MBW_IDR 00000c07
MPAMCFG_MBW_MAX 000005ff
MPAMCFG_MBW_MIN 0000f000

The problem is mpam_feat_mbw_min is always set in 'struct mpam_config'. With this,
cfg->mbw_min won't be recalculated. I guess we probably just need to simply drop
the check to that cfg->mbw_min can be recalculated every time? Could you please
fix it up in place if the resolution looks good to you, I also can post (v2) to
include the changes if you prefer. I don't know how it can be handled since the
code isn't merged to upstream yet.

static void mpam_extend_config(struct mpam_class *class, struct mpam_config *cfg)
{
:
/* The mpam_feat_mbw_min needs to be dropped */
if (mpam_has_feature(mpam_feat_mbw_max, cfg) &&
!mpam_has_feature(mpam_feat_mbw_min, cfg)) {
delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
if (cfg->mbw_max > delta)
min = cfg->mbw_max - delta;
else
min = 0;

cfg->mbw_min = max(min, min_hw_granule);
mpam_set_feature(mpam_feat_mbw_min, cfg);
}

if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, class) &&
cfg->mbw_min <= min_hw_granule) {
cfg->mbw_min = min_hw_granule + 1;
mpam_set_feature(mpam_feat_mbw_min, cfg);
}
}


diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
b/drivers/platform/arm64/mpam/mpam_devices.c
index df8730491de2..4845dcb8e601 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -3192,6 +3192,9 @@ static void mpam_extend_config(struct mpam_class *class, struct
mpam_config *cfg
u16 min, min_hw_granule, delta;
u16 max_hw_value, res0_bits;

+ if (!mpam_has_feature(mpam_feat_mbw_max, cfg))
+ return;

N.B. - The T421 quirk needs to ensure the minimum value is never set to 0 - which is what
mpam_reprogram_ris_partid() will do if there is no minimum config set.

The aim of the quirk is that on affected platforms the mbm_min will always be set, and
never be zero in any configuration. Whether it then gets applied to the hardware depends
on whether the hardware has the feature.
This makes it simpler to think about - and easier to test on platforms that aren't
affected by the errata.

I'd prefer not to to try and spot configurations where its going to matter as it will be
hard to debug the workaround going wrong!


Ok, thanks for the explanation. In that case, we shouldn't bail early. The mpam_feat_mbw_min
check in mpam_extend_config() needs to be dropped, as explained above.


@@ -3211,23 +3214,17 @@ static void mpam_extend_config(struct mpam_class *class, struct
mpam_config *cfg
*
* Resctrl can only configure the MAX.
*/
- if (mpam_has_feature(mpam_feat_mbw_max, cfg) &&
- !mpam_has_feature(mpam_feat_mbw_min, cfg)) {
- delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
- if (cfg->mbw_max > delta)
- min = cfg->mbw_max - delta;
- else
- min = 0;
+ delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
+ if (cfg->mbw_max > delta)
+ min = cfg->mbw_max - delta;
+ else
+ min = 0;

- cfg->mbw_min = max(min, min_hw_granule);
- mpam_set_feature(mpam_feat_mbw_min, cfg);
- }
+ cfg->mbw_min = max(min, min_hw_granule);

While resctrl doesn't use it - the idea is the mpam_devices interface should support any
mpam control being set. This overwrites what the user asked for.

The devices/resctrl split might look strange - but its so the resctrl ABI implications are
contained to one file, and another user of the mpam_devices interface can be added later.
(KVM in-kernel MSC emulation was one proposal, another came from people who believe they
need to change the hardware policy more frequently than they can with a syscall ... I
don't like either of these!)


hmm, it's intresting to know we can't tolerate the time consumed to finish a syscall.
The MPAM would have static configurations, meaning it's static after the configuration
is given. Could you share who is looking into MPAM virtualization? Actually, David
and I also looked into MPAM virtualization support, but we had everything emulated in
QEMU in the draft design.

Thanks,
Gavin


Thanks,

James