RE: [PATCH v4] x86/resctrl: Fix miscount of bandwidth event when reactivating previously Unavailable RMID

From: Luck, Tony

Date: Mon Oct 13 2025 - 11:35:28 EST


> > The behavior of the counter is different on Intel where there are enough
> > counters backing the RMID and the "Unavailable" bit is not set when counter
> > starts counting but instead the counter returns "0". For example, when

Note that the h/w counter doesn't really return "0" (except for the first time
after CPU reset).

> > running equivalent of "step 1" on an Intel system it looks like:
> >
> > # cd /sys/fs/resctrl
> > # mkdir mon_groups/test1

While making the directory, mon_add_all_files() does this:

	if (!do_sum && resctrl_is_mbm_event(mevt->evtid))
		mon_event_read(&rr, r, d, prgrp, &d->hdr.cpu_mask, mevt->evtid, true);

That read ends up in __mon_event_count(), which does:

	if (rr->first) {
		if (rr->is_mbm_cntr)
			resctrl_arch_reset_cntr(rr->r, rr->d, closid, rmid, cntr_id, rr->evtid);
		else
			resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
		if (m)
			memset(m, 0, sizeof(struct mbm_state));
		return 0;
	}

If you dig into resctrl_arch_reset_rmid() you will see that it reads the h/w
counter, and that value becomes the starting point for the values subsequently
reported when a user reads the resctrl event file.
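
As a rough illustration of that baseline idea, here is a simplified userspace
model. It is not the kernel code: the sim_* names are made up for this sketch,
and it ignores counter width/overflow handling and the scaling from chunks to
bytes.

```c
#include <stdint.h>

/* Hypothetical stand-in for a free-running h/w bandwidth counter. */
struct sim_hw_counter {
	uint64_t raw;
};

/* Hypothetical per-RMID software state kept alongside the h/w counter. */
struct sim_arch_state {
	uint64_t prev_raw;	/* last raw h/w value that was sampled */
	uint64_t total;		/* running total reported to the user */
};

/*
 * Model of the "first" read at mkdir time: clear the software state and
 * record the current raw h/w value as the baseline.
 */
static void sim_reset_rmid(struct sim_arch_state *st,
			   const struct sim_hw_counter *hw)
{
	st->total = 0;
	st->prev_raw = hw->raw;
}

/*
 * Model of a later read of the event file: add the delta since the
 * previous sample, so the baseline value itself is never reported.
 */
static uint64_t sim_read_rmid(struct sim_arch_state *st,
			      const struct sim_hw_counter *hw)
{
	st->total += hw->raw - st->prev_raw;
	st->prev_raw = hw->raw;
	return st->total;
}
```

In this model a read immediately after the reset reports 0 even though the raw
counter is non-zero, and a later read reports only the traffic since the reset.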

> > # echo $$ > mon_groups/test1/tasks
> > # cat mon_groups/test1/mon_data/*/mbm_total_bytes
> > 0
> > 1835008
> >
>
> Thanks. That is good to know.

-Tony