Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

From: Shivappa Vikas
Date: Tue Jan 17 2017 - 21:40:39 EST




On Tue, 17 Jan 2017, Thomas Gleixner wrote:

On Fri, 6 Jan 2017, Vikas Shivappa wrote:
CQM (Cache Quality Monitoring) is part of Intel RDT (Resource Director
Technology), which enables monitoring and controlling of shared processor
resources via an MSR interface.

We know that already. No need for advertising this over and over.

Below are the issues and the fixes we attempt:

Emphasis on attempt.

- Issue(1): Inaccurate per-package and system-wide data. Just prints
zeros or arbitrary numbers.

Fix: Patches fix this by throwing an error if the mode is not supported.
The modes supported are task monitoring and cgroup monitoring.
Also, the per-package data for, say, socket x is returned with the
-C <cpu on socket x> -G cgrpy option.
The system-wide data can be looked up by monitoring the root cgroup.

Fine. That just lacks any comment in the implementation. Otherwise I would
not have asked the question about cpu monitoring. Though I fundamentally
hate the idea of requiring cgroups for this to work.

If I just want to look at CPU X why on earth do I have to set up all that
cgroup muck? Just because your main focus is cgroups?

The upstream per-CPU data is broken because it's not overriding the other task event RMIDs on that CPU with the CPU event RMID.

Can be fixed by adding a percpu struct to hold the RMID that's affinitized
to the CPU; however, we then miss all the task llc_occupancy data - still evaluating it.


- Issue(2): RMIDs are global and don't scale with more packages, and hence
we also run out of RMIDs very soon.

Fix: Support per-package RMIDs, which scale better with more
packages and provide more RMIDs to use when needed (i.e. when tasks
are actually scheduled on the package).

That's fine, just the implementation is completely broken.

- Issue(5): CAT and cqm/mbm write the same PQR_ASSOC MSR separately.
Fix: Integrate the sched-in code and write the PQR MSR only once per switch_to.

Brilliant stuff that. I bet that the separate MSR writes which we have now
are actually faster even in the worst case of two writes.

[PATCH 02/12] x86/cqm: Remove cqm recycling/conflict handling

That's the only undisputed and probably correct patch in that whole series.

Executive summary: The whole series except 2/12 is a complete trainwreck.

Major issues:

- Patch split is random and leads to non-compilable and non-functional
steps, aside from creating an unreviewable mess.

Will split, as per the comments, into steps that compile without warnings.
The series was tested for compile and build, but wasn't tested for
warnings (except for the int if_cgroup_event which you pointed out..).


- The core code is updated with random hooks which claim to be a generic
framework but are completely hardcoded for that particular CQM case.

The introduced hooks are neither fully documented (semantically and
functionally) nor justified as to why they are required.

- The code quality is horrible in coding style, technical correctness and
design.

So how can we proceed from here?

I want to see small patch series which change a certain aspect of the
implementation and only that. They need to be split in preparatory changes,
which refactor code or add new interfaces to the core code, and the actual
implementation in CQM/MBM.

Appreciate your time reviewing this and all the feedback.

Will plan to send a smaller patch series first, with the following contents, and then follow up with other series:

- The current 2/12 - adding the per-package RMIDs with the fixes mentioned.
- The reuse patch, which reuses the RMIDs that are freed from events.
- A patch to indicate an error if an RMID is still not available.
- Any relevant sched and CPU hotplug updates.

That way it's a functional set where tasks would be able to use different RMIDs on different packages and provide correct data for 'task events'. Either the data would be correct, or an error would indicate that we were limited by h/w RMIDs.
Is that a reasonable set to target?

Then follow up with separate series to get correct data for 'cgroup events', to support cgroup and task monitoring together, and to address the other issues (system-wide / per-CPU events, etc.).

Thanks,
Vikas


Each of the patches must have a proper changelog explaining the WHY and,
if required, the semantic properties of a new interface.

Each of the patches must compile without warnings/errors and be fully
functional vs. the changes they do.

Any attempt to resend this disaster as a whole will be NACKed w/o even
looking at it. I've wasted enough time with this wholesale approach in the
past and it did not work out. I'm not going to play that wasteful game
over and over again.

Thanks,

tglx