Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU

From: Naveen N. Rao
Date: Tue Feb 25 2020 - 05:22:46 EST


Gautham R Shenoy wrote:
On Fri, Feb 21, 2020 at 10:50:12AM -0600, Nathan Lynch wrote:
"Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx> writes:
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 80a676d..5b4b450 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -19,6 +19,7 @@
> #include <asm/smp.h>
> #include <asm/pmc.h>
> #include <asm/firmware.h>
> +#include <asm/idle.h>
> #include <asm/svm.h>
> > #include "cacheinfo.h"
> @@ -733,6 +734,42 @@ static void create_svm_file(void)
> }
> #endif /* CONFIG_PPC_SVM */
> > +static void read_idle_purr(void *val)
> +{
> + u64 *ret = (u64 *)val;

No cast from void* needed.

Will fix this. Thanks.



> +
> + *ret = read_this_idle_purr();
> +}
> +
> +static ssize_t idle_purr_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct cpu *cpu = container_of(dev, struct cpu, dev);
> + u64 val;
> +
> + smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
> + return sprintf(buf, "%llx\n", val);
> +}
> +static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
> +
> +static void read_idle_spurr(void *val)
> +{
> + u64 *ret = (u64 *)val;
> +
> + *ret = read_this_idle_spurr();
> +}
> +
> +static ssize_t idle_spurr_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct cpu *cpu = container_of(dev, struct cpu, dev);
> + u64 val;
> +
> + smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
> + return sprintf(buf, "%llx\n", val);
> +}
> +static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);

It's regrettable that we have to wake up potentially idle CPUs in order
to derive correct idle statistics for them, but I suppose the main user
(lparstat) of these interfaces already is causing this to happen by
polling the existing per-cpu purr and spurr attributes.

So now lparstat will incur at minimum four syscalls and four IPIs per
CPU per polling interval -- one for each of purr, spurr, idle_purr and
idle_spurr. Correct?

Yes, it is unforunate that we will end up making four syscalls and
generating IPI noise, and this is something that I discussed with
Naveen and Kamalesh. We have the following two constraints:

1) These values of PURR and SPURR required are per-cpu. Hence putting
them in lparcfg is not an option.

2) sysfs semantics encourages a single value per key, the key being
the sysfs-file. Something like the following would have made far more
sense.

cat /sys/devices/system/cpu/cpuX/purr_spurr_accounting
purr:A
idle_purr:B
spurr:C
idle_spurr:D

There are some sysfs files which allow something like this. Eg: /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state

Thoughts on any other alternatives?

Umm... procfs?
/me ducks




At some point it's going to make sense to batch sampling of remote CPUs'
SPRs.

How did you mean this? It looks like we first need to provide a separate user interface, since with the existing sysfs interface providing separate files, I am not sure if we can batch such reads.


- Naveen