RE: [RFC PATCH v2] coresight: etm4x: Modify core-commit of cpu to avoid the overflow of HiSilicon ETM

From: Al Grant
Date: Fri Aug 28 2020 - 05:00:34 EST


Hi Mathieu and CS maintainers,

> Hi Liu,
>
> On Wed, Aug 19, 2020 at 04:06:37PM +0800, Qi Liu wrote:
> > When too much trace information is generated on-chip, the ETM will
> > overflow, and cause data loss. This is a common phenomenon on ETM
> > devices.
> >
> > But sometimes we do not want to lose performance trace data, so we
> > suppress the speed of instructions sent from CPU core to ETM to avoid
> > the overflow of ETM.
> >
> > Signed-off-by: Qi Liu <liuqi115@xxxxxxxxxx>
> > ---
> >
> > Changes since v1:
> > - ETM on HiSilicon Hip09 platform supports backpressure, so does not
> > need to modify core commit.
> >
> > drivers/hwtracing/coresight/coresight-etm4x.c | 43
> > +++++++++++++++++++++++++++
> > 1 file changed, 43 insertions(+)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c
> > b/drivers/hwtracing/coresight/coresight-etm4x.c
> > index 7797a57..7641f89 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x.c
> > @@ -43,6 +43,10 @@ MODULE_PARM_DESC(boot_enable, "Enable tracing
> on boot");
> > #define PARAM_PM_SAVE_NEVER 1 /* never save any state */
> > #define PARAM_PM_SAVE_SELF_HOSTED 2 /* save self-hosted state only */
> >
> > +#define CORE_COMMIT_CLEAR 0x3000
> > +#define CORE_COMMIT_SHIFT 12
> > +#define HISI_ETM_AMBA_ID_V1 0x000b6d01
> > +
> > static int pm_save_enable = PARAM_PM_SAVE_FIRMWARE;
> > module_param(pm_save_enable, int, 0444);
> > MODULE_PARM_DESC(pm_save_enable, @@ -104,11 +108,40 @@ struct
> > etm4_enable_arg {
> > int rc;
> > };
> >
> > +static void etm4_cpu_actlr1_cfg(void *info) {
> > + struct etm4_enable_arg *arg = (struct etm4_enable_arg *)info;
> > + u64 val;
> > +
> > + asm volatile("mrs %0,s3_1_c15_c2_5" : "=r"(val));
> > + val &= ~CORE_COMMIT_CLEAR;
> > + val |= arg->rc << CORE_COMMIT_SHIFT;
> > + asm volatile("msr s3_1_c15_c2_5,%0" : : "r"(val)); }
> > +
> > +static void etm4_config_core_commit(int cpu, int val) {
> > + struct etm4_enable_arg arg = {0};
> > +
> > + arg.rc = val;
> > + smp_call_function_single(cpu, etm4_cpu_actlr1_cfg, &arg, 1);
>
> Function etm4_enable/disable_hw() are already running on the CPU they are
> supposed to so no need to call smp_call_function_single().
>
> > +}
> > +
> > static int etm4_enable_hw(struct etmv4_drvdata *drvdata) {
> > int i, rc;
> > + struct amba_device *adev;
> > struct etmv4_config *config = &drvdata->config;
> > struct device *etm_dev = &drvdata->csdev->dev;
> > + struct device *dev = drvdata->csdev->dev.parent;
> > +
> > + adev = container_of(dev, struct amba_device, dev);
> > + /*
> > + * If ETM device is HiSilicon ETM device, reduce the
> > + * core-commit to avoid ETM overflow.
> > + */
> > + if (adev->periphid == HISI_ETM_AMBA_ID_V1)
>
> Do you have any documentation on this back pressure feature? I doubt this is
> specific to Hip09 platform and as such would prefer to have a more generic
> approach that works on any platform that supports it.

It's not a standard Arm architecture feature, this is a model-specific register.
Some cores may be able to throttle throughput under user control,
but this isn't standardized. It may (as in this case) be something that you
want to enable whenever you enable ETM - and, I guess, disable whenever
you disable ETM. It's a bit unclean to have model-specific code in the main
ETM driver, and names like CORE_COMMIT_CLEAR really ought to be more
specific, but I don't see that it's more ugly than the model-specific code in
e.g. arch/arm64/kernel/perf_event.c or its equivalent on other architectures.

Ideally, a core that has an inherent difficulty generating ETM at full speed,
i.e. where the ETM can't keep up with the core pipeline, would reduce
commit rate automatically, and some already do. So if this core needs it
to be done manually via a system register, we might allow that in the
same way as we might allow other core-specific actions that need to be
done to enable ETM.

There are of course issues with trace overflow at all stages up to and
including harvesting trace from buffers into perf.data (for which solutions
might involve DVFS, trace strobing, scheduling etc.), and I am assuming
this patch is not addressing those but dealing with a very specific concern
about overflow within the core and its ETM.

Al


>
> Anyone on the CS mailing list that knows what this is about?
>
> Thanks,
> Mathieu
>
> > + etm4_config_core_commit(drvdata->cpu, 1);
> >
> > CS_UNLOCK(drvdata->base);
> >
> > @@ -472,10 +505,20 @@ static void etm4_disable_hw(void *info) {
> > u32 control;
> > struct etmv4_drvdata *drvdata = info;
> > + struct device *dev = drvdata->csdev->dev.parent;
> > struct etmv4_config *config = &drvdata->config;
> > struct device *etm_dev = &drvdata->csdev->dev;
> > + struct amba_device *adev;
> > int i;
> >
> > + adev = container_of(dev, struct amba_device, dev);
> > + /*
> > + * If ETM device is HiSilicon ETM device, resume the
> > + * core-commit after ETM trace is complete.
> > + */
> > + if (adev->periphid == HISI_ETM_AMBA_ID_V1)
> > + etm4_config_core_commit(drvdata->cpu, 0);
> > +
> > CS_UNLOCK(drvdata->base);
> >
> > if (!drvdata->skip_power_up) {
> > --
> > 2.8.1
> >
> _______________________________________________
> CoreSight mailing list
> CoreSight@xxxxxxxxxxxxxxxx
> https://lists.linaro.org/mailman/listinfo/coresight