Re: [PATCH v2] ASoC: cros_ec_codec: Log results when EC commands fail

From: Guenter Roeck
Date: Fri Jul 03 2020 - 11:58:50 EST


On Fri, Jul 3, 2020 at 3:56 AM Enric Balletbo i Serra
<enric.balletbo@xxxxxxxxxxxxx> wrote:
>
> Hi Yu-Hsuan,
>
> On 3/7/20 11:40, Yu-Hsuan Hsu wrote:
> > Enric Balletbo i Serra <enric.balletbo@xxxxxxxxxxxxx> æ 2020å7æ3æ éä äå5:19åéï
> >>
> >> Hi Yu-Hsuan,
> >>
> >> On 3/7/20 10:48, Yu-Hsuan Hsu wrote:
> >>> Enric Balletbo i Serra <enric.balletbo@xxxxxxxxxxxxx> æ 2020å7æ3æ éä äå4:38åéï
> >>>>
> >>>> Hi Yu-Hsuan,
> >>>>
> >>>> Thank you for your patch
> >>>>
> >>>> On 3/7/20 9:19, Yu-Hsuan Hsu wrote:
> >>>>> Log results of failed EC commands to identify a problem more easily.
> >>>>>
> >>>>> Replace cros_ec_cmd_xfer_status with cros_ec_cmd_xfer because the result
> >>>>> has already been checked in this function. The wrapper is not needed.
> >>>>>
> >>>>
> >>>> Nack, we did an effort to remove all public users of cros_ec_cmd_xfer() in
> >>>> favour of cros_ec_cmd_xfer_status() and you are reintroducing again. You can do
> >>>> the same but using cros_ec_cmd_xfer_status(). In fact, your patch will not build
> >>>> on top of the upcoming changes.
> >>> Thanks! But I have a question about implementing it. Does it look like
> >>> the one below?
> >>> ret = cros_ec_cmd_xfer_status(ec_dev, msg);
> >>> if (ret < 0) {
> >>
> >> In this case will already print an error.
> >>
> >> What are you trying to achieve?
> >>
> >> If the only reason is of this patch is print a message you should either, or
> >> enable dynamic printk and enable dev_dbg or event better use the kernel trace
> >> functionality. There is no need to be more verbose.
> >>
> >> Example:
> >> $ echo 1 > /sys/kernel/debug/tracing/events/cros_ec/enable
> >> $ cat /sys/kernel/debug/tracing/trace
> >>
> >> 369.416372: cros_ec_request_start: version: 0, command: EC_CMD_USB_PD_POWER_INFO
> >> 369.420528: cros_ec_request_done: version: 0, command:
> >> EC_CMD_USB_PD_POWER_INFO, ec result: EC_RES_SUCCESS, retval: 16
> >>
> >> Cheers,
> >> Enric
> >>
> > Thank Enric,
> >
> > The situation is that some users encountered errors on ChromeBook.
>
> And, aren't you able to reproduce the issue?
>
>
> > From their feedback reports, we only get the message like
> > 'cros-ec-codec GOOG0013:00: ASoC: Failed to set DAI format: -71'.
> > We know that -71 is -EPROTO but it is not clear enough for us to find
> > out the root cause. That's why we want the detail of the result.
>
>
> If I am not mistaken this ends calling i2s_rx_set_daifmt() into the EC firmware,
> if the result is -EPROTO that means is not returning EC_RES_SUCCESS, so there
> are few options:
>
> if (i2s_rx_enabled)
> return EC_RES_BUSY;
>
> if (daifmt >= EC_CODEC_I2S_RX_DAIFMT_COUNT)
> return EC_RES_INVALID_PARAM;
>
> if (audio_codec_i2s_rx_set_daifmt(daifmt) != EC_SUCCESS)
> return EC_RES_ERROR;
>
> > Because the situation happens on users' side, it is not possible for
> > them to enable kernel trace (ChromeOS does not allow users to touch
> > kernel).
> >
>
> Are you sure that when you know the error code you'll find the root cause
> (without adding more prints)? There is only three possibilities? You can't start
> adding prints just to debug a user issue because you don't allow to be more
> verbose. I understand that might help you but is not the way to go.
>
> You should really reproduce the issue yourself an use actual debug
> tools/prints./traces.
>

Another possibility would be to change cros_ec_cmd_xfer_status() to
return a more granular error code, such as -EINVAL for
EC_RES_INVALID_PARAM, -EBUSY for EC_RES_BUSY, -EINPROGRESS for
EC_RES_IN_PROGRESS, -ETIMEDOUT for EC_RES_TIMEOUT, -EOVERFLOW for
EC_RES_OVERFLOW, -ENODATA for EC_RES_UNAVAILABLE, and so on.

However, it appears that the various low level functions already
replace various EC error codes with a blank EC_RES_ERROR. No amount of
logging will tell us what exactly went wrong in those functions. Lucky
for us, audio_codec_i2s_rx_set_daifmt() only ever returns EC_SUCCESS,
so we know that the problem is either that i2s_rx_enabled is true or
that daifmt is too large. None of those really warrants more verbose
logging.

>From the context, my personal bet is that i2s_rx_enabled is true: I
don't immediately see how disabling it is enforced before trying to
set the DAI format, and I don't see how "daifmt >=
EC_CODEC_I2S_RX_DAIFMT_COUNT" can ever be true.

Guenter


Guenter

> Cheers,
> Enric
>
> > The other way we thought is changing dev_dbg to dev_err in
> > cros_ec_cmd_xfer_status. But we are not sure whether it is also an
> > error for other usages.
> >
> >>> if (ret == -EPROTO)
> >>> dev_err(..., msg->result)
> >>> goto error;
> >>> }
> >>> I'm not sure whether it makes sense to check ret == -EPROTO here.
> >>>
> >>>>
> >>>>> Signed-off-by: Yu-Hsuan Hsu <yuhsuan@xxxxxxxxxxxx>
> >>>>> ---
> >>>>> sound/soc/codecs/cros_ec_codec.c | 9 ++++++++-
> >>>>> 1 file changed, 8 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/sound/soc/codecs/cros_ec_codec.c b/sound/soc/codecs/cros_ec_codec.c
> >>>>> index 8d45c628e988e..a4ab62f59efa6 100644
> >>>>> --- a/sound/soc/codecs/cros_ec_codec.c
> >>>>> +++ b/sound/soc/codecs/cros_ec_codec.c
> >>>>> @@ -90,10 +90,17 @@ static int send_ec_host_command(struct cros_ec_device *ec_dev, uint32_t cmd,
> >>>>> if (outsize)
> >>>>> memcpy(msg->data, out, outsize);
> >>>>>
> >>>>> - ret = cros_ec_cmd_xfer_status(ec_dev, msg);
> >>>>> + ret = cros_ec_cmd_xfer(ec_dev, msg);
> >>>>> if (ret < 0)
> >>>>> goto error;
> >>>>>
> >>>>> + if (msg->result != EC_RES_SUCCESS) {
> >>>>> + dev_err(ec_dev->dev, "Command %d failed: %d\n", cmd,
> >>>>> + msg->result);
> >>>>> + ret = -EPROTO;
> >>>>> + goto error;
> >>>>> + }
> >>>>> +
> >>>>> if (insize)
> >>>>> memcpy(in, msg->data, insize);
> >>>>>
> >>>>>