Re: [PATCH] platform/chrome: cros_ec_proto: Convert EC error codes to Linux error codes

From: Guenter Roeck
Date: Mon Jul 06 2020 - 17:38:03 EST


On 7/6/20 1:07 PM, Prashant Malani wrote:
> On Mon, Jul 6, 2020 at 12:41 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>>
>> On Mon, Jul 06, 2020 at 11:52:30AM -0700, Prashant Malani wrote:
>>> Hi Guenter,
>>>
>>> On Sat, Jul 04, 2020 at 07:26:07AM -0700, Guenter Roeck wrote:
>>>> The EC reports a variety of error codes. Most of those, with the exception
>>>> of EC_RES_INVALID_VERSION, are converted to -EPROTO. As result, the actual
>>>> error code gets lost. Convert all EC errors to Linux error codes to report
>>>> a more meaningful error to the caller to aid debugging.
>>>>
>>>> Cc: Yu-Hsuan Hsu <yuhsuan@xxxxxxxxxxxx>
>>>> Cc: Prashant Malani <pmalani@xxxxxxxxxxxx>
>>>> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>>>> ---
>>>> drivers/platform/chrome/cros_ec_proto.c | 37 +++++++++++++++++++------
>>>> 1 file changed, 29 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
>>>> index 3e745e0fe092..10aa9e483d35 100644
>>>> --- a/drivers/platform/chrome/cros_ec_proto.c
>>>> +++ b/drivers/platform/chrome/cros_ec_proto.c
>>>> @@ -543,6 +543,29 @@ int cros_ec_cmd_xfer(struct cros_ec_device *ec_dev,
>>>> }
>>>> EXPORT_SYMBOL(cros_ec_cmd_xfer);
>>>>
>>>> +static const int cros_ec_error_map[] = {
>>>> + [EC_RES_INVALID_COMMAND] = -EOPNOTSUPP,
>>>> + [EC_RES_ERROR] = -EIO,
>>>> + [EC_RES_INVALID_PARAM] = -EINVAL,
>>>> + [EC_RES_ACCESS_DENIED] = -EACCES,
>>>> + [EC_RES_INVALID_RESPONSE] = -EPROTO,
>>>> + [EC_RES_INVALID_VERSION] = -ENOTSUPP,
>>>> + [EC_RES_INVALID_CHECKSUM] = -EBADMSG,
>>>> + [EC_RES_IN_PROGRESS] = -EINPROGRESS,
>>>> + [EC_RES_UNAVAILABLE] = -ENODATA,
>>>> + [EC_RES_TIMEOUT] = -ETIMEDOUT,
>>>> + [EC_RES_OVERFLOW] = -EOVERFLOW,
>>>> + [EC_RES_INVALID_HEADER] = -EBADR,
>>>> + [EC_RES_REQUEST_TRUNCATED] = -EBADR,
>>>> + [EC_RES_RESPONSE_TOO_BIG] = -EFBIG,
>>>> + [EC_RES_BUS_ERROR] = -EFAULT,
>>>> + [EC_RES_BUSY] = -EBUSY,
>>>> + [EC_RES_INVALID_HEADER_VERSION] = -EBADMSG,
>>>> + [EC_RES_INVALID_HEADER_CRC] = -EBADMSG,
>>>> + [EC_RES_INVALID_DATA_CRC] = -EBADMSG,
>>>> + [EC_RES_DUP_UNAVAILABLE] = -ENODATA,
>>>> +};
>>>> +
>>>> /**
>>>> * cros_ec_cmd_xfer_status() - Send a command to the ChromeOS EC.
>>>> * @ec_dev: EC device.
>>>> @@ -555,8 +578,7 @@ EXPORT_SYMBOL(cros_ec_cmd_xfer);
>>>> *
>>>> * Return:
>>>> * >=0 - The number of bytes transferred
>>>> - * -ENOTSUPP - Operation not supported
>>>> - * -EPROTO - Protocol error
>>>> + * <0 - Linux error code
>>>> */
>>>> int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
>>>> struct cros_ec_command *msg)
>>>> @@ -566,13 +588,12 @@ int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
>>>> ret = cros_ec_cmd_xfer(ec_dev, msg);
>>>> if (ret < 0) {
>>>> dev_err(ec_dev->dev, "Command xfer error (err:%d)\n", ret);
>>>> - } else if (msg->result == EC_RES_INVALID_VERSION) {
>>>> - dev_dbg(ec_dev->dev, "Command invalid version (err:%d)\n",
>>>> - msg->result);
>>>> - return -ENOTSUPP;
>>>> } else if (msg->result != EC_RES_SUCCESS) {
>>>> - dev_dbg(ec_dev->dev, "Command result (err: %d)\n", msg->result);
>>>> - return -EPROTO;
>>>> + if (msg->result < ARRAY_SIZE(cros_ec_error_map) && cros_ec_error_map[msg->result])
>>>
>>> Do we expect a case where cros_ec_error_map[msg->result] == 0?
>>>
>>
>> It seemed to be prudent to assume that this code is not going to be
>> updated whenever a new EC error code is added. Doing nothing would
>> risk returning 0, and addding WARN_ON or dev_warn seemed excessive.
>> Having said that, I don't really have a strong opinion one way
>> or another, and I'll be happy to change the code to whatever people
>> think is appropriate.
>
> Thanks for providing the rationale. I think if a new EC error code is
> added (and this array isn't updated),
> msg->result < ARRAY_SIZE(cros_ec_error_map) would return false, and
> the code block would return -EPROTO.
>

Some scenarios:

Developer 1 adds EC_RES_SOME_ERROR, and does not update the array.
Developer 2 adds EC_RES_SOME_OTHER_ERROR and updates the array, but
does not realize that EC_RES_SOME_ERROR is missing as well, and does
not add it.
Developer 3 adds two (or more) error codes, and does not update the
array. Someone else later finds a -EPROTO return code and adds the
necessary translation to the array. That translation happens to be
for the last error code. The developer doing that does not realize
that other error codes are missing as well, or does not realize
the impact, and does not add translations for the other missing
error codes.

Overall there are too many situations where this can go wrong for me
to trust that it never will.

> I'll defer to the maintainer's opinion(s), but I think we can remove
> the condition after '&&'.
>

I thought about it, but I find that I don't feel comfortable with
doing that. If that is what is asked for, would you mind providing
a separate patch which doesn't have my name on it ?

Thanks,
Guenter