Re: [PATCH] fpga: mgr: Update the state to provide the exact error code

From: Marco Pagani
Date: Fri Feb 17 2023 - 07:06:06 EST




On 2023-02-13 06:50, Manne, Nava kishore wrote:
> Hi Marco,
>
> Please find my response inline.
>
>> -----Original Message-----
>> From: Marco Pagani <marpagan@xxxxxxxxxx>
>> Sent: Thursday, February 9, 2023 3:52 AM
>> To: Manne, Nava kishore <nava.kishore.manne@xxxxxxx>
>> Cc: Nava kishore Manne <nava.manne@xxxxxxxxxx>; mdf@xxxxxxxxxx;
>> hao.wu@xxxxxxxxx; trix@xxxxxxxxxx; yilun.xu@xxxxxxxxx; linux-
>> fpga@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>> Subject: Re: [PATCH] fpga: mgr: Update the state to provide the exact error
>> code
>>
>>
>> On 2023-02-08 12:01, Manne, Nava kishore wrote:
>>> Hi Marco,
>>>
>>> Thanks for providing the review comments.
>>> Please find my response inline below.
>>>
>>>> -----Original Message-----
>>>> From: Marco Pagani <marpagan@xxxxxxxxxx>
>>>> Sent: Wednesday, February 8, 2023 12:04 AM
>>>> To: Nava kishore Manne <nava.manne@xxxxxxxxxx>
>>>> Cc: Manne, Nava kishore <nava.kishore.manne@xxxxxxx>;
>> mdf@xxxxxxxxxx;
>>>> hao.wu@xxxxxxxxx; trix@xxxxxxxxxx; yilun.xu@xxxxxxxxx;
>>>> linux-fpga@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>>>> Subject: Re: [PATCH] fpga: mgr: Update the state to provide the exact
>>>> error code
>>>>
>>>>
>>>> On 2023-02-07 10:59, Nava kishore Manne wrote:
>>>>> From: Nava kishore Manne <nava.manne@xxxxxxxxxx>
>>>>>
>>>>> Up on fpga configuration failure, the existing sysfs state interface
>>>>> is just providing the generic error message rather than providing
>>>>> the exact error code. This patch extends sysfs state interface to
>>>>> provide the exact error received from the lower layer along with the
>>>>> existing generic error message.
>>>>>
>>>>> Signed-off-by: Nava kishore Manne <nava.manne@xxxxxxxxxx>
>>>>> ---
>>>>> drivers/fpga/fpga-mgr.c | 20 +++++++++++++++++++-
>>>>> include/linux/fpga/fpga-mgr.h | 2 ++
>>>>> 2 files changed, 21 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/fpga/fpga-mgr.c b/drivers/fpga/fpga-mgr.c index
>>>>> 8efa67620e21..b2d74705a5a2 100644
>>>>> --- a/drivers/fpga/fpga-mgr.c
>>>>> +++ b/drivers/fpga/fpga-mgr.c
>>>>> @@ -61,12 +61,14 @@ static inline int fpga_mgr_write_complete(struct
>>>>> fpga_manager *mgr, {
>>>>> int ret = 0;
>>>>>
>>>>> + mgr->err = 0;
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE;
>>>>> if (mgr->mops->write_complete)
>>>>> ret = mgr->mops->write_complete(mgr, info);
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error after writing image data to
>>>> FPGA\n");
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
>>>>> + mgr->err = ret;
>>>>> return ret;
>>>>> }
>>>>> mgr->state = FPGA_MGR_STATE_OPERATING; @@ -154,6 +156,7 @@
>>>> static
>>>>> int fpga_mgr_parse_header_mapped(struct fpga_manager *mgr, {
>>>>> int ret;
>>>>>
>>>>> + mgr->err = 0;
>>>>> mgr->state = FPGA_MGR_STATE_PARSE_HEADER;
>>>>> ret = fpga_mgr_parse_header(mgr, info, buf, count);
>>>>>
>>>>> @@ -165,6 +168,7 @@ static int
>> fpga_mgr_parse_header_mapped(struct
>>>> fpga_manager *mgr,
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error while parsing FPGA image
>>>> header\n");
>>>>> mgr->state = FPGA_MGR_STATE_PARSE_HEADER_ERR;
>>>>> + mgr->err = ret;
>>>>> }
>>>>>
>>>>> return ret;
>>>>> @@ -185,6 +189,7 @@ static int fpga_mgr_parse_header_sg_first(struct
>>>> fpga_manager *mgr,
>>>>> int ret;
>>>>>
>>>>> mgr->state = FPGA_MGR_STATE_PARSE_HEADER;
>>>>> + mgr->err = 0;
>>>>>
>>>>> sg_miter_start(&miter, sgt->sgl, sgt->nents, SG_MITER_FROM_SG);
>>>>> if (sg_miter_next(&miter) &&
>>>>> @@ -197,6 +202,7 @@ static int fpga_mgr_parse_header_sg_first(struct
>>>> fpga_manager *mgr,
>>>>> if (ret && ret != -EAGAIN) {
>>>>> dev_err(&mgr->dev, "Error while parsing FPGA image
>>>> header\n");
>>>>> mgr->state = FPGA_MGR_STATE_PARSE_HEADER_ERR;
>>>>> + mgr->err = ret;
>>>>> }
>>>>>
>>>>> return ret;
>>>>> @@ -249,6 +255,7 @@ static void *fpga_mgr_parse_header_sg(struct
>>>> fpga_manager *mgr,
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error while parsing FPGA image
>>>> header\n");
>>>>> mgr->state = FPGA_MGR_STATE_PARSE_HEADER_ERR;
>>>>> + mgr->err = ret;
>>>>> kfree(buf);
>>>>> buf = ERR_PTR(ret);
>>>>> }
>>>>> @@ -272,6 +279,7 @@ static int fpga_mgr_write_init_buf(struct
>>>> fpga_manager *mgr,
>>>>> size_t header_size = info->header_size;
>>>>> int ret;
>>>>>
>>>>> + mgr->err = 0;
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_INIT;
>>>>>
>>>>> if (header_size > count)
>>>>> @@ -284,6 +292,7 @@ static int fpga_mgr_write_init_buf(struct
>>>> fpga_manager *mgr,
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error preparing FPGA for writing\n");
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
>>>>> + mgr->err = ret;
>>>>> return ret;
>>>>> }
>>>>>
>>>>> @@ -370,6 +379,7 @@ static int fpga_mgr_buf_load_sg(struct
>>>>> fpga_manager *mgr,
>>>>>
>>>>> /* Write the FPGA image to the FPGA. */
>>>>> mgr->state = FPGA_MGR_STATE_WRITE;
>>>>> + mgr->err = 0;
>>>>> if (mgr->mops->write_sg) {
>>>>> ret = fpga_mgr_write_sg(mgr, sgt);
>>>>> } else {
>>>>> @@ -405,6 +415,7 @@ static int fpga_mgr_buf_load_sg(struct
>>>> fpga_manager *mgr,
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error while writing image data to
>>>> FPGA\n");
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_ERR;
>>>>> + mgr->err = ret;
>>>>> return ret;
>>>>> }
>>>>>
>>>>> @@ -437,10 +448,12 @@ static int fpga_mgr_buf_load_mapped(struct
>>>> fpga_manager *mgr,
>>>>> * Write the FPGA image to the FPGA.
>>>>> */
>>>>> mgr->state = FPGA_MGR_STATE_WRITE;
>>>>> + mgr->err = 0;
>>>>> ret = fpga_mgr_write(mgr, buf, count);
>>>>> if (ret) {
>>>>> dev_err(&mgr->dev, "Error while writing image data to
>>>> FPGA\n");
>>>>> mgr->state = FPGA_MGR_STATE_WRITE_ERR;
>>>>> + mgr->err = ret;
>>>>> return ret;
>>>>> }
>>>>>
>>>>> @@ -544,10 +557,11 @@ static int fpga_mgr_firmware_load(struct
>>>> fpga_manager *mgr,
>>>>> dev_info(dev, "writing %s to %s\n", image_name, mgr->name);
>>>>>
>>>>> mgr->state = FPGA_MGR_STATE_FIRMWARE_REQ;
>>>>> -
>>>>> + mgr->err = 0;
>>>>> ret = request_firmware(&fw, image_name, dev);
>>>>> if (ret) {
>>>>> mgr->state = FPGA_MGR_STATE_FIRMWARE_REQ_ERR;
>>>>> + mgr->err = ret;
>>>>> dev_err(dev, "Error requesting firmware %s\n",
>>>> image_name);
>>>>> return ret;
>>>>> }
>>>>> @@ -626,6 +640,10 @@ static ssize_t state_show(struct device *dev, {
>>>>> struct fpga_manager *mgr = to_fpga_manager(dev);
>>>>>
>>>>> + if (mgr->err)
>>>>> + return sprintf(buf, "%s: 0x%x\n",
>>>>> + state_str[mgr->state], mgr->err);
>>>>> +
>>>>> return sprintf(buf, "%s\n", state_str[mgr->state]);
>>>>
>>>>
>>>> If one of the fpga manager ops fails, the low-level error code is
>>>> already returned to the caller. Wouldn't it be better to rely on this
>>>> instead of printing the low-level error code in a sysfs attribute and sending
>> it to the userspace?
>>>>
>>> Agree, the low-level error code is already returned to the caller but
>>> the user application will not have any access to read this error info.
>>> So, I feel this patch provides that flexibility to the user application to get the
>> exact error info.
>>> please let me know if you have any other thoughts will implement that.
>>>
>>> Regards,
>>> Navakishore.
>>
>>
>> Hi Nava,
>>
>> Thanks for your quick reply. I understand the need to access the low-level
>> error code from userspace if the configuration goes wrong.
>>
>> However, in my understanding, the low-level driver is supposed to export
>> reconfiguration errors by implementing the status op and returning a bit field
>> set using the macros defined in fpga-mgr.h +189.
>> The fpga manager will, in turn, make the errors visible to userspace through
>> the status attribute. If the available error bits aren't descriptive enough,
>> wouldn't it be better to add more error macros instead of "overloading" the
>> state attribute?
>>
>> Moreover, it seems to me that if the reconfiguration is done by loading a
>> device tree overlay from userspace, the error code gets propagated back
>> through the notifier in of-fpga-region. Am I correct?
>>
>
> AFAIK The state and status interface use cases are different. The Status interface will provide the H/W error info.
> whereas the state interface provides the FPGA manager driver state(including Error strings). 
> Please Refer: Documentation/ABI/testing/sysfs-class-fpga-manager (for Error strings information).
>

I didn't say that state and status interfaces have the same use case.

Instead, I suggested that the state interface shouldn't be used to
report low-lever errors since this "responsibility" belongs to the
status interface.

The ABI documentation you cited states: "the intent [of the status
interface] is to provide more detailed information for FPGA
programming errors to userspace".

> With the existing implementation using DT-Overlay the Configuration/Reconfiguration lower-level
> driver errors are not propagating to userspace.
>
> Please correct me if my understanding is wrong.

Out of curiosity, I ran a couple of tests on the QEMU arm "virt"
platform using linux-xlnx and linux-socfpga kernels and the fake
FPGA manager from the KUnit RFC. I modified the fake manager to
fail the write op returning a specific error code.

static int op_write(struct fpga_manager ...)
{
[...]

return -202302;
}

The op error code is visible from the message log on both kernels when
the overlay application fails:

[ 23.958015] fpga_manager fpga0: writing fake.bin to Fake FPGA Manager
[ 23.959119] fpga_manager fpga0: Error while writing image data to FPGA
[ 23.959341] fpga_region region0: failed to load FPGA image
[ 23.959472] OF: overlay: overlay changeset pre-apply notifier error -202302, target: /test_region
[ 23.959622] create_overlay: Failed to create overlay (err=-202302)

However, I feel this part of the discussion is more speculative since
the DT overlay configFS interface is not mainline.

> Regards,
> Navakishore.
>

Thanks,
Marco