Re: [PATCH 3/3] ipmi:bt-bmc: Fix error handling and status check

From: Corey Minyard
Date: Fri Apr 17 2020 - 22:15:12 EST


On Wed, Apr 15, 2020 at 10:14:06AM +0800, Tang Bin wrote:
> Hi Corey:
>
> On 2020/4/15 4:18, Corey Minyard wrote:
> > On Tue, Apr 14, 2020 at 10:14:24PM +0800, Tang Bin wrote:
> > > If the function platform_get_irq() failed, the negative
> > > value returned will not be detected here. So fix error
> > > handling in bt_bmc_config_irq(). And if devm_request_irq()
> > > failed, 'bt_bmc->irq' is assigned to zero maybe redundant,
> > > it may be more suitable for using the correct negative values
> > > to make the status check in the function bt_bmc_remove().
> > Comments inline..
> >
> > > Signed-off-by: Tang Bin <tangbin@xxxxxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Shengju Zhang <zhangshengju@xxxxxxxxxxxxxxxxxxxx>
> > > ---
> > > drivers/char/ipmi/bt-bmc.c | 12 +++++-------
> > > 1 file changed, 5 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
> > > index 1d4bf5c65..1740c6dc8 100644
> > > --- a/drivers/char/ipmi/bt-bmc.c
> > > +++ b/drivers/char/ipmi/bt-bmc.c
> > > @@ -399,16 +399,14 @@ static int bt_bmc_config_irq(struct bt_bmc *bt_bmc,
> > > struct device *dev = &pdev->dev;
> > > int rc;
> > > - bt_bmc->irq = platform_get_irq(pdev, 0);
> > > - if (!bt_bmc->irq)
> > > - return -ENODEV;
> > > + bt_bmc->irq = platform_get_irq_optional(pdev, 0);
> > > + if (bt_bmc->irq < 0)
> > > + return bt_bmc->irq;
> For us, this part of modification have reached a consensus.
> > > rc = devm_request_irq(dev, bt_bmc->irq, bt_bmc_irq, IRQF_SHARED,
> > > DEVICE_NAME, bt_bmc);
> > > - if (rc < 0) {
> > > - bt_bmc->irq = 0;
> > > + if (rc < 0)
> > > return rc;
> > I don't think this part is correct. You will want to set bt_bmc->irq to
> > rc here to match what is done elsewhere so it's the error if negative.
>
> Nonono, I don't want to set bt_bmc->irq to rc, I think they are irrelevant.
>
> The logic of the previous code will continue to execute even if
> platform_get_irq() failed,which will be brought devm_request_irq() failed
> too. "bt_bmc->irq = 0" here is just for bt_bmc_remove() to execute
> del_timer_sync(). Otherwise the function del_timer_sync() will not execute
> if not set "bt_bmc->irq" to zero, because it's negative actually.

Sorry for the delay, I have had a lot of distractions.

The trouble is that the handling of bt_bmc->irq needs to be consistent.
Either it needs to be negative if the irq allocation fails, or it needs
to be zero if the irq allocation fails. I think it needs to be negative
because zero is a valid interrupt in some cases.

Consider the following code:

bt_bmc_config_irq(bt_bmc, pdev);

if (bt_bmc->irq) {
dev_info(dev, "Using IRQ %d\n", bt_bmc->irq);
} else {
dev_info(dev, "No IRQ; using timer\n");
timer_setup(&bt_bmc->poll_timer, poll_timer, 0);

If bt_bmc->irq is negative (if platform_get_irq_optional() fails), it
will say it's using the irq and won't start a timer and the driver won't
work. Then later (in your change below) it will try to stop the timer
even though it's not running.

If devm_request_irq() fails, then the interrupt is not set, but since
bt_bmc->irq is most likely not zero, it will not start the timer and the
driver won't work.

You really need to set bt_bmc->irq negative if it fails. And fix the
check above to be if (bt_bmc->irq >= 0).

-corey

>
>
> >
> > Also, I believe this function should no longer return an error. It
> > should just set the irq to the error if one happens. The driver needs
> > to continue to operate even if it can't get its interrupt.
> >
> > The rest of the changes are correct, I believe.
> >
> >
> > > - }
> > > /*
> > > * Configure IRQs on the bmc clearing the H2B and HBUSY bits;
> > > @@ -499,7 +497,7 @@ static int bt_bmc_remove(struct platform_device *pdev)
> > > struct bt_bmc *bt_bmc = dev_get_drvdata(&pdev->dev);
> > > misc_deregister(&bt_bmc->miscdev);
> > > - if (!bt_bmc->irq)
> > > + if (bt_bmc->irq < 0)
> > > del_timer_sync(&bt_bmc->poll_timer);
> > > return 0;
> > > }
>
> But now, the logic is: if the platform_get_irq_optional() failed, it returns
> immediately, the irq at this point is negative,the bt_bmc_probe() continue
> to operate. But in the function bt_bmc_remove(), we need status check in
> order to execute del_timer_sync(), so change "!bt_bmc->irq" to "bt_bmc->irq
> < 0".
>
> So, when the judgment of "bt_bmc->irq" in the function bt_bmc_remove() goes
> back to the original negative value, the "bt_bmc->irq = 0" in the line 410
> become redundant. That's why I remove it.
>
>
>
> I am very glad to communicate and discuss with you these days.
>
> Thanks,
>
> Tang Bin
>
>
> > >
> > >
> > >
>
>