Re: [PATCH, v4] hwmon: coretemp: use list instead of fixed size array for temp data

From: Guenter Roeck
Date: Wed May 09 2012 - 06:34:33 EST


On Wed, May 09, 2012 at 06:16:34AM -0400, Kirill A. Shutemov wrote:
> On Wed, May 09, 2012 at 02:56:17AM -0700, Guenter Roeck wrote:
> > On Wed, May 09, 2012 at 03:23:39AM -0400, Kirill A. Shutemov wrote:
> > > On Wed, May 09, 2012 at 10:09:06AM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, May 08, 2012 at 09:39:40AM -0700, Guenter Roeck wrote:
> > > > > On Tue, 2012-05-08 at 06:49 -0400, Kirill A. Shutemov wrote:
> > > > > > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > > > >
> > > > > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > > > > limited by hardcoded array size.
> > > > > >
> > > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > > v4:
> > > > > > - address issues pointed by Guenter Roeck;
> > > > > > v3:
> > > > > > - drop redundant refcounting and checks;
> > > > > > v2:
> > > > > > - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > > > > > - use mutex instead of spinlock for list locking.
> > > > > > ---
> > > > >
> > > > > Hi Kirill,
> > > > >
> > > > > unfortunately now we have another race condition :(. See below ...
> > > >
> > > > Ughh..
> > > >
> > > > > > @@ -557,11 +579,22 @@ exit_free:
> > > > > > >  static int __devexit coretemp_remove(struct platform_device *pdev)
> > > > > > >  {
> > > > > > >  	struct platform_data *pdata = platform_get_drvdata(pdev);
> > > > > > > -	int i;
> > > > > > > +	struct temp_data *tdata;
> > > > > > >
> > > > > > > -	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> > > > > > > -		if (pdata->core_data[i])
> > > > > > > -			coretemp_remove_core(pdata, &pdev->dev, i);
> > > > > > > +	for (;;) {
> > > > > > > +		mutex_lock(&pdata->temp_data_lock);
> > > > > > > +		if (!list_empty(&pdata->temp_data_list)) {
> > > > > > > +			tdata = list_first_entry(&pdata->temp_data_list,
> > > > > > > +					struct temp_data, list);
> > > > > > > +			list_del(&tdata->list);
> > > > > > > +		} else
> > > > > > > +			tdata = NULL;
> > > > > > > +		mutex_unlock(&pdata->temp_data_lock);
> > > > > > > +
> > > > > > > +		if (!tdata)
> > > > > > > +			break;
> > > > > > > +		coretemp_remove_core(tdata, &pdev->dev);
> > > > > > > +	}
> > > > > >
> > > > > Unfortunately, that results in a race condition, since the tdata list
> > > > > entry is gone before the attribute file is deleted.
> > > > >
> > > > > I think you can still use list_for_each_entry_safe, only outside the
> > > > > mutex, and remove the list entry at the end of coretemp_remove_core()
> > >
> > > I haven't got how list_for_each_entry_safe() will be really safe without
> > > the lock.
> > >
> > We know that coretemp_remove() itself won't be called multiple times. So the only
> > question is whether the functions to add/remove a core can be called while
> > coretemp_remove() is running, or whether that is mutually exclusive (not that the
> > current code handles this case).
> >
> > Fortunately, there is a function to block CPU removal/insertion: get_online_cpus()
> > and put_online_cpus(). I have no idea if it is necessary to protect coretemp_remove()
> > with it, but it might be on the safe side anyway.
> >
> > > > > after deleting the attribute file. Just keep the code as it was, and
> > > > > remove the list entry (mutex-protected) where core_data[] was set to
> > > > > NULL.
> > > >
> > > > I think
> > > >
> > > > 	if (!tdata)
> > > > 		return -ENODEV;
> > > >
> > > > in show methods will fix the issue. Right?
> > >
> > > It won't. Stupid me.
> > >
> > > But the check + kref seems will work...
> > >
> > Yes, but would be way too complicated.
>
> More code, yes, but complicated? What you propose looks like a trick. It
> has too many assumptions on context.
>

There is an even better solution: unregistering the hotplug notifier
before removing the driver. And, as you will notice, that is already done.
So list_for_each_entry_safe() is safe after all, since no other remove/add
activity will occur at the same time.
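
To illustrate the point, here is a minimal userspace sketch, not the driver
code. It assumes, as stated above, that nothing else adds or removes entries
while the teardown loop runs, so no lock is taken. The list helpers, the
cut-down struct temp_data and struct platform_data, and the attrs_removed
counter are hypothetical stand-ins for the kernel's list API, the driver's
structures, and the attribute file removal. The loop saves the next pointer
before the current entry goes away, which is the property
list_for_each_entry_safe() provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, cut-down versions of the driver's structures. */
struct temp_data {
	int id;
	struct list_head list;
};

struct platform_data {
	struct list_head temp_data_list;
};

/* Counts "attribute files" removed; stands in for device_remove_file(). */
int attrs_removed;

/* Recover the containing struct from its embedded list node. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void platform_data_init(struct platform_data *pdata)
{
	pdata->temp_data_list.next = &pdata->temp_data_list;
	pdata->temp_data_list.prev = &pdata->temp_data_list;
}

int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Allocate an entry and append it to the tail of the list. */
struct temp_data *add_core(struct platform_data *pdata, int id)
{
	struct list_head *head = &pdata->temp_data_list;
	struct temp_data *tdata = malloc(sizeof(*tdata));

	if (!tdata)
		return NULL;
	tdata->id = id;
	tdata->list.prev = head->prev;
	tdata->list.next = head;
	head->prev->next = &tdata->list;
	head->prev = &tdata->list;
	return tdata;
}

/*
 * Remove the "attribute file" first, and only then unlink and free the
 * entry, so the entry is never gone while its attribute still exists.
 */
void remove_core(struct temp_data *tdata)
{
	assert(tdata != NULL);
	attrs_removed++;
	tdata->list.prev->next = tdata->list.next;
	tdata->list.next->prev = tdata->list.prev;
	free(tdata);
}

/*
 * Teardown loop in the style of list_for_each_entry_safe(): 'n' is read
 * before remove_core() frees 'pos'. No lock is held, which is only valid
 * under the assumption that no concurrent add/remove can happen.
 */
int remove_all(struct platform_data *pdata)
{
	struct list_head *head = &pdata->temp_data_list;
	struct list_head *pos, *n;
	int removed = 0;

	for (pos = head->next, n = pos->next; pos != head;
	     pos = n, n = pos->next) {
		remove_core(entry_of(pos, struct temp_data, list));
		removed++;
	}
	return removed;
}
```

A caller would run platform_data_init(), add a few entries with add_core(),
and then call remove_all() once at teardown. If another thread could touch
the list during that call, this loop would not be safe and the locking
question from earlier in the thread would come back.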

> I personally prefer kref since it's straightforward and more friendly for
> future changes.
>
Guess we have to agree to disagree on that one.

Thanks,
Guenter