RE: [PATCH 3/3] Thermal: do thermal zone update after a cooling device registered

From: Chen, Yu C
Date: Wed Oct 14 2015 - 15:21:30 EST


Hi Javi,


> -----Original Message-----
> From: Javi Merino [mailto:javi.merino@xxxxxxx]
> Sent: Thursday, October 15, 2015 1:08 AM
> To: Chen, Yu C
> Cc: linux-pm@xxxxxxxxxxxxxxx; edubezval@xxxxxxxxx; Zhang, Rui;
> linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Pandruvada, Srinivas
> Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> device registered
>
> On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> > Hi, Javi
> > Sorry for my late response,
> >
> > > -----Original Message-----
> > > From: Javi Merino [mailto:javi.merino@xxxxxxx]
> > > Sent: Wednesday, September 30, 2015 12:02 AM
> > > To: Chen, Yu C
> > > Cc: linux-pm@xxxxxxxxxxxxxxx; edubezval@xxxxxxxxx; Zhang, Rui;
> > > linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > cooling device registered
> > >
> > > Hi Yu,
> > >
> > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > > Hi, Javi,
> > > >
> > > > > -----Original Message-----
> > > > > From: Javi Merino [mailto:javi.merino@xxxxxxx]
> > > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > > To: Chen, Yu C
> > > > > > Cc: linux-pm@xxxxxxxxxxxxxxx; edubezval@xxxxxxxxx; Zhang, Rui;
> > > > > > linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > > cooling device registered
> > > > >
> > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > > From: Zhang Rui <rui.zhang@xxxxxxxxx>
> > > > > >
> > > > > >
> > > > >
> > > > > > I think you need to hold cdev->lock here, to make sure that no
> > > > > > thermal zone is added or removed from cdev->thermal_instances
> > > > > > while you are looping.
> > > > >
> > > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > > AB-BA lock with thermal_zone_unbind_cooling_device?
> > >
> > > You're right, it could lead to a deadlock. The locks can't be
> > > swapped because that won't work in step_wise.
> > >
> > > The best way that I can think of accessing thermal_instances
> > > atomically is by making it RCU protected instead of with mutexes.
> > > What do you think?
> > >
> > RCU would need extra spinlocks to protect the list, and need to
> > sync_rcu after we delete one instance from thermal_instance list, I
> > think it is too complicated for me to rewrite :( How about using
> > thermal_list_lock instead of cdev->lock?
> > This guy should be big enough to protect the device.thermal_instance list.
>
> thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but it
> doesn't protect the thermal_instances list. For example,
> thermal_zone_bind_cooling_device() adds a cooling device to the
> cdev->thermal_instances list without taking thermal_tz_list.
>

Before thermal_zone_bind_cooling_device() is invoked,
thermal_list_lock is taken first:

static void bind_cdev(struct thermal_cooling_device *cdev)
{

	mutex_lock(&thermal_list_lock);

	/* either tz->ops->bind() -> thermal_zone_bind_cooling_device() */

	/* or __bind() -> thermal_zone_bind_cooling_device() */
	mutex_unlock(&thermal_list_lock);

}

And it is the same as in passive_store.

So whenever code adds or deletes a thermal_instance of a cdev, it
already holds thermal_list_lock, IMO. Or am I missing anything?
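
To make the argument concrete, here is a minimal userspace sketch of the
idea, not kernel code. It uses pthreads, and every name in it (list_lock,
struct cooling_device, bind_cdev_sketch, run_demo and so on) is a
hypothetical stand-in rather than the real thermal core API. It only shows
that a per-device list stays consistent if every writer and every walker
takes the same outer lock; whether all kernel paths really do that is
exactly the open question above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS   4
#define PER_THREAD 1000

/* Hypothetical stand-in for one entry of cdev->thermal_instances. */
struct instance {
	struct instance *next;
	int tz_id;
};

/* Hypothetical stand-in for a cooling device and its instance list. */
struct cooling_device {
	struct instance *instances;
	size_t count;
};

/* Plays the role of thermal_list_lock in this sketch. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Statically allocated nodes, one row per worker thread. */
static struct instance pool[NTHREADS][PER_THREAD];

struct worker_arg {
	struct cooling_device *cdev;
	int id;
};

/* Same shape as the bind_cdev() outline: outer lock, then bind. */
static void bind_cdev_sketch(struct cooling_device *cdev, struct instance *inst)
{
	pthread_mutex_lock(&list_lock);
	inst->next = cdev->instances;
	cdev->instances = inst;
	cdev->count++;
	pthread_mutex_unlock(&list_lock);
}

/* Walks the list under the same outer lock, so no writer can interleave. */
static size_t count_instances(struct cooling_device *cdev)
{
	size_t n = 0;

	pthread_mutex_lock(&list_lock);
	for (struct instance *i = cdev->instances; i; i = i->next)
		n++;
	pthread_mutex_unlock(&list_lock);
	return n;
}

static void *worker(void *p)
{
	struct worker_arg *arg = p;

	for (int i = 0; i < PER_THREAD; i++) {
		pool[arg->id][i].tz_id = arg->id;
		bind_cdev_sketch(arg->cdev, &pool[arg->id][i]);
	}
	return NULL;
}

/* Binds from several threads at once; call only once, the pool is reused. */
static size_t run_demo(struct cooling_device *cdev)
{
	pthread_t threads[NTHREADS];
	struct worker_arg args[NTHREADS];

	for (int i = 0; i < NTHREADS; i++) {
		args[i].cdev = cdev;
		args[i].id = i;
		int rc = pthread_create(&threads[i], NULL, worker, &args[i]);
		assert(rc == 0);
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(threads[i], NULL);
	return count_instances(cdev);
}
```

Calling run_demo() on a zero-initialized struct cooling_device should
return NTHREADS * PER_THREAD, because no insertion can be lost while the
outer lock is held. If one path modified the list without taking
list_lock, that guarantee would no longer hold.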

Best Regards,
Yu

