RE: [PATCH] tpm: don't destroy chip device prematurely

From: Winkler, Tomas
Date: Fri Oct 07 2016 - 16:10:54 EST


> Subject: Re: [PATCH] tpm: don't destroy chip device prematurely
>
> On Fri, Oct 07, 2016 at 02:24:59PM +0000, Winkler, Tomas wrote:
>
> > So here I have to say I'm sorry for the misleading report; after all the
> > doubts I went back to debugging and traces. For some reason, moving the
> > device_del really did make the problem go away, but the real problem
> > was an unbalanced runtime_pm PUT/GET in the tpm_crb probe function.
>
> Oh this is very good news, I'm glad this was resolved in crb!
>
> Presumably the unbalanced put made the ref count go negative and the
> balanced get caused it to go to zero, so pm locking was basically totally
> broken? That would explain how an idle callback could run concurrently with
> transmit_cmd.

This is not a locking or refcount race as such, but it is similar. The usage_count went negative, and the idle callback then kicked in from the pm work queue and suspended the device.
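
To illustrate the sequence, here is a toy userspace model, not the kernel code. All toy_* names are made up for this mail, and it assumes the idle path refuses to suspend only while the count is greater than zero. With one stray put, the get taken around transmit only brings the count back to zero, so the pm worker is still free to suspend the device between send and receive.

```c
#include <stdbool.h>

/* Toy stand-in for a device's runtime-PM state; NOT the kernel struct. */
struct toy_dev {
	int usage_count;	/* models dev->power.usage_count */
	bool suspended;
};

/* Hypothetical helpers modelling a runtime-PM get/put pair. */
static void toy_get(struct toy_dev *d)
{
	d->usage_count++;
	d->suspended = false;	/* a get resumes the device */
}

static void toy_put(struct toy_dev *d)
{
	d->usage_count--;	/* nothing stops it from going negative */
}

/*
 * Models the idle work item run from the pm work queue.
 * Assumption: suspend is refused only while usage_count > 0.
 * Returns true if the device was suspended.
 */
static bool toy_idle_work(struct toy_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspended = true;
	return true;
}

/* Returns true if the device was suspended in the middle of a transfer. */
static bool toy_transmit_interrupted(struct toy_dev *d)
{
	bool hit;

	toy_get(d);			/* transmit takes its reference */
	/* ... send command ... */
	hit = toy_idle_work(d);		/* pm worker runs between send and receive */
	/* ... receive response ... */
	toy_put(d);
	return hit;
}
```

Starting from a count of 0, toy_transmit_interrupted() reports no suspend. After one extra toy_put() (the unbalanced put in probe) the same call reports that the device was suspended mid-transfer.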

>
> Though a bit of a mystery why device_del had any impact? I'm still very
> unclear exactly how the child device affects the parent - and that seems like
> pretty important information going forward.

Yes, there is some dependency: if device_del is not called, the idle callback doesn't kick in between send and receive, and that was misleading. I'm not sure, but this could be due to scheduling of the pm worker. In any case we hit the issue even w/o device_del if the device is exercised enough. I will dig into that later.

Thanks
Tomas