RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure

From: Dexuan Cui
Date: Thu Nov 27 2014 - 06:44:58 EST


> -----Original Message-----
> From: Jason Wang [mailto:jasowang@xxxxxxxxxx]
> Sent: Thursday, November 27, 2014 17:01 PM
> To: Dexuan Cui
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev-
> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; KY
> Srinivasan; vkuznets@xxxxxxxxxx; Haiyang Zhang
> Subject: RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer
> failure
> On Thu, Nov 27, 2014 at 4:50 PM, Dexuan Cui <decui@xxxxxxxxxxxxx> wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:jasowang@xxxxxxxxxx]
> >> Sent: Thursday, November 27, 2014 15:15 PM
> >> To: Dexuan Cui
> >> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> driverdev-
> >> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; KY
> >> Srinivasan; vkuznets@xxxxxxxxxx; Haiyang Zhang
> >> Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on
> >> transfer
> >> failure
> >> ----- Original Message -----
> >> > In the case the user-space daemon crashes, hangs or is killed, we
> >> > need to down the semaphore, otherwise, after the daemon starts
> >> next
> >> > time, the obsolete data in fcopy_transaction.message or
> >> > fcopy_transaction.fcopy_msg will be used immediately.
> >> >
> >> > Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> >> > Cc: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> >> > Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> >> > ---
> >> >
> >> > v2: I removed the "FCP" prefix as Greg asked.
> >> >
> >> > I also updated the output message a little:
> >> > "FCP: failed to acquire the semaphore" -->
> >> > "can not acquire the semaphore: it is benign"
> >> >
> >> > drivers/hv/hv_fcopy.c | 9 +++++++++
> >> > 1 file changed, 9 insertions(+)
> >> >
> >> > diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> >> > index 23b2ce2..c518ad9 100644
> >> > --- a/drivers/hv/hv_fcopy.c
> >> > +++ b/drivers/hv/hv_fcopy.c
> >> > @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct
> >> *dummy)
> >> > * process the pending transaction.
> >> > */
> >> > fcopy_respond_to_host(HV_E_FAIL);
> >> > +
> >> > + /* In the case the user-space daemon crashes, hangs or is
> >> killed, we
> >> > + * need to down the semaphore, otherwise, after the daemon
> >> starts
> >> next
> >> > + * time, the obsolete data in fcopy_transaction.message or
> >> > + * fcopy_transaction.fcopy_msg will be used immediately.
> >> > + */
> >>
> >> Looks still racy, what happens if the daemon start before
> >> down_trylock()
> >> but after fcopy_respont_to_host() here?
> > Jason,
> > Thanks for pointing this out!
> > IMO we can resolve this by adding down_trylock() in fcopy_release().
> > What's your opinion?
>
>
> Looks better and need to cancel the timeout also here?
OK, let me post a v3.

> >
> >
> >>
> >> > + if (down_trylock(&fcopy_transaction.read_sema))
> >> > + pr_debug("can not acquire the semaphore: it is benign\n");
> >>
> >> typo
> >> > +
> >> > }
> > Sorry -- what typo do you mean?
>
> s/benign/begin/ ?
I meant the issue(can't get the semaphore) is benign.

I think we can just remove the message, as KY suggested.
Instead, I'll add a comment for it.

Thanks,
-- Dexuan

N‹§²æ¸›yú²X¬¶ÇvØ–)Þ{.nlj·¥Š{±‘êX§¶›¡Ü}©ž²ÆzÚj:+v‰¨¾«‘êZ+€Êzf£¢·hšˆ§~†­†Ûÿû®w¥¢¸?™¨è&¢)ßf”ùy§m…á«a¶Úÿ 0¶ìå