RE: [PATCH 2/2] Drivers: hv: balloon: Fix the deadlock issue in the memory hot-add code

From: KY Srinivasan
Date: Sun Dec 14 2014 - 13:45:31 EST




> -----Original Message-----
> From: Michal Hocko [mailto:mhocko@xxxxxxx]
> Sent: Thursday, December 11, 2014 4:58 AM
> To: KY Srinivasan
> Cc: Yasuaki Ishimatsu; gregkh@xxxxxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx;
> apw@xxxxxxxxxxxxx; linux-mm@xxxxxxxxx
> Subject: Re: [PATCH 2/2] Drivers: hv: balloon: Fix the deadlock issue in the
> memory hot-add code
>
> On Thu 11-12-14 00:21:09, KY Srinivasan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Michal Hocko [mailto:mhocko@xxxxxxx]
> > > Sent: Tuesday, December 9, 2014 2:56 AM
> > > To: Yasuaki Ishimatsu
> > > Cc: KY Srinivasan; gregkh@xxxxxxxxxxxxxxxxxxx; linux-
> > > kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx;
> > > olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; linux-mm@xxxxxxxxx
> > > Subject: Re: [PATCH 2/2] Drivers: hv: balloon: Fix the deadlock
> > > issue in the memory hot-add code
> > >
> > > On Tue 09-12-14 19:25:50, Yasuaki Ishimatsu wrote:
> > > > (2014/12/09 18:08), Michal Hocko wrote:
> > > [...]
> > > > >Doesn't udev retry the operation if it gets EBUSY or EAGAIN?
> > > >
> > > > It depend on implementation of udev.rules. So we can retry
> > > > online/offline operation in udev.rules.
> > > [...]
> > >
> > > # Memory hotadd request
> > > SUBSYSTEM=="memory", ACTION=="add",
> > > DEVPATH=="/devices/system/memory/memory*[0-9]",
> > > TEST=="/sys$devpath/state", RUN+="/bin/sh -c 'echo online >
> > > /sys$devpath/state'"
> > >
> > > OK so this is not prepared for a temporary failures and retries.
> > >
> > > > >And again, why cannot we simply make the onlining fail or
> > > > >try_lock and retry internally if the event consumer cannot cope with
> errors?
> > > >
> > > > Did you mean the following Srinivasan's first patch looks good to you?
> > > > https://lkml.org/lkml/2014/12/2/662
> > >
> > > Heh, I was just about to post this. Because I haven't noticed the
> > > previous patch yet. Yeah, Something like that. Except that I would
> > > expect EAGAIN or EBUSY rather than ERESTARTSYS which should never
> > > leak into userspace. And that would happen here AFAICS because
> > > signal_pending will not be true usually.
> > Michal,
> >
> > I agree that the fix to this problem must be outside the clients of
> > add_memory() and that is the reason I had sent that patch:
> > https://lkml.org/lkml/2014/12/2/662. Let me know if you want me to
> > resend this patch with the correct return value.
>
> Please think about the other suggested options as well.

Thanks Michal. I will look at the other options you have listed as well.

K. Y
>
> > Regards,
> >
> > K. Y
> > >
> > > So there are two options. Either make the udev rule more robust and
> > > retry within RUN section or do the retry withing online_pages
> > > (try_lock and go into interruptible sleep which gets signaled by
> > > finished add_memory()). The later option is safer wrt. the userspace
> > > because the operation wouldn't fail unexpectedly.
> > > Another option would be generating the sysfs file after all the
> > > internal initialization is done and call it outside of the memory hotplug
> lock.
> > >
> > > --
> > > Michal Hocko
> > > SUSE Labs
>
> --
> Michal Hocko
> SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/