Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks

From: Toshi Kani
Date: Thu Feb 14 2013 - 19:39:10 EST


On Fri, 2013-02-15 at 01:23 +0100, Rafael J. Wysocki wrote:
> On Thursday, February 14, 2013 11:45:27 PM Moore, Robert wrote:
> >
> > > -----Original Message-----
> > > From: Rafael J. Wysocki [mailto:rjw@xxxxxxx]
> > > Sent: Thursday, February 14, 2013 12:59 PM
> > > To: Moore, Robert
> > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang Liu;
> > > Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe; linux-pci@xxxxxxxxxxxxxxx
> > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and
> > > memory leaks
> > >
> > > On Thursday, February 14, 2013 08:45:14 PM Moore, Robert wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Rafael J. Wysocki [mailto:rjw@xxxxxxx]
> > > > > Sent: Thursday, February 14, 2013 4:04 AM
> > > > > To: Moore, Robert
> > > > > Cc: Toshi Kani; ACPI Devel Maling List; LKML; Bjorn Helgaas; Jiang
> > > > > Liu; Yinghai Lu; Yasuaki Ishimatsu; Myron Stowe;
> > > > > linux-pci@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues
> > > > > and memory leaks
> > > > >
> > > > > On Thursday, February 14, 2013 02:31:22 AM Moore, Robert wrote:
> > > > > > > > > I thought about that, but actually there's no guarantee that
> > > > > > > > > the handle will be valid after _EJ0 as far as I can say. So
> > > > > > > > > the race condition is going to be there anyway and using
> > > > > > > > > struct acpi_device just makes it easier to avoid it.
> > > > > > > >
> > > > > > > > In theory, yes, a stale handle could be a problem, if _EJ0
> > > > > > > > performs unload table and if ACPICA frees up its internal data
> > > > > > > > structure pointed by the handle as a result. But we should
> > > > > > > > not see such issue now since we do not support dynamic ACPI
> > > > > > > > > namespace yet.
> > > > > > >
> > > > > > > I'm waiting for information from Bob about that. If we can
> > > > > > > assume ACPI handles to be always valid, that will simplify
> > > > > > > > things quite a bit.
> > > > > >
> > > > > > If a table is unloaded, all the namespace nodes for that table are
> > > > > > removed from the namespace, and thus any ACPI_HANDLE pointers go
> > > > > > > stale and invalid.
> > > > >
> > > > > OK, thanks!
> > > > >
> > > > > To me this means that we cannot assume a handle to stay valid
> > > > > between a notify handler and acpi_bus_hot_remove_device() run from a
> > > > > > workqueue.
> > > > >
> > > > > Is there a mechanism in ACPICA to ensure that a handle won't become
> > > > > stale while a notify handler is running for it or is the OS
> > > > > responsible for ensuring that
> > > > > _EJ0 won't be run in parallel with notify handlers for device
> > > > > objects being ejected?
> > > > >
> > > >
> > > > It is up to the host.
> > >
> > > I was afraid that that might be the case. :-)
> > >
> > > So far the (Linux) host has been happily ignoring that potential problem,
> > > so I guess it can still be ignored for a while, although we'll need to
> > > address it eventually at one point.
> >
> > I would think it should be fairly simple to setup a mechanism to either tell
> > the driver or for the driver to figure it out -- such that the driver knows
> > that all handles associated with the device are now invalid. Another way
> > to look at it is that when the device is re-installed, the driver should
> > reinitialize such that it obtains new handles for the devices and subobjects
> > in question.
>
> Unfortunately, there is quite strong assumption in our code that ACPI handles
> will not become stale before the device objects associated with them are
> removed. For this reason, we need to know in advance which handles will
> become stale as a result of a table unload and remove their device objects
> beforehand.
>
> Moreover, when there's a notify handler installed for a given ACPI handle
> and that handle becomes stale while the notify handler is running, we'll be
> in trouble. To avoid that we need to ensure that table unloads and notifies
> will always be mutually exclusive.

I wonder if we can make acpi_ns_validate_handle() actually verify whether
a given handle is still valid. That way, ACPICA could fail gracefully
(AE_BAD_PARAMETER) when a stale handle is passed to its interfaces.
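
To make the idea concrete, here is a rough userspace sketch, not ACPICA
code. Everything in it is hypothetical: the sk_* names, the fixed-size
table, and the choice of an index plus a generation counter as the handle.
ACPICA handles are opaque pointers today, so this only illustrates one way
a validation routine could tell a live handle from one whose node has been
removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes standing in for AE_OK / AE_BAD_PARAMETER. */
typedef enum { SK_OK = 0, SK_BAD_PARAMETER = 1, SK_NO_SPACE = 2 } sk_status;

#define SK_MAX_NODES 8

/* One slot of a toy namespace. The generation is bumped on every removal,
 * so handles issued before the removal stop matching. */
struct sk_node {
    uint32_t generation;
    int in_use;
};

struct sk_namespace {
    struct sk_node nodes[SK_MAX_NODES];
};

/* Hypothetical handle: slot index plus the generation seen at creation. */
typedef struct {
    uint32_t index;
    uint32_t generation;
} sk_handle;

/* Allocate the first free slot and hand out a handle for it. */
sk_status sk_node_create(struct sk_namespace *ns, sk_handle *out)
{
    assert(ns != NULL && out != NULL);
    for (uint32_t i = 0; i < SK_MAX_NODES; i++) {
        if (!ns->nodes[i].in_use) {
            ns->nodes[i].in_use = 1;
            out->index = i;
            out->generation = ns->nodes[i].generation;
            return SK_OK;
        }
    }
    return SK_NO_SPACE;
}

/* Reject handles that are out of range, refer to a free slot, or were
 * issued for an earlier occupant of the slot. */
sk_status sk_validate_handle(const struct sk_namespace *ns, sk_handle h)
{
    if (ns == NULL || h.index >= SK_MAX_NODES)
        return SK_BAD_PARAMETER;
    const struct sk_node *n = &ns->nodes[h.index];
    if (!n->in_use || n->generation != h.generation)
        return SK_BAD_PARAMETER;
    return SK_OK;
}

/* Remove a node (think: table unload). Existing handles go stale. */
sk_status sk_node_remove(struct sk_namespace *ns, sk_handle h)
{
    sk_status st = sk_validate_handle(ns, h);
    if (st != SK_OK)
        return st;
    ns->nodes[h.index].in_use = 0;
    ns->nodes[h.index].generation++;
    return SK_OK;
}
```

Note that a check like this only catches a stale handle at the time of the
call. It does not by itself keep a handle from going stale while a notify
handler is already running, so the mutual exclusion between table unloads
and notifies mentioned above would still be needed.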

Thanks,
-Toshi


