Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks

From: Rafael J. Wysocki
Date: Thu Feb 14 2013 - 15:11:12 EST


On Thursday, February 14, 2013 12:05:43 PM Yinghai Lu wrote:
> On Wed, Feb 13, 2013 at 5:16 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > This changeset is aimed at fixing a few different but related
> > problems in the ACPI hotplug infrastructure.
> >
> > First of all, since notify handlers may be run in parallel with
> > acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> > and some of them are installed for ACPI handles that have no struct
> > acpi_device objects attached (i.e. before those objects are created),
> > those notify handlers have to take acpi_scan_lock to prevent races
> > from taking place (e.g. a struct acpi_device is found to be present
> > for the given ACPI handle, but right after that it is removed by
> > acpi_bus_trim() running in parallel to the given notify handler).
> > Moreover, since some of them call acpi_bus_scan() and
> > acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> > should be acquired by the callers of these two funtions rather by
> > these functions themselves.
> >
> > For these reasons, make all notify handlers that can handle device
> > addition and eject events take acpi_scan_lock and remove the
> > acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> > Accordingly, update all of their users to make sure that they
> > are always called under acpi_scan_lock.
> >
> > Furthermore, since eject operations are carried out asynchronously
> > with respect to the notify events that trigger them, with the help
> > of acpi_bus_hot_remove_device(), even if notify handlers take the
> > ACPI scan lock, it still is possible that, for example,
> > acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> > the notify handler that scheduled its execution and that
> > acpi_bus_trim() will remove the device node passed to
> > acpi_bus_hot_remove_device() for ejection. In that case, the struct
> > acpi_device object obtained by acpi_bus_hot_remove_device() will be
> > invalid and not-so-funny things will ensue. To protect agaist that,
> > make the users of acpi_bus_hot_remove_device() run get_device() on
> > ACPI device node objects that are about to be passed to it and make
> > acpi_bus_hot_remove_device() run put_device() on them and check if
> > their ACPI handles are not NULL (make acpi_device_unregister() clear
> > the device nodes' ACPI handles for that check to work).
> >
> > Finally, observe that acpi_os_hotplug_execute() actually can fail,
> > in which case its caller ought to free memory allocated for the
> > context object to prevent leaks from happening. It also needs to
> > run put_device() on the device node that it ran get_device() on
> > previously in that case. Modify the code accordingly.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > Acked-by: Yinghai Lu <yinghai@xxxxxxxxxx>
> > ---
> >
> > This includes fixes for two issues spotted by Yasuaki Ishimatsu.
> >
>
> this one will make pci/next and pm/linux-next conflicts
>
> Please check if attached fix is right.

Looks correct to me.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/