Re: [PATCH] mm/hotplug: Remove stop_machine() from try_offline_node()

From: KOSAKI Motohiro
Date: Wed Aug 14 2013 - 21:21:37 EST


(8/12/13 3:34 PM), Toshi Kani wrote:
> lock_device_hotplug() serializes hotplug & online/offline operations.
> The lock is held in common sysfs online/offline interfaces and ACPI
> hotplug code paths.
>
> try_offline_node() off-lines a node if all memory sections and cpus
> are removed on the node. It is called from acpi_processor_remove()
> and acpi_memory_remove_memory()->remove_memory() paths, both of which
> are in the ACPI hotplug code.
>
> try_offline_node() calls stop_machine() to stop all cpus while checking
> all cpu status with the assumption that the caller is not protected from
> CPU hotplug or CPU online/offline operations. However, the caller is
> always serialized with lock_device_hotplug(). Also, the code needs to
> be properly serialized with a lock, not by stopping all cpus at a random
> place with stop_machine().
>
> This patch removes the use of stop_machine() in try_offline_node() and
> adds comments to try_offline_node() and remove_memory() that
> lock_device_hotplug() is required.

This patch needs a more verbose explanation. check_cpu_on_node() traverses
the cpus, and cpu hotplug seems to use cpu_hotplug_driver_lock() instead of
lock_device_hotplug().

That said, the race does not happen against another memory hotplug. It is
likely to happen against another cpu hotplug. So commenting remove_memory()
doesn't make much sense.
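
To make the concern concrete, here is a minimal user-space sketch, not kernel
code. It assumes the node-offline check runs under one lock while the cpu
online/offline path takes a possibly different one. The names toy_lock,
toy_trylock(), toy_unlock() and cpu_path_can_race() are hypothetical and exist
only for this illustration; whether cpu hotplug really relies on
cpu_hotplug_driver_lock() alone is the open question above, not something this
sketch establishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel lock; models only "held" or "not held". */
struct toy_lock {
	bool held;
};

/* Take the lock if it is free; report whether we got it. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Model the window in which check_cpu_on_node() would walk the cpus.
 * The offline path holds 'offline_lock'; we then ask whether a cpu
 * online/offline path that takes 'cpu_path_lock' could enter.
 * Returns true if it could, i.e. the two paths are not serialized.
 */
static bool cpu_path_can_race(struct toy_lock *offline_lock,
			      struct toy_lock *cpu_path_lock)
{
	bool entered;

	if (!toy_trylock(offline_lock))
		return false;	/* model misuse: lock was already held */

	/* ... the cpu traversal would happen here ... */
	entered = toy_trylock(cpu_path_lock);
	if (entered)
		toy_unlock(cpu_path_lock);

	toy_unlock(offline_lock);
	return entered;
}
```

With two distinct toy_lock objects, cpu_path_can_race() reports that the cpu
path can enter during the traversal. Passing the same object for both
arguments reports that it cannot. That is the property the changelog would
need to demonstrate for the real locks.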
