Re: [PATCH 1/2][Untested] ACPI / hotplug: Add demand_offline hotplug profile flag

From: Rafael J. Wysocki
Date: Fri Dec 27 2013 - 06:39:18 EST


On Friday, December 27, 2013 02:34:52 PM Yasuaki Ishimatsu wrote:
> (2013/12/27 14:18), Yasuaki Ishimatsu wrote:
> > (2013/12/27 9:58), Rafael J. Wysocki wrote:
> >> On Thursday, December 26, 2013 01:10:30 PM Yasuaki Ishimatsu wrote:
> >>> (2013/12/26 12:10), Yasuaki Ishimatsu wrote:
> >>>> (2013/12/23 23:00), Rafael J. Wysocki wrote:
> >>>>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>>>>
> >>>>> Add a new ACPI hotplug profile flag, demand_offline, such that if
> >>>>> set for the given ACPI device object's scan handler, it will cause
> >>>>> acpi_scan_hot_remove() to check if that device object's physical
> >>>>> companions are offline upfront and fail the hot removal if that
> >>>>> is not the case.
> >>>>>
> >>>>> That flag will be useful to overcome a problem with containers on
> >>>>> some systems where they can only be hot-removed after some cleanup
> >>>>> operations carried out by user space, which needs to be notified
> >>>>> of the container hot-removal before the kernel attempts to offline
> >>>>> devices in the container. In those cases the current implementation
> >>>>> of acpi_scan_hot_remove() is not sufficient, because it first tries
> >>>>> to offline the devices in the container and only if that is
> >>>>> successful does it try to offline the container itself. As a result,
> >>>>> the container hot-removal notification is not delivered to user space
> >>>>> at the right time.
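For context, the way such a flag would gate removal can be sketched roughly as below. acpi_scan_is_offline() is the helper added by this patch; the field name handler->hotplug.demand_offline and the surrounding shape of acpi_scan_hot_remove() are shown only as an illustrative assumption, not as the exact patch code:

static int acpi_scan_hot_remove(struct acpi_device *device)
{
        /*
         * Sketch only: when the scan handler demands that devices be
         * offlined by user space first, verify that all physical
         * companions are already offline and fail early instead of
         * trying to offline them from the kernel.
         */
        if (device->handler && device->handler->hotplug.demand_offline
            && !acpi_scan_is_offline(device))
                return -EBUSY;

        /* ... otherwise fall back to the existing offline-then-eject flow ... */
        return 0;
}
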
> >>>>>
> >>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>>>> ---
> >>>>> drivers/acpi/scan.c | 41 +++++++++++++++++++++++++++++++++++++----
> >>>>> include/acpi/acpi_bus.h | 3 ++-
> >>>>> 2 files changed, 39 insertions(+), 5 deletions(-)
> >>>>>
> >>>>> Index: linux-pm/drivers/acpi/scan.c
> >>>>> ===================================================================
> >>>>> --- linux-pm.orig/drivers/acpi/scan.c
> >>>>> +++ linux-pm/drivers/acpi/scan.c
> >>>>> @@ -126,6 +126,24 @@ acpi_device_modalias_show(struct device
> >>>>> }
> >>>>> static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
> >>>>>
> >>>>> +static bool acpi_scan_is_offline(struct acpi_device *adev)
> >>>>> +{
> >>>>> +        struct acpi_device_physical_node *pn;
> >>>>> +        bool offline = true;
> >>>>> +
> >>>>> +        mutex_lock(&adev->physical_node_lock);
> >>>>> +
> >>>>> +        list_for_each_entry(pn, &adev->physical_node_list, node)
> >>>>
> >>>>> +                if (!pn->dev->offline) {
> >>>>
> >>>
> >>>> Please check pn->dev->bus and pn->dev->bus->offline too, as follows:
> >>>>
> >>>>         if (pn->dev->bus && pn->dev->bus->offline &&
> >>>>             !pn->dev->offline) {
> >>>>
> >>>
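With that additional check folded in, the helper would presumably end up looking something like the sketch below. The loop body after the check is not visible in the quote above, so it is reconstructed here under the assumption that it simply records the failure and stops iterating:

static bool acpi_scan_is_offline(struct acpi_device *adev)
{
        struct acpi_device_physical_node *pn;
        bool offline = true;

        mutex_lock(&adev->physical_node_lock);

        list_for_each_entry(pn, &adev->physical_node_list, node)
                /* Only buses that implement offline can block hot removal. */
                if (pn->dev->bus && pn->dev->bus->offline &&
                    !pn->dev->offline) {
                        offline = false;
                        break;
                }

        mutex_unlock(&adev->physical_node_lock);
        return offline;
}
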
> >>> After adding the above check, I could remove the container device by using the
> >>> eject sysfs file, but the following messages were shown:
> >>
> >> Well, it looks like I have overlooked that error during my testing.
> >>
> >>> [ 1017.543000] ------------[ cut here ]------------
> >>> [ 1017.543000] WARNING: CPU: 0 PID: 6 at drivers/base/core.c:251 device_release+0x92/0xa0()
> >>> [ 1017.543000] Device 'ACPI0004:01' does not have a release() function, it is broken and must be fixed.
> >>> [ 1017.653000] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipmi_devintf ipt_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> >>> ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter sg ip_tables vfat fat x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support kvm_intel
> >>> kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd dm_mirror dm_region_hash dm_log dm_mod microcode lpc_ich igb sb_edac e1000e pcspkr i2c_i801
> >>> [ 1017.653000] edac_core mfd_core dca ptp pps_core shpchp ipmi_si ipmi_msghandler tpm_infineon nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod mgag200 syscopyarea sysfillrect sysimgblt lpfc i2c_algo_bit drm_kms_helper ttm drm crc_t10dif crct10dif_common scsi_transport_fc megaraid_sas
> >>> i2c_core scsi_tgt
> >>> [ 1017.653000] CPU: 0 PID: 6 Comm: kworker/u512:0 Tainted: G W 3.13.0-rc5+ #5
> >>> [ 1017.653000] Hardware name:
> >>> [ 1017.653000] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> >>> [ 1017.653000] 0000000000000009 ffff880873a6dc68 ffffffff815d85ca ffff880873a6dcb0
> >>> [ 1017.653000] ffff880873a6dca0 ffffffff8106594d ffff8a07d221c010 ffff8a07d221c000
> >>> [ 1017.653000] ffff8808715472c0 ffff880871e91018 0000000000000103 ffff880873a6dd00
> >>> [ 1017.653000] Call Trace:
> >>> [ 1017.653000] [<ffffffff815d85ca>] dump_stack+0x45/0x56
> >>> [ 1017.653000] [<ffffffff8106594d>] warn_slowpath_common+0x7d/0xa0
> >>> [ 1017.653000] [<ffffffff810659bc>] warn_slowpath_fmt+0x4c/0x50
> >>> [ 1017.653000] [<ffffffff813ad892>] device_release+0x92/0xa0
> >>> [ 1017.653000] [<ffffffff812b7197>] kobject_cleanup+0x77/0x1b0
> >>> [ 1017.653000] [<ffffffff812b7045>] kobject_put+0x35/0x70
> >>> [ 1017.653000] [<ffffffff813ae38c>] device_unregister+0x2c/0x60
> >>> [ 1017.653000] [<ffffffff8134c87c>] container_device_detach+0x28/0x2a
> >>> [ 1017.653000] [<ffffffff81323096>] acpi_bus_trim+0x56/0x89
> >>> [ 1017.653000] [<ffffffff813246ae>] acpi_device_hotplug+0x168/0x383
> >>> [ 1017.653000] [<ffffffff8131efba>] acpi_hotplug_work_fn+0x1c/0x27
> >>> [ 1017.653000] [<ffffffff81080f1b>] process_one_work+0x17b/0x460
> >>> [ 1017.653000] [<ffffffff81081ccb>] worker_thread+0x11b/0x400
> >>> [ 1017.653000] [<ffffffff81081bb0>] ? rescuer_thread+0x3e0/0x3e0
> >>> [ 1017.653000] [<ffffffff81088a12>] kthread+0xd2/0xf0
> >>> [ 1017.653000] [<ffffffff81088940>] ? kthread_create_on_node+0x180/0x180
> >>> [ 1017.653000] [<ffffffff815e82fc>] ret_from_fork+0x7c/0xb0
> >>> [ 1017.653000] [<ffffffff81088940>] ? kthread_create_on_node+0x180/0x180
> >>> [ 1017.653000] ---[ end trace 41394323eb4b690a ]---
> >>
> >> Below is an updated version of patch [2/2] that should fix this problem (and
> >> the other one with the PCI host bridge not supporting offline too).
> >
> > With the updated patch, I can offline the container device and the above messages
> > disappeared.
> >
> > BTW, when I hot remove a PCI root bridge, the following messages are shown:
> >
> > ...
> > [ 116.758000] ------------[ cut here ]------------
> > [ 116.758000] WARNING: CPU: 0 PID: 6 at fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0()
> > [ 116.758000] sysfs group ffffffff819ac5c0 not found for kobject '0001:ff:10.2'
> > [ 116.758000] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipmi_devintf ipt_REJECT xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> > ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables sg vfat fat x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul
> > crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd dm_mirror dm_region_hash e1000e dm_log dm_mod iTCO_wdt iTCO_vendor_support microcode sb_edac igb pcspkr edac_core i2c_i801
> > [ 116.758000] ptp lpc_ich pps_core dca mfd_core shpchp ipmi_si tpm_infineon ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper lpfc ttm drm crc_t10dif crct10dif_common scsi_transport_fc megaraid_sas
> > i2c_core scsi_tgt
> > [ 116.758000] CPU: 0 PID: 6 Comm: kworker/u512:0 Tainted: G W 3.13.0-rc5+ #11
> > [ 116.758000] Hardware name:
> > [ 116.758000] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> > [ 116.758000] 0000000000000009 ffff8808738d3bd8 ffffffff815d84ea ffff8808738d3c20
> > [ 116.758000] ffff8808738d3c10 ffffffff8106594d 0000000000000000 ffffffff819ac5c0
> > [ 116.758000] ffff880871b9d0a8 ffff8a07d1895000 0000000000000103 ffff8808738d3c70
> > [ 116.758000] Call Trace:
> > [ 116.758000] [<ffffffff815d84ea>] dump_stack+0x45/0x56
> > [ 116.758000] [<ffffffff8106594d>] warn_slowpath_common+0x7d/0xa0
> > [ 116.758000] [<ffffffff810659bc>] warn_slowpath_fmt+0x4c/0x50
> > [ 116.758000] [<ffffffff8122b52e>] ? sysfs_get_dirent_ns+0x4e/0x70
> > [ 116.758000] [<ffffffff8122c806>] sysfs_remove_group+0xc6/0xd0
> > [ 116.758000] [<ffffffff813b83f3>] dpm_sysfs_remove+0x43/0x50
> > [ 116.758000] [<ffffffff813ae105>] device_del+0x45/0x1c0
> > [ 116.758000] [<ffffffff812e51f6>] pci_remove_bus_device+0x66/0xd0
> > [ 116.758000] [<ffffffff812e5363>] pci_remove_root_bus+0x73/0x80
> > [ 116.758000] [<ffffffff813276ab>] acpi_pci_root_remove+0x42/0x4f
> > [ 116.758000] [<ffffffff81323070>] acpi_bus_trim+0x56/0x89
> > [ 116.758000] [<ffffffff81323052>] acpi_bus_trim+0x38/0x89
> > [ 116.758000] [<ffffffff813245df>] acpi_device_hotplug+0x137/0x33b
> > [ 116.758000] [<ffffffff8131efba>] acpi_hotplug_work_fn+0x1c/0x27
> > [ 116.758000] [<ffffffff81080f1b>] process_one_work+0x17b/0x460
> > [ 116.758000] [<ffffffff81081ccb>] worker_thread+0x11b/0x400
> > [ 116.758000] [<ffffffff81081bb0>] ? rescuer_thread+0x3e0/0x3e0
> > [ 116.758000] [<ffffffff81088a12>] kthread+0xd2/0xf0
> > [ 116.758000] [<ffffffff81088940>] ? kthread_create_on_node+0x180/0x180
> > [ 116.758000] [<ffffffff815e823c>] ret_from_fork+0x7c/0xb0
> > [ 116.758000] [<ffffffff81088940>] ? kthread_create_on_node+0x180/0x180
> > [ 116.758000] ---[ end trace b403db9d0ec9fc9e ]---
> > ...
> >
> > Is this a known issue? The messages are shown when we use the latest bleeding-edge kernel.
> >
> > Thanks,
> > Yasuaki Ishimatsu
>
>
> I'll be on New Year vacation starting tomorrow, so I'll review and test
> the updated patch from 6 Jan.

Thanks a lot for the work already done and have a great vacation!

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/