Re: PM domain change on unbound devices warning on ipmi_si unload

From: Tomas Winkler
Date: Sun Jan 31 2016 - 16:38:28 EST


On Fri, Jan 29, 2016 at 11:45 PM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> On Friday, January 29, 2016 12:56:14 PM Joe Lawrence wrote:
>> On 01/29/2016 12:01 PM, Steven Rostedt wrote:
>> > On Thu, Jan 28, 2016 at 02:13:04PM -0600, Corey Minyard wrote:
>> >> Tomeu, you added that check in
>> >>
>> >> [989561de9b5112999475b406557d9c7e9e59c041] PM / Domains: add setter for
>> >> dev.pm_domain
>> >>
>> >> and either something is wrong in the platform device handling or elsewhere
>> >> in the device code, if
>> >> that check is valid.
>> >>
>> >
>> > FYI, I'm hitting the exact same error on shutdown on one of my boxes.
>> >
>> > Please Cc me on updates.
>> >
>> > -- Steve
>> >
>> >
>> > [53591.087861] kvm: exiting hardware virtualization
>> > [53591.104798] ------------[ cut here ]------------
>> > [53591.110058] WARNING: CPU: 0 PID: 1 at /home/rostedt/work/git/linux-trace.git/drivers/base/power/common.c:150 dev_pm_domain_set+0x82/0x90()
>> > [53591.123716] PM domains can only be changed for unbound devices
>> > [53591.130158] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge stp llc bluetooth lockd grace snd_hda_codec_hdmi
>> > snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core vhost_net tun vhost x86_pkg_temp_thermal iTCO_wdt snd_seq snd_seq_device snd_pcm macvtap coretemp
>> > mei_me iTCO_vendor_support hp_wmi rfkill sparse_keymap macvlan kvm_intel snd_timer mei lpc_ich snd i2c_i801 soundcore mfd_core kvm irqbypass acpi_cpufreq serio_raw wmi uinput
>> > crc32_pclmul i915 crc32c_intel i2c_algo_bit e1000e drm_kms_helper ptp drm pps_core i2c_core video sunrpc
>> > [53591.190440] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.5.0-rc1-test+ #155
>> > [53591.198453] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
>> > [53591.208031] ffff8801195abc90 ffff8801195abc90 ffffffff8143ef63 ffff8801195abcd8
>> > [53591.216168] ffff8801195abcc8 ffffffff810acad6 ffff8801192d2328 0000000000000000
>> > [53591.224299] 0000000000000001 ffff8801192d2388 00000000fee1dead ffff8801195abd28
>> > [53591.232429] Call Trace:
>> > [53591.235483] [<ffffffff8143ef63>] dump_stack+0x44/0x61
>> > [53591.241232] [<ffffffff810acad6>] warn_slowpath_common+0x86/0xc0
>> > [53591.247842] [<ffffffff810acb5c>] warn_slowpath_fmt+0x4c/0x50
>> > [53591.254193] [<ffffffff810acb15>] ? warn_slowpath_fmt+0x5/0x50
>> > [53591.260625] [<ffffffff8157cf42>] dev_pm_domain_set+0x82/0x90
>> > [53591.266970] [<ffffffffa03f255e>] mei_me_remove+0xee/0x120 [mei_me]
>> > [53591.273842] [<ffffffff8147fe96>] pci_device_shutdown+0x36/0x70
>> > [53591.280373] [<ffffffff81570680>] device_shutdown+0xe0/0x1e0
>> > [53591.286642] [<ffffffff810d6df6>] kernel_restart_prepare+0x36/0x40
>> > [53591.293439] [<ffffffff810d6f62>] kernel_restart+0x12/0x60
>> > [53591.299543] [<ffffffff810d72ae>] SYSC_reboot+0x1ce/0x1f0
>> > [53591.305562] [<ffffffffa0008077>] ? 0xffffffffa0008077
>> > [53591.311323] [<ffffffff811b2e60>] ? stack_trace_call+0x40/0x60
>> > [53591.317780] [<ffffffffa0008077>] ? 0xffffffffa0008077
>> > [53591.323552] [<ffffffff8143eeb4>] ? _atomic_dec_and_lock+0x44/0xaf
>> > [53591.330374] [<ffffffff812a302b>] ? iput+0xbb/0x2c0
>> > [53591.335880] [<ffffffff810d7325>] ? SyS_reboot+0x5/0x10
>> > [53591.341748] [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
>> > [53591.348743] [<ffffffff810d732e>] SyS_reboot+0xe/0x10
>> > [53591.354440] [<ffffffff8185ee32>] entry_SYSCALL_64_fastpath+0x12/0x76
>> > [53591.361534] ---[ end trace 63b298fc6d5920e4 ]---
>> > [53591.377714] e1000e: EEE TX LPI TIMER: 00000011
>> > [53591.468245] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>> > [53591.487321] reboot: Restarting system
>> > [53591.491679] reboot: machine restart
>> >
>>
>> Looks like Rafael adjusted for the platform shutdown case here:
>>
>> https://lkml.org/lkml/2016/1/11/515
>>
>> Perhaps that would be applicable to the pci shutdown as well?
>
> To the MEI driver's shutdown rather. mei_me_remove() clears the pm_domain
> pointer which is sort of questionable, but then the warning may be overkill
> for this case.

Since the MEI device runs its own power management, in some cases
we use PM domains just to bypass the PCI runtime PM handlers and so
avoid going to D3. IIRC we couldn't use PCI_DEV_FLAGS_NO_D3, as it is
not limited to the runtime handlers.
I think this requirement is unique to the MEI device, so I'm not
sure it's worth pushing it into the PCI layer; maybe we can just quiet
the warning somehow.
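
To make the mechanism concrete, here is a rough userspace model of the
dispatch I'm describing. It is only a sketch, not kernel code: every
model_* name is made up for illustration, the structs are drastically
simplified, and the check in model_pm_domain_set() only mirrors my
reading of dev_pm_domain_set() (warn when the device is still bound).

```c
/* Userspace model only; all model_* names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum model_power_state { MODEL_D0, MODEL_D3 };

struct model_dev;

struct model_pm_ops {
	int (*runtime_suspend)(struct model_dev *dev);
};

struct model_dev {
	bool bound;                           /* a driver is attached */
	enum model_power_state state;
	const struct model_pm_ops *bus_pm;    /* stands in for the PCI bus handlers */
	const struct model_pm_ops *pm_domain; /* overrides bus_pm when non-NULL */
};

/* Bus-level handler: puts the device into D3. */
static int model_bus_runtime_suspend(struct model_dev *dev)
{
	dev->state = MODEL_D3;
	return 0;
}

/* Domain handler: the device manages its own power and stays in D0. */
static int model_domain_runtime_suspend(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

const struct model_pm_ops model_bus_ops = {
	.runtime_suspend = model_bus_runtime_suspend,
};

const struct model_pm_ops model_domain_ops = {
	.runtime_suspend = model_domain_runtime_suspend,
};

/* Sets the domain; returns true if the "bound device" warning fired. */
bool model_pm_domain_set(struct model_dev *dev, const struct model_pm_ops *pd)
{
	bool warned = false;

	if (dev->pm_domain == pd)
		return false;

	if (dev->bound) {
		fprintf(stderr,
			"PM domains can only be changed for unbound devices\n");
		warned = true;
	}
	dev->pm_domain = pd;
	return warned;
}

/* Domain callbacks take precedence over the bus callbacks. */
int model_runtime_suspend(struct model_dev *dev)
{
	const struct model_pm_ops *ops =
		dev->pm_domain ? dev->pm_domain : dev->bus_pm;

	assert(ops && ops->runtime_suspend);
	return ops->runtime_suspend(dev);
}
```

With the domain set, model_runtime_suspend() never reaches the bus
handler, so the modelled device stays in D0. Clearing the domain while
the device is still marked bound, which is roughly the situation on
the shutdown path in the trace above, is what makes the warning fire.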


Tomas