Re: [linux-pm] pm-hibernate : possible circular locking dependency detected

From: Gautham R Shenoy
Date: Mon Apr 06 2009 - 11:21:32 EST


On Mon, Apr 06, 2009 at 10:37:10AM -0400, Alan Stern wrote:
> On Mon, 6 Apr 2009, Rafael J. Wysocki wrote:
>
> > On Monday 06 April 2009, Gautham R Shenoy wrote:
> > > On Sun, Apr 05, 2009 at 03:44:54PM +0200, Ingo Molnar wrote:
> > > >
> > > > * Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > > >
> > > > > On Sunday 05 April 2009, Ming Lei wrote:
> > > > > > kernel version : one simple usb-serial patch against commit
> > > > > > 6bb597507f9839b13498781e481f5458aea33620.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Hmm, CPU hotplug again, it seems.
> > > > >
> > > > > I'm not sure who's the maintainer at the moment. Andrew, is that
> > > > > Gautham?
> > > >
> > > > CPU hotplug tends to land on the scheduler people's desk normally.
> > > >
> > > > But i'm not sure that's the real thing here - key appears to be this
> > > > work_on_cpu() worklet by the cpufreq code:
> > >
> > > Actually, there are two dependency chains here which can lead to a deadlock.
> > > The one we're seeing here is the longer of the two.
> > >
> > > If the relevant locks are numbered as follows:
> > > [1]: cpu_policy_rwsem
> > > [2]: work_on_cpu
> > > [3]: cpu_hotplug.lock
> > > [4]: dpm_list_mtx
> > >
> > >
> > > The individual callpaths are:
> > >
> > > 1) do_dbs_timer()[1] --> dbs_check_cpu() --> __cpufreq_driver_getavg()
> > >                                                |
> > >    work_on_cpu()[2] <-- get_measured_perf() <--|
> > >
> > >
> > > 2) pci_device_probe() --> .. --> pci_call_probe() [3] --> work_on_cpu()[2]
> > >                                                              |
> > >    [4] device_pm_add() <-- .. <-- local_pci_probe() <--------|
> >
> > This should block on [4] held by hibernate(). That's why it calls
> > device_pm_lock() after all.
> >
> > > 3) hibernate() --> hibernation_snapshot() --> create_image()
> > >                                                       |
> > >    disable_nonboot_cpus() <-- [4] device_pm_lock() <--|
> > >    |
> > >    |--> _cpu_down() [3] --> cpufreq_cpu_callback() [1]
> > >
> > >
> > > The two chains which can deadlock are
> > >
> > > a) [1] --> [2] --> [4] --> [3] --> [1] (The one in this log)
> > > and
> > > b) [3] --> [2] --> [4] --> [3]
> >
> > What exactly is the b) scenario?
>
> If I understand correctly it isn't really a deadlock scenario, but it
> is a lockdep violation. The violation is:
>
> The pci_device_probe() path 2) proves that dpm_list_mtx [4] can
> be acquired while cpu_hotplug.lock [3] is held;
>
> The hibernate() path 3) proves that cpu_hotplug.lock [3] can be
> acquired while dpm_list_mtx [4] is held.
>
> The two pathways cannot run simultaneously (and hence cannot deadlock)
> because the prepare() stage of hibernation is supposed to stop all
> device probing. But lockdep will still report a problem.

Thanks for clarifying this, Alan. I guess it boils down to teaching
lockdep that this is a false positive.
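
To make the two chains concrete, here is a rough userspace sketch in
Python. It is only a toy model of the lock-ordering graph from the three
call paths above, not kernel code and not how lockdep itself works. The
names LOCKS, PATHS, build_edges() and find_cycles() are made up for this
illustration, and it only records an edge between consecutive
acquisitions, which is a simplification.

```python
# Toy model of the lock-ordering graph discussed in this thread.
# Assumption: each path is reduced to the order in which it takes the
# numbered locks; only consecutive acquisitions become edges.

LOCKS = {
    1: "cpu_policy_rwsem",
    2: "work_on_cpu",
    3: "cpu_hotplug.lock",
    4: "dpm_list_mtx",
}

PATHS = {
    "do_dbs_timer": [1, 2],         # path 1)
    "pci_device_probe": [3, 2, 4],  # path 2)
    "hibernate": [4, 3, 1],         # path 3)
}


def build_edges(paths):
    """Return {held: {taken_next, ...}} from consecutive acquisitions."""
    edges = {}
    for order in paths.values():
        for held, taken in zip(order, order[1:]):
            edges.setdefault(held, set()).add(taken)
    return edges


def find_cycles(edges):
    """List simple cycles, each reported once from its lowest-numbered lock."""
    cycles = []

    def walk(start, node, trail):
        for nxt in sorted(edges.get(node, ())):
            if nxt == start:
                cycles.append(trail + [start])
            elif nxt > start and nxt not in trail:
                walk(start, nxt, trail + [nxt])

    for start in sorted(edges):
        walk(start, start, [start])
    return cycles


CYCLES = find_cycles(build_edges(PATHS))

if __name__ == "__main__":
    for cycle in CYCLES:
        print(" --> ".join("[%d]" % n for n in cycle))
```

This reports [1] --> [2] --> [4] --> [3] --> [1], which is chain a), and
[2] --> [4] --> [3] --> [2], which is chain b) written starting from [2]
instead of [3]. The model has no notion of which paths can actually run
at the same time, so like lockdep it flags b) even though prepare()
keeps path 2) from running during hibernation.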

>
> Alan Stern

--
Thanks and Regards
gautham