Re: [PATCH v2 5/5] PM: domains: Do not call device_pm_check_callbacks() when holding genpd_lock()

From: Ulf Hansson
Date: Tue Jan 17 2023 - 10:12:01 EST


On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
<krzysztof.kozlowski@xxxxxxxxxx> wrote:
>
> If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
> genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()
> must be called outside of the domain lock.
>
> This solves on PREEMPT_RT:
>
> [ BUG: Invalid wait context ]
> 6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
> -----------------------------
> swapper/0/1 is trying to lock:
> ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
> other info that might help us debug this:
> context-{5:5}
> 3 locks held by swapper/0/1:
> #0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
> #1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
> #2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
> stack backtrace:
> CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
> Call trace:
> dump_backtrace.part.0+0xe0/0xf0
> show_stack+0x18/0x40
> dump_stack_lvl+0x8c/0xb8
> dump_stack+0x18/0x34
> __lock_acquire+0x938/0x2100
> lock_acquire.part.0+0x104/0x28c
> lock_acquire+0x68/0x84
> rt_spin_lock+0x40/0x100
> device_pm_check_callbacks+0x20/0xf0
> dev_pm_domain_set+0x54/0x64
> genpd_add_device+0x258/0x340
> __genpd_dev_pm_attach+0xa8/0x250
> genpd_dev_pm_attach_by_id+0xc4/0x190
> genpd_dev_pm_attach_by_name+0x3c/0x60
> dev_pm_domain_attach_by_name+0x20/0x30
> dt_idle_attach_cpu+0x24/0x90
> psci_cpuidle_probe+0x300/0x4b0
> platform_probe+0x68/0xe0
> really_probe+0xbc/0x2dc
> __driver_probe_device+0x78/0xe0
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x140
> bus_for_each_drv+0x78/0xd0
> __device_attach+0xa8/0x1c0
> device_initial_probe+0x14/0x20
> bus_probe_device+0x9c/0xa4
> device_add+0x3b4/0x8dc
> platform_device_add+0x114/0x234
> platform_device_register_full+0x108/0x1a4
> psci_idle_init+0x6c/0xb0
> do_one_initcall+0x74/0x450
> kernel_init_freeable+0x2e0/0x350
> kernel_init+0x24/0x130
> ret_from_fork+0x10/0x20
>
> Cc: Adrien Thierry <athierry@xxxxxxxxxx>
> Cc: Brian Masney <bmasney@xxxxxxxxxx>
> Cc: linux-rt-users@xxxxxxxxxxxxxxx
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
> ---
> drivers/base/power/domain.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 4dfce1d476f4..db499ba40497 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>  	if (ret)
>  		goto out;
>
> +
> +	/* PREEMPT_RT: Must be outside of genpd_lock */
> +	device_pm_check_callbacks(dev);
> +
>  	genpd_lock(genpd);
>
>  	genpd_set_cpumask(genpd, gpd_data->cpu);
> -	dev_pm_domain_set(dev, &genpd->domain);
> +	dev_pm_domain_set_no_cb(dev, &genpd->domain);
>
>  	genpd->device_count++;
>  	if (gd)

Rather than splitting up the assignment in two steps, I think it
should be perfectly fine to move the call to dev_pm_domain_set()
outside the genpd lock.

Note that genpd_add_device() is always called with the gpd_list_lock
mutex held, which prevents the genpd from being removed while we use
it here.

Moreover, we need a similar change for the call to dev_pm_domain_set()
in genpd_remove_device().
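
As a rough illustration of the ordering suggested above, here is a small
userspace model, not kernel code. All types and helpers are simplified
stand-ins that only reuse the kernel names; model_add_device(),
model_remove_device() and model_genpd_locked are hypothetical. Placing
dev_pm_domain_set() after the locked section (rather than before it) is
an assumption, since the text only asks for it to be outside the lock.

```c
/* Userspace model only: all types and helpers are simplified stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_pm_domain { int placeholder; };

struct generic_pm_domain {
	struct dev_pm_domain domain;
	unsigned int device_count;
};

struct device {
	struct dev_pm_domain *pm_domain;
	bool set_under_lock;	/* records whether the genpd lock was held */
};

/* Models "genpd_lock() is held"; the real lock may be a raw spinlock. */
static bool model_genpd_locked;

static void genpd_lock(struct generic_pm_domain *genpd)
{
	(void)genpd;
	assert(!model_genpd_locked);
	model_genpd_locked = true;
}

static void genpd_unlock(struct generic_pm_domain *genpd)
{
	(void)genpd;
	assert(model_genpd_locked);
	model_genpd_locked = false;
}

/* Stand-in for the real helper, which also runs device_pm_check_callbacks(). */
static void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd)
{
	dev->set_under_lock = model_genpd_locked;
	dev->pm_domain = pd;
}

/* Hypothetical model of the add path: bookkeeping under the lock only. */
static void model_add_device(struct generic_pm_domain *genpd,
			     struct device *dev)
{
	genpd_lock(genpd);
	genpd->device_count++;
	genpd_unlock(genpd);

	/* Assumed placement: outside (here, after) the locked section. */
	dev_pm_domain_set(dev, &genpd->domain);
}

/* Hypothetical model of the remove path, with the same treatment. */
static void model_remove_device(struct generic_pm_domain *genpd,
				struct device *dev)
{
	genpd_lock(genpd);
	genpd->device_count--;
	genpd_unlock(genpd);

	dev_pm_domain_set(dev, NULL);
}
```

In this model the domain pointer is never assigned while the modelled
genpd lock is held, so a helper that takes a sleeping lock on PREEMPT_RT
would not nest inside a raw spinlock.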

> --
> 2.34.1
>

Kind regards
Uffe