Re: [RFC][PATCH RT] acpi/rt: Convert acpi lock back to araw_spinlock_t

From: John Kacur
Date: Wed Feb 13 2013 - 11:46:58 EST




On Wed, 13 Feb 2013, Steven Rostedt wrote:

> We hit the following bug with 3.6-rt:
>
> [ 5.898990] BUG: scheduling while atomic: swapper/3/0/0x00000002
> [ 5.898991] no locks held by swapper/3/0.
> [ 5.898993] Modules linked in:
> [ 5.898996] Pid: 0, comm: swapper/3 Not tainted 3.6.11-rt28.19.el6rt.x86_64.debug #1
> [ 5.898997] Call Trace:
> [ 5.899011] [<ffffffff810804e7>] __schedule_bug+0x67/0x90
> [ 5.899028] [<ffffffff81577923>] __schedule+0x793/0x7a0
> [ 5.899032] [<ffffffff810b4e40>] ? debug_rt_mutex_print_deadlock+0x50/0x200
> [ 5.899034] [<ffffffff81577b89>] schedule+0x29/0x70
> [ 5.899036] BUG: scheduling while atomic: swapper/7/0/0x00000002
> [ 5.899037] no locks held by swapper/7/0.
> [ 5.899039] [<ffffffff81578525>] rt_spin_lock_slowlock+0xe5/0x2f0
> [ 5.899040] Modules linked in:
> [ 5.899041]
> [ 5.899045] [<ffffffff81579a58>] ? _raw_spin_unlock_irqrestore+0x38/0x90
> [ 5.899046] Pid: 0, comm: swapper/7 Not tainted 3.6.11-rt28.19.el6rt.x86_64.debug #1
> [ 5.899047] Call Trace:
> [ 5.899049] [<ffffffff81578bc6>] rt_spin_lock+0x16/0x40
> [ 5.899052] [<ffffffff810804e7>] __schedule_bug+0x67/0x90
> [ 5.899054] [<ffffffff8157d3f0>] ? notifier_call_chain+0x80/0x80
> [ 5.899056] [<ffffffff81577923>] __schedule+0x793/0x7a0
> [ 5.899059] [<ffffffff812f2034>] acpi_os_acquire_lock+0x1f/0x23
> [ 5.899062] [<ffffffff810b4e40>] ? debug_rt_mutex_print_deadlock+0x50/0x200
> [ 5.899068] [<ffffffff8130be64>] acpi_write_bit_register+0x33/0xb0
> [ 5.899071] [<ffffffff81577b89>] schedule+0x29/0x70
> [ 5.899072] [<ffffffff8130be13>] ? acpi_read_bit_register+0x33/0x51
> [ 5.899074] [<ffffffff81578525>] rt_spin_lock_slowlock+0xe5/0x2f0
> [ 5.899077] [<ffffffff8131d1fc>] acpi_idle_enter_bm+0x8a/0x28e
> [ 5.899079] [<ffffffff81579a58>] ? _raw_spin_unlock_irqrestore+0x38/0x90
> [ 5.899081] [<ffffffff8107e5da>] ? this_cpu_load+0x1a/0x30
> [ 5.899083] [<ffffffff81578bc6>] rt_spin_lock+0x16/0x40
> [ 5.899087] [<ffffffff8144c759>] cpuidle_enter+0x19/0x20
> [ 5.899088] [<ffffffff8157d3f0>] ? notifier_call_chain+0x80/0x80
> [ 5.899090] [<ffffffff8144c777>] cpuidle_enter_state+0x17/0x50
> [ 5.899092] [<ffffffff812f2034>] acpi_os_acquire_lock+0x1f/0x23
> [ 5.899094] [<ffffffff8144d1a1>] cpuidle899101] [<ffffffff8130be13>] ?
>
> As the acpi code disables interrupts in acpi_idle_enter_bm, and calls
> code that grabs the acpi lock, it causes issues as the lock is currently
> in RT a sleeping lock.
>
> The lock was converted from a raw to a sleeping lock due to some
> previous issues, and tests that showed it didn't seem to matter.
> Unfortunately, it did matter for one of our boxes.
>
> This patch converts the lock back to a raw lock. I've run this code on a
> few of my own machines, one being my laptop that uses the acpi quite
> extensively. I've been able to suspend and resume without issues.
>
> But as it may have issues still with other hardware, I'm posting this as
> an RFC.
>
> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9eaf708..83d38ed 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -1432,7 +1432,7 @@ void acpi_os_delete_lock(acpi_spinlock handle)
> acpi_cpu_flags acpi_os_acquire_lock(acpi_spinlock lockp)
> {
> acpi_cpu_flags flags;
> - spin_lock_irqsave(lockp, flags);
> + raw_spin_lock_irqsave(lockp, flags);
> return flags;
> }
>
> @@ -1442,7 +1442,7 @@ acpi_cpu_flags acpi_os_acquire_lock(acpi_spinlock lockp)
>
> void acpi_os_release_lock(acpi_spinlock lockp, acpi_cpu_flags flags)
> {
> - spin_unlock_irqrestore(lockp, flags);
> + raw_spin_unlock_irqrestore(lockp, flags);
> }
>
> #ifndef ACPI_USE_LOCAL_CACHE
> diff --git a/include/acpi/platform/aclinux.h b/include/acpi/platform/aclinux.h
> index 7509be3..484cf8c 100644
> --- a/include/acpi/platform/aclinux.h
> +++ b/include/acpi/platform/aclinux.h
> @@ -71,7 +71,7 @@
> #define strtoul simple_strtoul
>
> #define acpi_cache_t struct kmem_cache
> -#define acpi_spinlock spinlock_t *
> +#define acpi_spinlock raw_spinlock_t *
> #define acpi_cpu_flags unsigned long
>
> #else /* !__KERNEL__ */
> @@ -170,7 +170,7 @@ static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
> \
> if (lock) { \
> *(__handle) = lock; \
> - spin_lock_init(*(__handle)); \
> + raw_spin_lock_init(*(__handle)); \
> } \
> lock ? AE_OK : AE_NO_MEMORY; \
> })
>
>

Thanks Steven. That looks way better than the previous revert.
Applying to the latest RH testing queue, will let you know.

John
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/