Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption

From: Russell King - ARM Linux
Date: Sun Jul 06 2014 - 13:29:21 EST


On Sun, Jul 06, 2014 at 09:22:56PM +0530, pawandeep oza wrote:
> this is the stack trace I was analyzing...please find it below.
>
> #0 [<c072df48>] (__schedule) from [<c072e3a0>]
>
> #1 [<c072e3a0>] (preempt_schedule) from [<c072f244>]
> #2 [<c072f244>] (_raw_spin_unlock_irq) from [<c0394468>]
> #3 [<c0394468>] (__rpm_callback) from [<c03944b0>]
> #4 [<c03944b0>] (rpm_callback) from [<c03957bc>]
>
> #5 [<c03957bc>] (rpm_resume) from [<c0395d18>]
> #6 [<c0395d18>] (__pm_runtime_resume) from [<c04675b4>]
> #7 [<c04675b4>] (sh_mobile_i2c_xfer) from [<c0460390>]
> #8 [<c0460390>] (__i2c_transfer) from [<c0462094>]
>
> #9 [<c0462094>] (i2c_transfer) from [<c03b2738>]
> #10 [<c03b2738>] (d2153_i2c_write_device) from [<c03b08b0>]
> #11 [<c03b08b0>] (d2153_write) from [<c03b1574>]
> #12 [<c03b1574>] (d2153_reg_write) from [<c03b2490>]
>
> #13 [<c03b2490>] (d2153_system_poweroff) from [<c0011278>]
> #14 [<c0011278>] (machine_power_off) from [<c0047d58>]
> #15 [<c0047d58>] (kernel_power_off) from [<c0048420>]
> #16 [<c0048420>] (sys_reboot) from [<c0010240>]

Thanks.

Right, so from the above, we can see that d2153_system_poweroff() is
hooked into pm_power_off hook, which is called with IRQs disabled.
So, the context which d2153_system_poweroff() is called from is
_fundamentally_ one where scheduling and preemption are _not_
permitted.

d2153_system_poweroff calls into the I2C code, which then calls the
shmobile I2C driver. The shmobile I2C driver then calls into the
PM runtime stuff, which ends up re-enabling interrupts. This is the
first bug with the code, and is the one that you're hitting. However,
this is not the only bug.

The second bug here is that i2c_transfer() is permitted to sleep,
waiting for the transaction to complete. As I've already stated, and
as I've said above, scheduling is not allowed with interrupts disabled.
What this means is that calling i2c_transfer() beneath pm_power_off()
is not permitted.

However, there's more problems here: what if one of the secondary CPUs
which received the IPI was in the middle of using that very same I2C
bus. This _can_ deadlock no matter how you try and sort out that
preemption and/or scheduling problem - because the I2C bus may already
be in use but the CPU which was using it has been halted off via a STOP
IPI, preventing it from ever completing the transaction.

So, the above code path has at least _three_ potentially very serious
bugs in it:

1. calling i2c_transfer() beneath pm_power_off() due to scheduling
possibility
2. calling i2c_transfer() beneath pm_power_off() due to deadlock on
bus access
3. calling the runtime PM callbacks beneath pm_power_off() which result
in interrupts being unexpectedly re-enabled.

I suspect that part of the problem here is that this "d2153" thing is a
PMIC, and you need to write to it via I2C to tell it to turn the power
off.

Some of these problems can be mitigated:

(2) can be worked around - the I2C driver receives a shutdown callback
when the system is being rebooted or powered off. The I2C driver can
ensure that the bus is idle (if not, wait for the bus to become idle)
before _safely_ disabling the I2C interface against further transactions.

(3) can also be solved by the mechanism given above - the shutdown
callback /can/ safely call the runtime PM helpers there, and, therefore,
can lock the I2C interfaces into an active state for the shutdown or
reboot operation.

That means that when we hit pm_power_off(), we can be _sure_ that, firstly,
no one else is using the I2C bus, and secondly, it is not in a runtime
suspended state.

However, that also means that /we/ can't use i2c_transfer() either - and
in fact, that's a /good/ thing because i2c_transfer() may sleep on us,
causing bug (1).

So what we need is an out-of-band method for this - we don't have that
though. This is a problem which would need to be solved to the I2C
maintainers satisfaction to make this kind of thing operate reliably.


PS, it would've been nice had the email addresses not had typos. I
finally remembered to fix them and delete the one which bounces. I've
also added the shmobile people.

--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/