Re: [PATCH] powerpc/irq: Remove HAVE_IRQ_EXIT_ON_IRQ_STACK feature at powerpc platform

From: Kevin Hao
Date: Fri Mar 28 2014 - 04:18:55 EST


On Fri, Mar 28, 2014 at 03:38:32PM +0800, Dongsheng Wang wrote:
> From: Wang Dongsheng <dongsheng.wang@xxxxxxxxxxxxx>
>
> If softirqs use the hardirq stack, we will get a kernel panic when a hard irq arrives again
> while __do_softirq has local irqs enabled to handle softirq actions. So we need to switch
> to the softirq stack when invoking softirqs.
>
> Task--->
> | Task stack
> |
> Interrupt->EXCEPTION->do_IRQ->
> ^ | Hard irq stack
> | |
> | irq_exit->__do_softirq->local_irq_enable-- -->local_irq_disable
> | | Hard irq stack
> | |
> | Interrupt arrives again
> | This results in interrupt nesting |
> ------------------------------------------------------------------------
>
> Trace 1: Trap 900
>
> Kernel stack overflow in process e8152f40, r1=e8e05ec0
> CPU: 0 PID: 2399 Comm: image_compress/ Not tainted 3.13.0-rc3-03475-g2e3f85b #432
> task: e8152f40 ti: c080a000 task.ti: ef176000
> NIP: c05bec04 LR: c0305590 CTR: 00000010
> REGS: e8e05e10 TRAP: 0901 Not tainted (3.13.0-rc3-03475-g2e3f85b)

Could you double-check whether you have the following patch applied?

commit 1a18a66446f3f289b05b634f18012424d82aa63a
Author: Kevin Hao <haokexin@xxxxxxxxx>
Date: Fri Jan 17 12:25:28 2014 +0800

powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack

Guenter Roeck has got the following call trace on a p2020 board:
Kernel stack overflow in process eb3e5a00, r1=eb79df90
CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
task: eb3e5a00 ti: c0616000 task.ti: ef440000
NIP: c003a420 LR: c003a410 CTR: c0017518
REGS: eb79dee0 TRAP: 0901 Not tainted (3.13.0-rc8-juniper-00146-g19eca00)
MSR: 00029000 <CE,EE,ME> CR: 24008444 XER: 00000000
GPR00: c003a410 eb79df90 eb3e5a00 00000000 eb05d900 00000001 65d87646 00000000
GPR08: 00000000 020b8000 00000000 00000000 44008442
NIP [c003a420] __do_softirq+0x94/0x1ec
LR [c003a410] __do_softirq+0x84/0x1ec
Call Trace:
[eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable)
[eb79dfe0] [c003a970] irq_exit+0xbc/0xc8
[eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c
[ef441f20] [c00046a8] do_IRQ+0x8c/0xf8
[ef441f40] [c000e7f4] ret_from_except+0x0/0x18
--- Exception: 501 at 0xfcda524
LR = 0x10024900
Instruction dump:
7c781b78 3b40000a 3a73b040 543c0024 3a800000 3b3913a0 7ef5bb78 48201bf9
5463103a 7d3b182e 7e89b92e 7c008146 <3ba00000> 7e7e9b78 48000014 57fff87f
Kernel panic - not syncing: kernel stack overflow
CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4
Call Trace:

The reason is that we have used the wrong register to calculate the
ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64).
Just fix it.

As suggested by Benjamin Herrenschmidt, also add the C prototype of the
function in the comment in order to avoid this kind of error in the
future.

Cc: stable@xxxxxxxxxxxxxxx # 3.12
Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>
Signed-off-by: Kevin Hao <haokexin@xxxxxxxxx>
Signed-off-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
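
To illustrate why a wrong ksp_limit produces this kind of report, here is a
rough userspace model in C, not the actual kernel code. It assumes that
call_do_irq() takes regs as its first argument and irqtp as its second, that
the limit is the stack base plus THREAD_INFO_GAP, and that the exception entry
path reports an overflow when r1 is at or below the limit. The helper names
and the THREAD_INFO_GAP value are hypothetical.

```c
#include <stdint.h>

/* Hypothetical value for illustration only; in the kernel it is derived
 * from sizeof(struct thread_info). */
#define THREAD_INFO_GAP 0x60u

/* Lowest usable address of the stack that starts at stack_base. */
static uint32_t ksp_limit_for(uint32_t stack_base)
{
	return stack_base + THREAD_INFO_GAP;
}

/* Model of the ppc32 exception-entry check: a stack pointer at or
 * below the limit is reported as "Kernel stack overflow". */
static int stack_overflowed(uint32_t r1, uint32_t ksp_limit)
{
	return r1 <= ksp_limit;
}

/* Intended behaviour: the limit is derived from irqtp (second argument). */
static uint32_t limit_fixed(uint32_t regs, uint32_t irqtp)
{
	(void)regs;
	return ksp_limit_for(irqtp);
}

/* Model of the bug: the limit is derived from regs (first argument),
 * which points into the task stack rather than the irq stack. */
static uint32_t limit_buggy(uint32_t regs, uint32_t irqtp)
{
	(void)irqtp;
	return ksp_limit_for(regs);
}
```

With r1 = 0xeb79df90 from the trace above, an assumed irq stack base of
0xeb79c000 and a made-up regs address of 0xef441f50 on the task stack,
limit_buggy() yields a limit above r1, so the check fires even though the irq
stack is nearly empty. limit_fixed() yields a limit just above the irq stack
base, and the check passes.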

Thanks,
Kevin
