Re: irq_desc use-after-free in smp_irq_move_cleanup_interrupt

From: Thomas Gleixner
Date: Wed Nov 25 2015 - 13:17:13 EST


Joe,

On Mon, 23 Nov 2015, Joe Lawrence wrote:

Nice detective work!

> The irq_desc is in R15: ffff88102a8196f8
>
> This irq_desc is no longer allocated, it's been filled with the slub debug
> poison pattern (hence the spinlock is stuck):
>
> The irq vector is in RBX: 0000000000000091
>
> But from a custom crash gdb script, the freed irq_desc can still be found
> in CPU 1's vector_irq[145]:
>
> cpu[1] vector_irq[145] irq_desc @ 0xffff88102a8196f8
>
> Sifting through a heavily grepped and abbreviated trace log:
>
> - The irq moved from CPU 1, vector 145 to CPU 44, vector 134
> - Both CPUs skip cleaning up their vector_irq[] entries for this irq
> because data->move_in_progress is set (is this normal?)

Yes. The cleanup interrupt is not targetting a particular vector.

> - The irq moves again to CPU 34, vector 123
> - The underlying device is removed @ 593.106514 jiffies
> - The irq_desc is freed @ 593.239879 jiffies
> - CPU 1 is last heard from @ 1022.830083 jiffies
>
> [001] 22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 0xffff88102a8196f8
> [044] 35.209945: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=134 0xffff88102a8196f8
> [001] 35.212370: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8
> [044] 35.674114: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=134 0xffff88102a8196f8
> [001] 35.675395: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8
> [044] 36.265716: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=134 0xffff88102a8196f8
> [044] 36.517785: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=134 0xffff88102a8196f8
> [001] 36.520115: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8
> [001] 54.636651: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8
> [001] 54.636722: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8

So I assume, that no interrupt happened so far.

> [044] 61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 0xffff88102a8196f8

Now it's moved again.

> [001] 61.670979: smp_irq_move_cleanup_interrupt : data->move_in_progress : vector=145 0xffff88102a8196f8
> [044] 61.696120: smp_irq_move_cleanup_interrupt : cpu (this) : vector=134 0xffff88102a8196f8 -> (nil)

> [034] 69.086107: smp_irq_move_cleanup_interrupt : vector == data->cfg.vector && ... : vector=123 0xffff88102a8196f8
> [000] 593.239811: clear_irq_vector : 1 : cpu 34 : vector=123 0xffff88102a8196f8 -> (nil)
> [000] 593.239879: free_desc : desc @ 0xffff88102a8196f8
> [001] 1022.830083: smp_irq_move_cleanup_interrupt : IS_ERR_OR_NULL : vector=144 (nil)

Ok. Here is issue. So I assume the following happens:

CPU0 CPU1 CPU34 CPU44 CPUxx
send_IPI(CLEANUP)
smp_irq_move_cleanup_interrupt
sees data->move_in_progress

device_irq happens
data->move_in_progress = 0
send_IPI(CLEANUP)

Does not receive another
cleanup IPI for whatever
reason .... See below.
smp_irq_move_cleanup_interrupt
vector = NULL

free_vector on CPU34

Does not clean CPU1 because
data->move_in_progress is not set.

free_desc

send_IPI(CLEANUP)
smp_irq_move_cleanup_interrupt
Accesses random memory

> Prior to a782a7e46bb5 "x86/irq: Store irq descriptor in vector array", the
> vector_irq array held irq values, those interested in the irq_desc would
> have to perform their own irq_to_desc() (ie, a radix_tree_lookup of the
> irq_desc_tree). The radix tree is updated in free_desc(), so any
> subsequent lookups would fail.

Yes, that above race has existed forever and the irq_to_desc() check
papered over it. In case that the irq number was reassigned the thing
operated on the wrong descriptor. Pretty bad as well as it fiddles
with the wrong bits. Though it clears the vector ...

The problem is actually in the vector assignment code.

> [001] 22.936764: __assign_irq_vector : cpu 44 : vector=134 -> 0xffff88102a8196f8

No interrupt happened so far. So nothing cleans up the vector on cpu 1

> [044] 61.670267: __assign_irq_vector : cpu 34 : vector=123 -> 0xffff88102a8196f8

Now that moves it from 44 to 34 and ignores that cpu 1 still has the
vector assigned. __assign_irq_vector unconditionally overwrites
data->old_domain, so the bit of cpu 1 is lost ....

I'm staring into the code to figure out a fix ....

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/