Re: [PATCHv3] thunderbolt: do not double dequeue a request

From: Mika Westerberg
Date: Fri May 09 2025 - 05:19:31 EST


On Thu, May 08, 2025 at 12:47:18PM +0900, Sergey Senozhatsky wrote:
> On (25/03/28 00:03), Sergey Senozhatsky wrote:
> > Some of our devices crash in tb_cfg_request_dequeue():
> >
> > general protection fault, probably for non-canonical address 0xdead000000000122
> >
> > CPU: 6 PID: 91007 Comm: kworker/6:2 Tainted: G U W 6.6.65
> > RIP: 0010:tb_cfg_request_dequeue+0x2d/0xa0
> > Call Trace:
> > <TASK>
> > ? tb_cfg_request_dequeue+0x2d/0xa0
> > tb_cfg_request_work+0x33/0x80
> > worker_thread+0x386/0x8f0
> > kthread+0xed/0x110
> > ret_from_fork+0x38/0x50
> > ret_from_fork_asm+0x1b/0x30
> >
> > The circumstances are unclear; however, the theory is that
> > tb_cfg_request_work() can be scheduled twice for a request:
> > the first time via frame.callback from ring_work() and the second
> > time from tb_cfg_request(). Both times kworkers will execute
> > tb_cfg_request_dequeue(), which results in a double list_del()
> > from the ctl->request_queue (the list poison dereference hints
> > at it: 0xdead000000000122).
> >
> > Do not dequeue requests that don't have TB_CFG_REQUEST_ACTIVE
> > bit set.
>
> Mika, as discussed in the thread [1], we rolled out the fix to
> our fleet and we don't see the crashes anymore. So it's tested
> and verified.

Cool, thanks! Applied to thunderbolt.git/fixes.
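
For readers following the thread, the guard described in the quoted commit
message can be illustrated with a small userspace sketch. This is not the
applied patch: struct node, struct request, request_enqueue() and
request_dequeue() are hypothetical stand-ins for the kernel's list_head,
the request structure and tb_cfg_request_dequeue(), the "active" field
stands in for the TB_CFG_REQUEST_ACTIVE bit, and locking is left out. The
poison values only mimic the kernel's list poisoning (0x122 matches the low
bits of the faulting address in the report).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

/* Illustrative poison values written into a node once it is unlinked. */
#define POISON_NEXT ((struct node *)0x100)
#define POISON_PREV ((struct node *)0x122)

static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void node_del(struct node *n)
{
	/*
	 * A second delete of the same node would trip this assertion
	 * instead of dereferencing the poison pointers.
	 */
	assert(n->next != POISON_NEXT && n->prev != POISON_PREV);
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->next = POISON_NEXT;
	n->prev = POISON_PREV;
}

/* Hypothetical request: "active" plays the role of the ACTIVE flag. */
struct request {
	struct node list;
	bool active;
};

static void request_enqueue(struct node *queue, struct request *req)
{
	node_add_tail(&req->list, queue);
	req->active = true;
}

/*
 * Unlink the request only if it is still marked active. Returns true if
 * this call removed it, false if it had already been dequeued. Real code
 * would have to perform the check and the unlink under the lock that
 * protects the queue; that is omitted here.
 */
static bool request_dequeue(struct request *req)
{
	if (!req->active)
		return false;

	node_del(&req->list);
	req->active = false;
	return true;
}
```

With this guard, calling request_dequeue() twice on the same request is
harmless: the first call unlinks it and clears the flag, the second call
returns false without touching the poisoned list pointers.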