Re: [syzbot] kernel BUG in vhost_get_vq_desc

From: Stefano Garzarella
Date: Wed Mar 02 2022 - 04:23:35 EST


On Wed, Mar 02, 2022 at 10:18:07AM +0100, Stefano Garzarella wrote:
On Wed, Mar 02, 2022 at 08:29:41AM +0000, Lee Jones wrote:
On Fri, 18 Feb 2022, Michael S. Tsirkin wrote:

On Thu, Feb 17, 2022 at 05:21:20PM -0800, syzbot wrote:
syzbot has found a reproducer for the following issue on:

HEAD commit: f71077a4d84b Merge tag 'mmc-v5.17-rc1-2' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=104c04ca700000
kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
dashboard link: https://syzkaller.appspot.com/bug?extid=3140b17cb44a7b174008
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1362e232700000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11373a6c700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3140b17cb44a7b174008@xxxxxxxxxxxxxxxxxxxxxxxxx

------------[ cut here ]------------
kernel BUG at drivers/vhost/vhost.c:2335!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 3597 Comm: vhost-3596 Not tainted 5.17.0-rc4-syzkaller-00054-gf71077a4d84b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:vhost_get_vq_desc+0x1d43/0x22c0 drivers/vhost/vhost.c:2335
Code: 00 00 00 48 c7 c6 20 2c 9d 8a 48 c7 c7 98 a6 8e 8d 48 89 ca 48 c1 e1 04 48 01 d9 e8 b7 59 28 fd e9 74 ff ff ff e8 5d c8 a1 fa <0f> 0b e8 56 c8 a1 fa 48 8b 54 24 18 48 b8 00 00 00 00 00 fc ff df
RSP: 0018:ffffc90001d1fb88 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880234b0000 RSI: ffffffff86d715c3 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff86d706bc R11: 0000000000000000 R12: ffff888072c24d68
R13: 0000000000000000 R14: dffffc0000000000 R15: ffff888072c24bb0
FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000002 CR3: 000000007902c000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
vhost_vsock_handle_tx_kick+0x277/0xa20 drivers/vhost/vsock.c:522
vhost_worker+0x23d/0x3d0 drivers/vhost/vhost.c:372
kthread+0x2e9/0x3a0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

I don't see how this can trigger normally so I'm assuming
another case of use after free.

Yes, exactly.

I think this issue is related to the issue fixed by this patch merged some days ago upstream: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a58da53ffd70294ebea8ecd0eb45fd0d74add9f9


I patched it. Please see:

https://lore.kernel.org/all/20220302075421.2131221-1-lee.jones@xxxxxxxxxx/T/#t


I'm not sure that patch is avoiding the issue. I'll reply to it.

My bad, I think it should be fine, because vhost_vq_reset() sets vq->private_data to NULL, which prevents the worker from running.
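
To illustrate the reasoning, here is a minimal user-space sketch, not the actual kernel code. It assumes the kick handler checks the backend pointer before touching the ring. The names model_vq, model_vq_reset(), model_handle_kick() and model_demo() are hypothetical stand-ins, and the locking around vq->mutex is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct vhost_virtqueue. */
struct model_vq {
	void *private_data;   /* backend pointer; NULL means no backend */
	unsigned int handled; /* kicks that got as far as ring processing */
};

/* Models the reset step discussed above: clear the backend pointer. */
static void model_vq_reset(struct model_vq *vq)
{
	vq->private_data = NULL;
}

/*
 * Models a kick handler that bails out early when no backend is set
 * (assumption: the real handler performs an equivalent check).
 * Returns true only if the kick was actually processed.
 */
static bool model_handle_kick(struct model_vq *vq)
{
	if (!vq->private_data)
		return false;

	vq->handled++;
	return true;
}

/* Small walk-through: a kick before reset is handled, one after is not. */
static void model_demo(void)
{
	struct model_vq vq = { .private_data = &vq, .handled = 0 };

	assert(model_handle_kick(&vq));
	model_vq_reset(&vq);
	assert(!model_handle_kick(&vq));
	assert(vq.handled == 1);
}
```

In this model, a kick that arrives after the reset returns before doing any work on the queue, which is the behaviour I am relying on above.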

Thanks,
Stefano