Re: [REGRESSION][v6.8-rc1] virtio-pci: Introduce admin virtqueue

From: Jason Wang
Date: Tue May 07 2024 - 23:34:16 EST


On Sat, May 4, 2024 at 2:10 AM Joseph Salisbury
<joseph.salisbury@xxxxxxxxxxxxx> wrote:
>
> Hi Feng,
>
> During testing, a kernel bug was identified with the suspend/resume
> functionality on instances running in a public cloud [0]. This bug is a
> regression introduced in v6.8-rc1. After a kernel bisect, the following
> commit was identified as the cause of the regression:
>
> fd27ef6b44be ("virtio-pci: Introduce admin virtqueue")

From a quick glance at the patch, it seems it should not break
freeze/restore, as that path should behave as it did before.

But I found a few interesting things in the patch:

1) it assumes a single admin vq, which is not what the spec says
2) it has a special function for the admin virtqueue during
freeze/restore, but that function does nothing beyond del_vq()
3) it lacks real users, but I guess that e.g. destroy_avq() needs to be
synchronized with whoever is using the admin virtqueue

>
> I was hoping to get your feedback, since you are the patch author. Do
> you think gathering any additional data will help diagnose this issue?

Yes, please show us:

1) the kernel log from the failing suspend/resume
2) the features that the device has, e.g. the content of
/sys/bus/virtio/devices/virtio0/features
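
In case it helps when reading that file: the features attribute is a
string of '0'/'1' characters, one per feature bit, with bit 0 first. A
minimal sketch for decoding it could look like the following. The
helper names are made up for illustration, "virtio0" is only the
example device from above, and the bit number 41 for VIRTIO_F_ADMIN_VQ
is taken from include/uapi/linux/virtio_config.h.

```python
import sys
from pathlib import Path

# Bit number of VIRTIO_F_ADMIN_VQ (include/uapi/linux/virtio_config.h).
VIRTIO_F_ADMIN_VQ = 41


def set_feature_bits(features: str) -> list[int]:
    """Return the indices of the '1' characters; the first character is bit 0."""
    return [i for i, c in enumerate(features.strip()) if c == "1"]


def read_feature_bits(dev: str = "virtio0") -> list[int]:
    """Read and decode the sysfs features file of one virtio device."""
    # Hypothetical helper: the device name is an assumption, adjust as needed.
    path = Path("/sys/bus/virtio/devices") / dev / "features"
    return set_feature_bits(path.read_text())


if __name__ == "__main__":
    dev = sys.argv[1] if len(sys.argv) > 1 else "virtio0"
    bits = read_feature_bits(dev)
    print(f"{dev}: feature bits set: {bits}")
    print(f"VIRTIO_F_ADMIN_VQ set: {VIRTIO_F_ADMIN_VQ in bits}")
```

Running it with the device name as the only argument prints the set
bits, which is easier to compare between a good and a bad kernel than
the raw string.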

> This commit is depended upon by other virtio commits, so a revert test
> is not really straightforward without reverting all the dependencies.
> Any ideas you have would be greatly appreciated.

Thanks

>
>
> Thanks,
>
> Joe
>
> http://pad.lv/2063315
>