Re: [PATCH v3 1/1] vdpa/mlx5: fix feature negotiation across device reset

From: Si-Wei Liu
Date: Tue Aug 31 2021 - 16:58:30 EST


Gentle ping.

-Siwei

On Wed, Aug 25, 2021 at 5:00 PM Si-Wei Liu <siwliu.kernel@xxxxxxxxx> wrote:
>
> Hi Michael,
>
> This patch already got reviewed, though not sure why it was not picked
> yet. We've been running into this issue quite a lot recently, the
> typical symptom of which is that invalid checksum on TCP or UDP header
> sent by vdpa is causing silent packet drops on the peer host. Only
> ICMP traffic can get through. Would you please merge it at your
> earliest convenience?
>
> Thanks,
> -Siwei
>
> On Wed, Apr 28, 2021 at 6:51 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >
> > The mlx_features denotes the capability for which
> > set of virtio features is supported by device. In
> > principle, this field needs not be cleared during
> > virtio device reset, as this capability is static
> > and does not change across reset.
> >
> > In fact, the current code seems to wrongly assume
> > that mlx_features can be reloaded or updated on
> > device reset thru the .get_features op. However,
> > the userspace VMM may save a copy of previously
> > advertised backend feature capability and won't
> > need to get it again on reset. In that event, all
> > virtio features reset to zero thus getting disabled
> > upon device reset. This ends up with guest holding
> > a mismatched view of available features with the
> > VMM/host's. For instance, the guest may assume
> > the presence of tx checksum offload feature across
> > reboot, however, since the feature is left disabled
> > on reset, frames with bogus partial checksum are
> > transmitted on the wire.
> >
> > The fix is to retain the features capability on
> > reset, and get it only once from firmware on the
> > vdpa_dev_add path.
> >
> > Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
> > Signed-off-by: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
> > Acked-by: Eli Cohen <elic@xxxxxxxxxx>
> > Acked-by: Jason Wang <jasowang@xxxxxxxxxx>
> > ---
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 25 +++++++++++++++----------
> > 1 file changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 25533db..624f521 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -1492,16 +1492,8 @@ static u64 mlx_to_vritio_features(u16 dev_features)
> > static u64 mlx5_vdpa_get_features(struct vdpa_device *vdev)
> > {
> > struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
> > - struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > - u16 dev_features;
> >
> > - dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, device_features_bits_mask);
> > - ndev->mvdev.mlx_features = mlx_to_vritio_features(dev_features);
> > - if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, virtio_version_1_0))
> > - ndev->mvdev.mlx_features |= BIT_ULL(VIRTIO_F_VERSION_1);
> > - ndev->mvdev.mlx_features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > - print_features(mvdev, ndev->mvdev.mlx_features, false);
> > - return ndev->mvdev.mlx_features;
> > + return mvdev->mlx_features;
> > }
> >
> > static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)
> > @@ -1783,7 +1775,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> > teardown_driver(ndev);
> > mlx5_vdpa_destroy_mr(&ndev->mvdev);
> > ndev->mvdev.status = 0;
> > - ndev->mvdev.mlx_features = 0;
> > ++mvdev->generation;
> > return;
> > }
> > @@ -1902,6 +1893,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx)
> > .free = mlx5_vdpa_free,
> > };
> >
> > +static void query_virtio_features(struct mlx5_vdpa_net *ndev)
> > +{
> > + struct mlx5_vdpa_dev *mvdev = &ndev->mvdev;
> > + u16 dev_features;
> > +
> > + dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, device_features_bits_mask);
> > + mvdev->mlx_features = mlx_to_vritio_features(dev_features);
> > + if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, virtio_version_1_0))
> > + mvdev->mlx_features |= BIT_ULL(VIRTIO_F_VERSION_1);
> > + mvdev->mlx_features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > + print_features(mvdev, mvdev->mlx_features, false);
> > +}
> > +
> > static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu)
> > {
> > u16 hw_mtu;
> > @@ -2009,6 +2013,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name)
> > init_mvqs(ndev);
> > mutex_init(&ndev->reslock);
> > config = &ndev->config;
> > + query_virtio_features(ndev);
> > err = query_mtu(mdev, &ndev->mtu);
> > if (err)
> > goto err_mtu;
> > --
> > 1.8.3.1
> >