Re: [PATCH] hv_netvsc: Make sure out channel is fully opened on send

From: Mohammed Gamal
Date: Thu Sep 27 2018 - 04:57:15 EST


On Wed, 2018-09-26 at 17:13 +0000, Haiyang Zhang wrote:
> > -----Original Message-----
> > From: Mohammed Gamal <mgamal@xxxxxxxxxx>
> > Sent: Wednesday, September 26, 2018 12:34 PM
> > To: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>; netdev@xxxxxxxxxxxx
> > org
> > Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
> > <haiyangz@xxxxxxxxxxxxx>; vkuznets <vkuznets@xxxxxxxxxx>;
> > otubo@xxxxxxxxxx; cavery <cavery@xxxxxxxxxx>; linux-
> > kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; Mohammed
> > Gamal
> > <mgamal@xxxxxxxxxx>
> > Subject: [PATCH] hv_netvsc: Make sure out channel is fully opened
> > on send
> >
> > During high network traffic, changes to network interface parameters
> > such as number of channels or MTU can cause a kernel panic with a NULL
> > pointer dereference. This is due to netvsc_device_remove() being called
> > and deallocating the channel ring buffers, which can then be accessed by
> > netvsc_send_pkt() before they're allocated on calling
> > netvsc_device_add().
> >
> > The patch fixes this problem by checking the channel state and returning
> > ENODEV if not yet opened. We also move the call to
> > hv_ringbuf_avail_percent() which may access the uninitialized ring
> > buffer.
> >
> > Signed-off-by: Mohammed Gamal <mgamal@xxxxxxxxxx>
> > ---
> >  drivers/net/hyperv/netvsc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> > index fe01e14..75f1b31 100644
> > --- a/drivers/net/hyperv/netvsc.c
> > +++ b/drivers/net/hyperv/netvsc.c
> > @@ -825,7 +825,12 @@ static inline int netvsc_send_pkt(
> >  	struct netdev_queue *txq = netdev_get_tx_queue(ndev, packet->q_idx);
> >  	u64 req_id;
> >  	int ret;
> > -	u32 ring_avail = hv_get_avail_to_write_percent(&out_channel->outbound);
> >
> > +	u32 ring_avail;
> > +
> > +	if (out_channel->state != CHANNEL_OPENED_STATE)
> > +		return -ENODEV;
> > +
> > +	ring_avail = hv_get_avail_to_write_percent(&out_channel->outbound);
>
> When you were reproducing the NULL ptr panic, did your kernel include the
> following patch?
> hv_netvsc: common detach logic
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=7b2ee50c0cd513a176a26a71f2989facdd75bfea
>
Yes, it is included. The commit did reduce the occurrence of this
race condition, but it still occurs, albeit rarely.
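
To illustrate the idea behind the check, here is a minimal userspace
sketch. All names in it (struct channel, struct ring, send_pkt(),
avail_to_write_percent(), the CHAN_* states) are hypothetical stand-ins
and not the kernel's vmbus or netvsc definitions. A ring with size 0
stands in for a ring buffer that has been torn down and not yet set up
again.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's vmbus types. */
enum chan_state { CHAN_OPEN_STATE, CHAN_OPENING_STATE, CHAN_OPENED_STATE };

struct ring {
	uint32_t size;	/* 0 models a ring that is not set up yet */
	uint32_t used;	/* units currently occupied */
};

struct channel {
	enum chan_state state;
	struct ring outbound;
};

/* Only valid on a set-up ring: divides by r->size. */
static uint32_t avail_to_write_percent(const struct ring *r)
{
	return (r->size - r->used) * 100 / r->size;
}

/*
 * Check the channel state first and only then touch the ring, which is
 * the ordering the patch introduces. Returns 0 on success and stores
 * the free percentage in *ring_avail, or -ENODEV if the channel is not
 * fully opened.
 */
static int send_pkt(const struct channel *ch, uint32_t *ring_avail)
{
	if (ch->state != CHAN_OPENED_STATE)
		return -ENODEV;

	*ring_avail = avail_to_write_percent(&ch->outbound);
	return 0;
}
```

Calling send_pkt() on a channel that is still in CHAN_OPENING_STATE
returns -ENODEV without reading the ring at all. The sketch is
single-threaded, so it only shows the ordering of the check and the
ring access; it says nothing about synchronization against a
concurrent teardown.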

> We call netif_tx_disable(ndev) and netif_device_detach(ndev) before
> doing the changes on MTU or #channels. So there should be no call to
> start_xmit() when the channel is not ready.
>
> If you find that the check for CHANNEL_OPENED_STATE is still necessary
> on the upstream kernel (including the patch "common detach logic"), we
> should debug the code further and find the root cause.
>
> Thanks,
> - Haiyang
>