Re: [PATCH net] net/smc: fix listen processing for SMC-Rv2

From: liuyacan
Date: Mon May 23 2022 - 09:26:48 EST


> >>> From: liuyacan <liuyacan@xxxxxxxxxxxxxxxx>
> >>>
> >>> In the process of checking whether RDMAv2 is available, the current
> >>> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> >>> smc buf desc, but the latter may fail. Unfortunately, the caller
> >>> will only check the former. In this case, a NULL pointer dereference
> >>> will occur in smc_clc_send_confirm_accept() when accessing
> >>> conn->rmb_desc.
> >>>
> >>> This patch does two things:
> >>> 1. Use the return code to determine whether V2 is available.
> >>> 2. If the return code is NODEV, continue to check whether V1 is
> >>> available.
> >>>
> >>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> >>> Signed-off-by: liuyacan <liuyacan@xxxxxxxxxxxxxxxx>
> >>> ---
> >>
> >> I am not happy with this patch. You are right that this is a problem,
> >> but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
> >> smc_find_rdma_v2_device_serv() after the not_found label, just as is
> >> done for the ISM device in smc_find_ism_v1_device_serv().
> >>
> >> Your patch changes many more things, and besides that you eliminated the calls
> >> to smc_find_ism_store_rc() completely, which is not correct.
> >>
> >> Since your patch was already applied (btw. 3:20 hours after you submitted it),
> >> please revert it and resend. Thank you.
> >
> > I also considered this approach. One question: do we need to do more
> > rollback work before the V1 check?
> >
> > Specifically, in smc_find_rdma_v2_device_serv(), there are the following steps:
> >
> > 1. smc_listen_rdma_init()
> > 1.1 smc_conn_create()
> > 1.2 smc_buf_create() --> may fail
> > 2. smc_listen_rdma_reg() --> may fail
> >
> > When later steps fail, do we need to roll back the previous steps?
>
> That is a good question and I think that is a different problem for another patch.
> smc_listen_rdma_init() should maybe call smc_conn_abort() similar to what smc_listen_ism_init()
> does in this situation. And when smc_listen_rdma_reg() fails ... hmm we need to think about this.
>
> We will also discuss this here in our team.
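
To make the rollback question concrete, here is a minimal userspace sketch of
the three steps listed above, with each failure undoing the steps that already
ran. Everything named demo_* is hypothetical and only models the ordering; it
is not the kernel code, and whether the real path should unwind like this is
exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the connection state; not a kernel struct. */
struct demo_conn {
	bool created;	/* models smc_conn_create() having run */
	bool has_bufs;	/* models smc_buf_create() having run */
};

/*
 * Models the V2 listen path: create the connection, allocate buffers,
 * register them. buf_ok and reg_ok are assumptions used to inject the
 * two failures mentioned in the thread.
 */
int demo_listen_v2(struct demo_conn *c, bool buf_ok, bool reg_ok)
{
	c->created = true;		/* step 1.1 */
	if (!buf_ok)
		goto abort_conn;	/* step 1.2 failed */
	c->has_bufs = true;
	if (!reg_ok)
		goto free_bufs;		/* step 2 failed */
	return 0;

free_bufs:
	c->has_bufs = false;		/* undo step 1.2 */
abort_conn:
	c->created = false;		/* undo step 1.1 */
	return -1;
}
```

With the labels in reverse order of the steps, a late failure falls through all
earlier cleanups, so the V1 check would start from a clean state.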

Ok, I will revert this patch and resend a simpler one. Thank you.
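
As a rough illustration of the simpler approach, the sketch below models
clearing the device pointer after the not_found label so that a caller that
only tests the pointer sees V2 as unavailable. The demo_* types and functions
are hypothetical stand-ins, not the actual net/smc code.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the real kernel structures. */
struct demo_ib_dev {
	int id;
};

struct demo_init_info {
	struct demo_ib_dev *ib_dev_v2;	/* models ini->smcrv2.ib_dev_v2 */
};

/*
 * Models the V2 device lookup: the device pointer is set first, then a
 * later step may fail. buf_alloc_ok is an assumption used to inject the
 * buffer allocation failure.
 */
void demo_find_rdma_v2_device(struct demo_init_info *ini,
			      struct demo_ib_dev *dev, bool buf_alloc_ok)
{
	ini->ib_dev_v2 = dev;		/* device selected */
	if (!buf_alloc_ok)
		goto not_found;		/* models smc_buf_create() failing */
	return;

not_found:
	/* Clear the pointer so the caller's check reports "no V2". */
	ini->ib_dev_v2 = NULL;
}

/* Models the caller, which only looks at the device pointer. */
bool demo_v2_available(const struct demo_init_info *ini)
{
	return ini->ib_dev_v2 != NULL;
}
```

Without the assignment after not_found, demo_v2_available() would still return
true after a failed allocation, which corresponds to the crash described in the
original patch.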