Re: [PATCH net-next v5 2/2] net/smc: handle -ENOMEM from smc_wr_alloc_link_mem gracefully
From: Dust Li
Date: Wed Oct 08 2025 - 10:37:55 EST
On 2025-10-06 11:25:22, Mahanta Jambigi wrote:
>On 29/09/25 7:20 am, Dust Li wrote:
>>> diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
>>> index 8d06c8bb14e9..5c18f08a4c8a 100644
>>> --- a/net/smc/smc_core.h
>>> +++ b/net/smc/smc_core.h
>>> @@ -175,6 +175,8 @@ struct smc_link {
>>> struct completion llc_testlink_resp; /* wait for rx of testlink */
>>> int llc_testlink_time; /* testlink interval */
>>> atomic_t conn_cnt; /* connections on this link */
>>> + u16 max_send_wr;
>>> + u16 max_recv_wr;
>>
>> Here, you've moved max_send_wr/max_recv_wr from the link group to individual links.
>> This means we can now have different max_send_wr/max_recv_wr values on two
>> different links within the same link group.
>> Since at Alibaba we don't use multi-link configurations, we haven't tested
>
>Does Alibaba always use a single RoCE device for SMC-R? In that case, how
>is redundancy achieved if that link goes down?
We expose a virtual RDMA device to our client inside their virtual
machine. The underlying network is already redundant, so it’s got
built-in reliability. You can think of it kind of like virtio-net, but
instead of a regular virtual NIC, it’s an RDMA device.
>
>> this scenario. Have you tested the link-down handling process in a multi-link
>> setup?
>I did test this after your query & don't see any issues. As Halil
>mentioned, in the worst case one link might perform worse than the
>other, and only if kcalloc() failed for that link in
>smc_wr_alloc_link_mem() & succeeded in a subsequent request with
>halved max_send_wr/max_recv_wr.
Great! You can add my
Reviewed-by: Dust Li <dust.li@xxxxxxxxxxxxxxxxx>
>> Otherwise, the patch looks good to me.
>>
>> Best regards,
>> Dust
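
For readers following the thread, the fallback discussed above (retry the
per-link allocation with halved max_send_wr/max_recv_wr after -ENOMEM) can be
sketched in userspace C roughly as follows. This is an illustration, not the
patch's code: struct link_sketch, alloc_link_mem(), alloc_link_mem_retry(),
small_only_alloc(), the 64-byte element size, the min_wr floor and the choice
to keep halving in a loop are all assumptions made for the example; only the
field names max_send_wr/max_recv_wr and the use of kcalloc() in
smc_wr_alloc_link_mem() come from the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-link limits added to struct smc_link. */
struct link_sketch {
	unsigned short max_send_wr;
	unsigned short max_recv_wr;
	void *send_bufs;
	void *recv_bufs;
};

/* Hypothetical allocator hook so a failure can be simulated in userspace;
 * the kernel code calls kcalloc() instead.
 */
typedef void *(*alloc_fn)(size_t n, size_t size);

/* Simulated allocator: refuses any request for more than 16 elements. */
static void *small_only_alloc(size_t n, size_t size)
{
	return n > 16 ? NULL : calloc(n, size);
}

/* Allocate both arrays for the current limits; undo on partial failure. */
static int alloc_link_mem(struct link_sketch *lnk, alloc_fn alloc)
{
	lnk->send_bufs = alloc(lnk->max_send_wr, 64);
	if (!lnk->send_bufs)
		return -ENOMEM;

	lnk->recv_bufs = alloc(lnk->max_recv_wr, 64);
	if (!lnk->recv_bufs) {
		free(lnk->send_bufs);
		lnk->send_bufs = NULL;
		return -ENOMEM;
	}
	return 0;
}

/* On -ENOMEM, halve both limits and try again, but never go below min_wr.
 * The limits are per link, so another link in the same link group may
 * keep its larger values.
 */
static int alloc_link_mem_retry(struct link_sketch *lnk, alloc_fn alloc,
				unsigned short min_wr)
{
	int rc;

	assert(lnk && alloc);
	for (;;) {
		rc = alloc_link_mem(lnk, alloc);
		if (rc != -ENOMEM)
			return rc;
		if (lnk->max_send_wr / 2 < min_wr ||
		    lnk->max_recv_wr / 2 < min_wr)
			return -ENOMEM;
		lnk->max_send_wr /= 2;
		lnk->max_recv_wr /= 2;
	}
}
```

With the simulated allocator, a link that starts at 32/48 ends up at 8/12 after
two halvings, while a link whose first allocation succeeds keeps its original
limits. That asymmetry between links of one link group is the case raised in
the review.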