Re: [RFC PATCH net-next] net/smc: Introduce receive queue flow control support

From: Karsten Graul
Date: Thu Jan 20 2022 - 06:03:44 EST


On 20/01/2022 07:51, Guangguan Wang wrote:
> This implements rq flow control in the smc-r link layer. QPs
> communicating without rq flow control, as in the previous
> version, may hit RNR (receiver not ready) errors, which occur
> when the sq sends a message to the remote qp but the remote
> qp's rq has no valid entries to receive it. In the RNR
> condition, the rdma transport layer may retransmit the
> message again and again until the rq has an entry available,
> which may lower performance, especially under heavy traffic.
> Using credits to do rq flow control can avoid the occurrence
> of RNR.
>
> Test environment:
> - CPU Intel Xeon Platinum 8 core, mem 32 GiB, nic Mellanox CX4.
> - redis benchmark 6.2.3 and redis server 6.2.3.
> - redis server: redis-server --save "" --appendonly no
> --protected-mode no --io-threads 7 --io-threads-do-reads yes
> - redis client: redis-benchmark -h 192.168.26.36 -q -t set,get
> -P 1 --threads 7 -n 2000000 -c 200 -d 10
>
> Before:
> SET: 205229.23 requests per second, p50=0.799 msec
> GET: 212278.16 requests per second, p50=0.751 msec
>
> After:
> SET: 623674.69 requests per second, p50=0.303 msec
> GET: 688326.00 requests per second, p50=0.271 msec
>
> The redis-benchmark test shows a more than 3x rps
> improvement after the implementation of rq flow control.
>
> Signed-off-by: Guangguan Wang <guangguan.wang@xxxxxxxxxxxxxxxxx>
> ---

I really appreciate your effort to improve the performance and solve existing bottlenecks,
but please keep in mind that the SMC module implements the IBM SMC protocol that is
described here: https://www.ibm.com/support/pages/node/6326337
(you can find these links in the source code, too).

Your patch makes changes that are not described in this design paper and may lead to
future incompatibilities with other platforms that support the IBM SMC protocol.

For example:
- you start using one of the reserved bytes in struct smc_cdc_msg
- you define a new smc_llc message type 0x0A
- you change the maximum number of connections per link group from 255 to 32

We need to start a discussion about your (good!) ideas with the owners of the protocol.
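
For readers following the thread, the general idea of credit-based rq flow control
from the quoted description can be sketched as below. This is only a user-space
illustration under stated assumptions, not the code from the patch and not part of
the SMC protocol: all struct and function names (peer, sender, try_send,
receiver_repost, grant_credits) are hypothetical, and how credits would travel on
the wire is exactly the open protocol question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from net/smc. */
struct peer {
	int rq_free;      /* receive buffers currently posted on the rq */
	int credits_owed; /* reposted buffers not yet announced to the sender */
};

struct sender {
	int credits;      /* messages the sender may still transmit */
};

/* Send only while holding a credit; otherwise the caller has to wait. */
static bool try_send(struct sender *s, struct peer *p)
{
	if (s->credits == 0)
		return false;    /* without credits this send could trigger RNR */
	assert(p->rq_free > 0);  /* a held credit implies a posted buffer */
	s->credits--;
	p->rq_free--;            /* the message consumes one receive buffer */
	return true;
}

/* Receiver has processed one message and reposts its buffer. */
static void receiver_repost(struct peer *p)
{
	p->rq_free++;
	p->credits_owed++;
}

/* Receiver announces the accumulated credits to the sender. */
static void grant_credits(struct peer *p, struct sender *s)
{
	s->credits += p->credits_owed;
	p->credits_owed = 0;
}
```

With two posted buffers and two initial credits, a third try_send() is refused
instead of hitting an empty rq, and becomes possible again only after the receiver
reposts a buffer and grants the credit back.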