Re: PCIe RC/EP virtio RDMA solution discussion

From: Manivannan Sadhasivam
Date: Wed Feb 15 2023 - 03:24:04 EST


On Tue, Feb 07, 2023 at 02:45:27PM -0500, Frank Li wrote:
> From: Frank Li <Frank.li@xxxxxxx>
>
> Recently, more and more people have become interested in PCI RC-to-EP
> connections, especially for network use cases. I upstreamed a vntb
> solution last year, but its transfer speed is not good enough. I started
> a discussion at
> https://lore.kernel.org/imx/d098a631-9930-26d3-48f3-8f95386c8e50@xxxxxx/T/#t
>
> ┌─────────────────────────────────┐ ┌──────────────┐
> │ │ │ │
> │ │ │ │
> │ VirtQueue RX │ │ VirtQueue │
> │ TX ┌──┐ │ │ TX │
> │ ┌─────────┐ │ │ │ │ ┌─────────┐ │
> │ │ SRC LEN ├─────┐ ┌──┤ │◄────┼───┼─┤ SRC LEN │ │
> │ ├─────────┤ │ │ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │ │ │
> │ ├─────────┤ │ │ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │ │ │
> │ └─────────┘ │ │ └──┘ │ │ └─────────┘ │
> │ │ │ │ │ │
> │ RX ┌───┼──┘ TX │ │ RX │
> │ ┌─────────┐ │ │ ┌──┐ │ │ ┌─────────┐ │
> │ │ │◄┘ └────►│ ├─────┼───┼─┤ │ │
> │ ├─────────┤ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │
> │ ├─────────┤ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │
> │ └─────────┘ │ │ │ │ └─────────┘ │
> │ virtio_net └──┘ │ │ virtio_net │
> │ Virtual PCI BUS EDMA Queue │ │ │
> ├─────────────────────────────────┤ │ │
> │ PCI EP Controller with eDMA │ │ PCI Host │
> └─────────────────────────────────┘ └──────────────┘
>
> The basic idea is (see the sketch after this list):
> 1. Both the EP and the host probe the virtio_net driver.
> 2. There are two queues: one on the EP side (EQ), the other on the
>    host side.
> 3. The EP-side EPF driver maps the host side's queue into the EP's
>    address space; call the mapped queue HQ.
> 4. One worker thread:
>    - picks one TX from the EQ and one RX from the HQ, combines them,
>      generates an eDMA request, and puts it into the DMA TX queue;
>    - picks one RX from the EQ and one TX from the HQ, combines them,
>      generates an eDMA request, and puts it into the DMA RX queue.
> 5. The eDMA "done" IRQ marks the related items in the EQ and HQ as
>    finished.
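>
> A minimal sketch of the worker-thread pairing (steps 4 and 5), assuming
> the generic dmaengine memcpy interface for illustration; struct
> epf_vnet_xfer and the eq_pop_tx()/hq_pop_rx()/..._push_used() helpers
> are hypothetical, and a real EPF driver would use whichever prep
> interface the eDMA's dmaengine driver actually exposes:
>
> #include <linux/dmaengine.h>
> #include <linux/slab.h>
>
> static void epf_vnet_dma_done(void *param)
> {
> 	struct epf_vnet_xfer *x = param;
>
> 	/* Step 5: the eDMA "done" callback marks both sides'
> 	 * descriptors as finished. */
> 	eq_push_used(x->eq_desc);
> 	hq_push_used(x->hq_desc);
> 	kfree(x);
> }
>
> static int epf_vnet_submit_one(struct dma_chan *chan)
> {
> 	struct dma_async_tx_descriptor *txd;
> 	struct epf_vnet_xfer *x;
>
> 	x = kzalloc(sizeof(*x), GFP_KERNEL);
> 	if (!x)
> 		return -ENOMEM;
>
> 	/* Step 4: pair a local TX buffer with a remote RX buffer. */
> 	x->eq_desc = eq_pop_tx();	/* source: EP-local buffer */
> 	x->hq_desc = hq_pop_rx();	/* destination: host buffer (PCI address) */
>
> 	txd = dmaengine_prep_dma_memcpy(chan, x->hq_desc->pci_addr,
> 					x->eq_desc->dma_addr,
> 					x->eq_desc->len, DMA_PREP_INTERRUPT);
> 	if (!txd) {
> 		kfree(x);
> 		return -EIO;
> 	}
>
> 	txd->callback = epf_vnet_dma_done;
> 	txd->callback_param = x;
> 	dmaengine_submit(txd);
> 	dma_async_issue_pending(chan);
> 	return 0;
> }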
>
> The whole transfer is zero-copy and uses a DMA queue.
>
> Shunsuke Mie implemented the above idea:
> https://lore.kernel.org/linux-pci/CANXvt5q_qgLuAfF7dxxrqUirT_Ld4B=POCq8JcB9uPRvCGDiKg@xxxxxxxxxxxxxx/T/#t
>
>
> A similar solution was posted in 2019, except that it used memcpy
> from/to PCI EP mapped windows. Using DMA should be simpler, because the
> eDMA can access the whole host/EP memory space directly (see the sketch
> after the link):
> https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@xxxxxx/T/
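>
> For reference, a minimal sketch of that memcpy path, assuming the
> in-tree EPC interfaces (pci_epc_mem_alloc_addr(), pci_epc_map_addr(),
> memcpy_toio()); epc, func_no, vfunc_no and the buffer parameters are
> assumed to come from the surrounding driver:
>
> static int epf_copy_to_host(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> 			    u64 host_pci_addr, const void *buf, size_t len)
> {
> 	void __iomem *win;
> 	phys_addr_t phys;
> 	int ret = 0;
>
> 	/* Carve out an outbound window and point it at the host buffer. */
> 	win = pci_epc_mem_alloc_addr(epc, &phys, len);
> 	if (!win)
> 		return -ENOMEM;
>
> 	ret = pci_epc_map_addr(epc, func_no, vfunc_no, phys,
> 			       host_pci_addr, len);
> 	if (ret)
> 		goto free_win;
>
> 	/*
> 	 * The CPU copies through the window: simple, but it burns CPU
> 	 * cycles, and the window must be remapped for every
> 	 * differently-placed host buffer, whereas the eDMA can address
> 	 * host memory directly.
> 	 */
> 	memcpy_toio(win, buf, len);
>
> 	pci_epc_unmap_addr(epc, func_no, vfunc_no, phys);
> free_win:
> 	pci_epc_mem_free_addr(epc, phys, win, len);
> 	return ret;
> }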
>
> Solution 1 (based on Shunsuke's work):
>
> Both the EP and the host side use virtio.
> eDMA is used to simplify the data transfer and improve the transfer speed.
> RDMA is implemented on top of RoCE:
> - proposal: https://lore.kernel.org/all/20220511095900.343-1-xieyongji@xxxxxxxxxxxxx/T/
> - presentation at KVM Forum: https://youtu.be/Qrhv6hC_YK4
>
> Solution 2 (2020, Kishon):
>
> Previous posting: https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@xxxxxx/
> The EP side uses vhost, the RC side uses virtio.
> I don't think anyone is working on this now.
> If eDMA is used, both sides need a transfer queue,
> and I don't know how to implement that easily on the vhost side.
>
> Solution 3 (what I am working on):
>
> Implement an InfiniBand RDMA driver on both the EP and RC sides.
> The EP side builds the eDMA hardware queue from the EP's and RC's send
> and receive queues; when the eDMA finishes, it writes the status to the
> completion queues on both the EP and RC sides (see the sketch below).
> Network transfer is implemented with IPoIB.
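>
> A rough sketch of that completion step, modeled on how the software
> RDMA providers (e.g. rxe) notify their consumers; struct epf_rdma_cq,
> its armed flag and epf_rdma_queue_wc() are hypothetical:
>
> static void epf_rdma_edma_done(void *param)
> {
> 	struct epf_rdma_cq *cq = param;
>
> 	/*
> 	 * Queue a work completion that a later ib_poll_cq() will return,
> 	 * then wake the consumer (here IPoIB) if it armed the CQ with
> 	 * ib_req_notify_cq().
> 	 */
> 	epf_rdma_queue_wc(cq, IB_WC_SUCCESS);
> 	if (cq->armed && cq->ibcq.comp_handler)
> 		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> }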
>
>
> The overall upstreaming effort for any of these is quite large, and I
> don't want to waste time and effort going in the wrong direction.
>
> I think Solution 1 is the easiest path.
>

I didn't have time to look into Shunsuke's series, but from an initial
look at the proposed solutions, option 1 seems to be the best to me.

Thanks,
Mani


--
மணிவண்ணன் சதாசிவம்