Re: Enabling peer to peer device transactions for PCIe devices

From: Petrosyan, Ludwig
Date: Tue Oct 24 2017 - 01:58:36 EST


Yes, I agree that we should start with write transactions. According to the PCIe standard, all memory write transactions are address-routed, and I agree with Logan:
if the endpoint address is written in the header of a write TLP, the TLP should not touch the CPU; the PCIe switch has to route it to the endpoint.
The idea was this: in an MTCA system there is a PCIe switch on the MCH (MicroTCA Carrier Hub). This switch connects the CPU to the other crate slots, so one port is the upstream port and the others are downstream ports. A DMA read from the CPU side is an ordinary write on the endpoint side. The Xilinx DMA core has two registers, Destination Address and Source Address,
and the device driver has to set up these registers to perform a DMA.
Usually, to start a DMA read from the board, the device driver sets the source address to the FPGA memory address and the destination address to a DMA-prepared system address.
To test p2p, I set the destination address to the physical address of the other endpoint.
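
As a rough illustration of the register setup described above (a sketch only: the DmaRegs class and its field names are made up for this example and do not reflect the real Xilinx DMA core register map, and 0x0 and 0x7F000000 are placeholder addresses), the only difference between a normal DMA read and the p2p test is the value written to the destination register:

```python
from dataclasses import dataclass


@dataclass
class DmaRegs:
    """Hypothetical model of the two address registers of the DMA core."""
    source: int = 0
    destination: int = 0


def setup_dma(regs, fpga_addr, dest_bus_addr):
    """Program a transfer from FPGA memory to a destination bus address.

    fpga_addr     -- address inside the FPGA memory to read from
    dest_bus_addr -- bus address the endpoint will write to
    """
    regs.source = fpga_addr
    regs.destination = dest_bus_addr
    return regs


# Normal case: destination is a DMA-prepared system memory address (placeholder).
normal = setup_dma(DmaRegs(), fpga_addr=0x0, dest_bus_addr=0x7F000000)

# p2p test: destination is the physical address of the other endpoint.
p2p = setup_dma(DmaRegs(), fpga_addr=0x0, dest_bus_addr=0xB2000000)
```
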
In more detail:
we have a so-called PCIe universal driver. The idea behind it is:
1. All the PCIe configuration stuff (finding enabled BARs, mapping BARs, the usual read/write and common ioctls such as get slot number, get driver version ...) is implemented in the universal driver and EXPORTed.
2. If some system functions change in a new kernel, we change them only in the universal part (which makes it easy to support a large number of drivers),
so the universal driver is something like a PCIe driver API.
3. The universal driver provides read/write functions, so we have the same device access API for any PCIe device and can use the same user application with any PCIe device.
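
The layering in the list above could be pictured roughly as follows. This is a user-space sketch, not the actual driver: the class names, the read32/write32 methods and the 0x10 register offset are all invented for the example, and a bytearray stands in for a mapped BAR.

```python
class UniversalPcieDevice:
    """Hypothetical stand-in for the common layer: it owns the BAR
    mappings and offers the same read/write calls for every device."""

    def __init__(self, slot, bars):
        self.slot = slot
        self._bars = bars  # {bar_number: bytearray} simulating mapped BARs

    def read32(self, bar, offset):
        """Read a 32-bit little-endian value from a BAR."""
        return int.from_bytes(self._bars[bar][offset:offset + 4], "little")

    def write32(self, bar, offset, value):
        """Write a 32-bit little-endian value to a BAR."""
        self._bars[bar][offset:offset + 4] = value.to_bytes(4, "little")


class TopDriver:
    """Device-specific driver: only device logic, all access goes
    through the common layer."""

    CONTROL_REG = 0x10  # made-up register offset

    def __init__(self, common):
        self.common = common

    def set_control(self, value):
        self.common.write32(0, self.CONTROL_REG, value)


dev = UniversalPcieDevice(slot=3, bars={0: bytearray(64)})
TopDriver(dev).set_control(0xDEADBEEF)
```

Because every top driver goes through the same calls, a kernel API change would only need to be handled in the common class.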

Now, while finding and mapping the BARs, the universal driver keeps the physical address of each PCIe endpoint in internal structures, so any top-level driver can get the physical address
of another PCIe endpoint by slot number.
In my case the physical address returned by get_resource is 0xB2000000. I checked with lspci -H1 -vvvv -s <bus address of the PCIe switch port> (the endpoint is connected to this port, verified with lspci -H1 -t), and the same address (0xB2000000) is shown as the memory behind the bridge.
I want to make p2p writes to offset 0x40000, so I set the DMA destination address to 0xB2400000.
Is something wrong?
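
A quick sanity check of the numbers above, written as a small sketch. Assumptions: the window end 0xB20FFFFF is made up (the message gives only the start of the memory behind the bridge), and the in_window helper is hypothetical. The code adds the offset to the base and tests whether an address falls inside the assumed bridge window:

```python
BAR_BASE = 0xB2000000       # physical address reported for the endpoint
OFFSET = 0x40000            # intended p2p write offset
WINDOW_START = 0xB2000000   # start of the memory behind the bridge (lspci)
WINDOW_END = 0xB20FFFFF     # ASSUMED window end, not given in the message


def in_window(addr, start=WINDOW_START, end=WINDOW_END):
    """Return True if addr lies inside the bridge's memory window."""
    return start <= addr <= end


expected = BAR_BASE + OFFSET
print(hex(expected))                # 0xb2040000
print(hex(0xB2400000 - BAR_BASE))   # 0x400000
```

With these numbers, base plus 0x40000 gives 0xB2040000, whereas 0xB2400000 corresponds to an offset of 0x400000, so it may be worth double-checking which offset was intended and whether the resulting address still lies inside the window that the switch forwards to the downstream port.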

thanks for help
regards

Ludwig

----- Original Message -----
From: "Logan Gunthorpe" <logang@xxxxxxxxxxxx>
To: "David Laight" <David.Laight@xxxxxxxxxx>, "Petrosyan, Ludwig" <ludwig.petrosyan@xxxxxxx>
Cc: "Alexander Deucher" <Alexander.Deucher@xxxxxxx>, "linux-kernel" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-rdma" <linux-rdma@xxxxxxxxxxxxxxx>, "linux-nvdimm" <linux-nvdimm@xxxxxxxxxxxx>, "Linux-media" <Linux-media@xxxxxxxxxxxxxxx>, "dri-devel" <dri-devel@xxxxxxxxxxxxxxxxxxxxx>, "linux-pci" <linux-pci@xxxxxxxxxxxxxxx>, "John Bridgman" <John.Bridgman@xxxxxxx>, "Felix Kuehling" <Felix.Kuehling@xxxxxxx>, "Serguei Sagalovitch" <Serguei.Sagalovitch@xxxxxxx>, "Paul Blinzer" <Paul.Blinzer@xxxxxxx>, "Christian Koenig" <Christian.Koenig@xxxxxxx>, "Suravee Suthikulpanit" <Suravee.Suthikulpanit@xxxxxxx>, "Ben Sander" <ben.sander@xxxxxxx>
Sent: Tuesday, 24 October, 2017 00:04:26
Subject: Re: Enabling peer to peer device transactions for PCIe devices

On 23/10/17 10:08 AM, David Laight wrote:
> It is also worth checking that the hardware actually supports p2p transfers.
> Writes are more likely to be supported than reads.
> ISTR that some intel cpus support some p2p writes, but there could easily
> be errata against them.

Ludwig mentioned a PCIe switch. The few switches I'm aware of support
P2P transfers. So if everything is set up correctly, the TLPs shouldn't
even touch the CPU.

But, yes, generally it's a good idea to start with writes and see if
they work first.

Logan