Re: [RFC PATCH 00/28] Removing struct page from P2PDMA

From: Logan Gunthorpe
Date: Tue Jun 25 2019 - 11:58:20 EST




On 2019-06-25 1:20 a.m., Christoph Hellwig wrote:
> On Mon, Jun 24, 2019 at 10:07:56AM -0600, Logan Gunthorpe wrote:
>>> For one passing a dma_addr_t through the block layer is a layering
>>> violation, and one that I think will also bite us in practice.
>>> The host physical to PCIe bus address mapping can have offsets, and
>>> those offsets absolutely can be different for differnet root ports.
>>> So with your caller generated dma_addr_t everything works fine with
>>> a switched setup as the one you are probably testing on, but on a
>>> sufficiently complicated setup with multiple root ports it can break.
>>
>> I don't follow this argument. Yes, I understand PCI Bus offsets and yes
>> I understand that they only apply beyond the bus they're working with.
>> But this isn't *that* complicated and it should be the responsibility of
>> the P2PDMA code to sort out and provide a dma_addr_t for. The dma_addr_t
>> that's passed through the block layer could be a bus address or it could
>> be the result of a dma_map_* request (if the transaction is found to go
>> through an RC) depending on the requirements of the devices being used.
>
> You assume all addressing is done by the PCI bus address. If a device
> is addressing its own BAR there is no reason to use the PCI bus address,
> as it might have much more intelligent schemes (usually bar + offset).

Yes, that will be a bit tricky regardless of what we do.

>>> Also duplicating the whole block I/O stack, including hooks all over
>>> the fast path is pretty much a no-go.
>>
>> There was very little duplicate code in the patch set. (Really just the
>> mapping code). There are a few hooks, but in practice not that many if
>> we ignore the WARN_ONs. We might be able to work to reduce this further.
>> The main hooks are: when we skip bouncing, when we skip integrity prep,
>> when we split, and when we map. And the patchset drops the PCI_P2PDMA
>> hook when we map. So we're talking about maybe three or four extra ifs
>> that would likely normally be fast due to the branch predictor.
>
> And all of those add code to the block layer fast path.

If we can't add any ifs to the block layer, there's really nothing we
can do.

So then we're committed to using struct page for P2P?

Logan