Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating

From: Jason Gunthorpe
Date: Sun Jan 21 2018 - 15:40:43 EST


> Hmm, this is actually consistent with the example below [1].
>
> AIU from the example, it seems that the dma_wmb/dma_rmb barriers are good
> for synchronizing cpu/device accesses to the "Streaming DMA mapped" buffers
> (the descriptors, went through the dma_map_page() API), but not for the
> doorbell (a coherent memory, typically allocated via dma_alloc_coherent)
> that requires using the stronger wmb() barrier.

If x86 truly requires a wmb() (aka SFENCE) here then the userspace
RDMA stuff is broken too, and that has been tested to death at this
point.

I looked into this at one point and I thought I concluded that x86 did
not require an SFENCE between a posted PCI write and writes to system
memory to guarantee order with respect to the PCI device?

Well, so long as non-temporal stores and other specialty accesses are
not being used. Is there a chance a fancy SSE-optimized memcpy or
memset, crypto or something is involved here?

However, Documentation/memory-barriers.txt does seem pretty clear that
the kernel definition of wmb() makes it required here, even if it
might be overkill for x86?
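
For reference, the ordering being discussed can be sketched in plain
userspace C. This is only an analogy under stated assumptions, not the
mlx4_en code: struct rx_desc, ring, prod_db, post_rx_desc() and
consume_rx_desc() are hypothetical names, the doorbell is modelled as an
ordinary atomic variable rather than coherent or MMIO memory, and a C11
fence stands in for the kernel barrier (C11 fences say nothing about
device-visible ordering).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct rx_desc {
	uint64_t addr;
	uint32_t byte_count;
};

#define RING_SIZE 8u

static struct rx_desc ring[RING_SIZE];	/* stands in for the descriptor ring */
static _Atomic uint32_t prod_db;	/* stands in for the producer doorbell */

/* Producer: fill the descriptor, then publish the new producer index. */
static void post_rx_desc(uint32_t prod, uint64_t addr, uint32_t len)
{
	struct rx_desc *d = &ring[prod % RING_SIZE];

	d->addr = addr;
	d->byte_count = len;

	/*
	 * Stand-in for the barrier under discussion: the descriptor
	 * stores above must be visible before the doorbell store below.
	 */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&prod_db, prod + 1, memory_order_relaxed);
}

/* Consumer (the "device"): only reads slots below the producer index. */
static struct rx_desc consume_rx_desc(uint32_t cons)
{
	uint32_t prod = atomic_load_explicit(&prod_db, memory_order_relaxed);

	assert((int32_t)(prod - cons) > 0);

	/* Pairs with the release fence in post_rx_desc(). */
	atomic_thread_fence(memory_order_acquire);

	return ring[cons % RING_SIZE];
}
```

Without the fence in post_rx_desc(), the consumer could in principle
observe the new producer index while still reading a stale descriptor,
which is the failure the patch is trying to rule out.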

Jason