Re: [PATCH net-next v2 2/2] page_pool: optimize the cpu sync operation when DMA mapping

From: Ilias Apalodimas
Date: Tue Aug 24 2021 - 05:05:22 EST


Hi Yunsheng,

+cc Lorenzo, who has done some tests on non-coherent platforms

On Tue, 24 Aug 2021 at 10:00, Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>
> On 2021/8/23 20:42, Ilias Apalodimas wrote:
> > On Mon, Aug 23, 2021 at 11:56:48AM +0800, Yunsheng Lin wrote:
> >> On 2021/8/20 17:39, Ilias Apalodimas wrote:
> >>> On Fri, Aug 20, 2021 at 02:56:51PM +0800, Yunsheng Lin wrote:
>
> [..]
> >>
> >> https://elixir.bootlin.com/linux/latest/source/kernel/dma/direct.h#L104
> >>
> >> The one thing I am not sure about is that the pool->p.offset
> >> and pool->p.max_len are used to decide the sync range before this
> >> patch, while the sync range is the same as the map range when doing
> >> the sync in dma_map_page_attrs().
> >
> > I am not sure I am following here. We always sync the entire range as well
> > in the current code as the mapping function is called with max_len.
> >
> >>
> >> I assumed the above is not an issue? only sync more than we need?
> >> and it won't hurt the performance?
> >
> > We can sync more than we need, but if it's a non-coherent architecture,
> > there's a performance penalty.
>
> Since I do not have any performance data to prove if there is a
> performance penalty for non-coherent architecture, I will drop it:)

I am pretty sure syncing more than we need does affect performance on
non-coherent platforms. Unless I am missing something, the patch simply
re-arranges calls to avoid calling dma_map_page_attrs(), right?
However, dma_map_page_attrs() is called with DMA_ATTR_SKIP_CPU_SYNC, so
it won't do anything sync-related, and I doubt calling it makes any
measurable difference. If it does, we should pick it up.
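
To make the trade-off concrete, here is a rough userspace model of the
two orderings being discussed. It is only a sketch: every model_* name
is hypothetical and none of this is kernel code. It assumes the mapped
length can be larger than max_len (which is the open question above)
and counts how many bytes of cache maintenance each ordering would
perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_ATTR_SKIP_CPU_SYNC 0x1u

struct model_dev {
	int coherent;        /* non-zero: cache maintenance is a no-op */
	size_t bytes_synced; /* running total of bytes synced for the device */
};

struct model_pool {
	struct model_dev *dev;
	size_t map_len; /* stands in for the mapped length */
	size_t offset;  /* stands in for pool->p.offset */
	size_t max_len; /* stands in for pool->p.max_len */
};

/* Models a sync-for-device call: only costs something if non-coherent. */
static void model_sync_for_device(struct model_dev *dev, size_t len)
{
	if (!dev->coherent)
		dev->bytes_synced += len;
}

/* Models a map call: syncs the whole mapped range unless told to skip. */
static void model_map(struct model_dev *dev, size_t len, unsigned int attrs)
{
	if (!(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		model_sync_for_device(dev, len);
}

/* Ordering A: map with the skip attribute, then sync only max_len bytes. */
static size_t model_map_then_sync_range(struct model_pool *pool)
{
	size_t before = pool->dev->bytes_synced;

	assert(pool->offset + pool->max_len <= pool->map_len);
	model_map(pool->dev, pool->map_len, MODEL_ATTR_SKIP_CPU_SYNC);
	model_sync_for_device(pool->dev, pool->max_len);
	return pool->dev->bytes_synced - before;
}

/* Ordering B: drop the skip attribute and let the map call do the sync. */
static size_t model_map_with_sync(struct model_pool *pool)
{
	size_t before = pool->dev->bytes_synced;

	model_map(pool->dev, pool->map_len, 0);
	return pool->dev->bytes_synced - before;
}
```

In this model the two orderings cost the same when max_len equals the
mapped length, and on a coherent device both cost nothing. They only
diverge on a non-coherent device with max_len smaller than the mapped
length, which is the case that would need real measurements.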


Regards
/Ilias