RE: pcm|dmaengine|imx-sdma race condition on i.MX6

From: Robin Gong
Date: Thu Aug 20 2020 - 11:01:56 EST


On 2020/08/19 22:26 Benjamin Bara - SKIDATA <Benjamin.Bara@xxxxxxxxxxx> wrote:
> > -----Original Message-----
> > From: Lars-Peter Clausen <lars@xxxxxxxxxx>
> > Sent: Mittwoch, 19. August 2020 13:08
> > I think this might be an sdma specific problem after all.
> > dmaengine_terminate_async() will issue a request to stop the DMA. But
> > it is still safe to issue the next transfer, even without calling
> > dmaengine_synchronize(). The DMA should start the new transfer at its
> > earliest convenience in that case.
> >
> > dmaegine_synchronize() is so that the consumer has a guarantee that
> > the DMA is finished using the resources (e.g. the memory buffers)
> > associated with the DMA transfer so it can safely free them.
>
> Thank you for the clarifications!
>
> > I don't know how feasible this is to implement in the SDMA dmaengine
> > driver. But I think what is should do is to have some flag to indicate
> > if a terminate is in progress. If a new transfer is issued while
> > terminate is in progress the transfer should go on a list. Once
> > terminate finishes it should check the list and start the transfer if
> > there are any on the list.
>
> IMHO that's nearly what Robin's patches does, so this should be sufficient:
> Putting the descriptors to free in an extra termination list and ensuring that a
> new transfer is handled after the last termination is done.
Hello Benjamin,
It seems Lars's list is not the extra termination list in my patch.
Instead, he means the next descriptor should be moved in another pending list
if the last channel terminated not done yet and restart to handle the pending list
after the last channel terminated done if the list is not empty.

I like the idea, but on SDMA that race condition may never happen I think, because
that once the next descriptor got during usleep_range in sdma_channel_terminate_work,
means the last channel stopped indeed. I have mentioned before:
' because load context(sdma_load_context) done by channel0 which is the
lowest priority. In other words, calling successfully dmaengine_prep_* in the
next transfer means new normal transfer without any last terminated transfer impact '
normal data transfer on dma non-channel0, such as channel1,2...etc which has higher
priority than channel0, so if channel0 get to run to load context means the 'potential transfer' on the last terminated channel have been done. So no need to move list since
it's no different with normal transfer although sdma_channel_terminate_work may get
to run later to free the last descriptor(only software impact which my patch fix).
Transfer on sdma channel will be split into many small pieces of transfer
(Water-Mark-Level size). Once dma request event acknowledged and scheduled by sdma
core, this sdma channel will start to transfer WML size of data and then schedule out to
other channel. Now the 'potential transfer' alive on terminated channel only happen in
the below two things come out in the same time:
[1].The channel is running to transfer Water-Mark-Level size (sdma side).
[2].The channel is terminated by sdma_disable_channel at the same time(arm side).
Even if it happen, the 'potential transfer' on sdma is very short, because fetching/filling
fifo in WML size(fifo size or half fifo size) is very quick. After that, this channel is
terminated at HW level. Here 1ms from design team just for big safe I think. So I don't
think that's a big deal if memory buffer is available and descriptor are taken care to
no impact on the next transfer as my patch did.

>
>
> @Robin:
> Is it possible to tag the commits for the stable-tree
> Cc: stable@xxxxxxxxxxxxxxx?
Could my patch work in your side? If yes, I will add
Cc: stable@xxxxxxxxxxxxxxx

>
> Best regards and thank you all!
> Benjamin
> Richard