RE: pcm|dmaengine|imx-sdma race condition on i.MX6

From: Robin Gong
Date: Tue Aug 18 2020 - 06:41:42 EST


On 2020/08/17 19:38 Benjamin Bara - SKIDATA <Benjamin.Bara@xxxxxxxxxxx> wrote:
> > -----Original Message-----
> > From: Robin Gong <yibin.gong@xxxxxxx>
> > Sent: Montag, 17. August 2020 11:23
> > busy_wait is not good I think, would you please have a try with the
> > attached patch which is based on
> > https://lkml.org/lkml/2020/8/11/111? The basic idea is to keep the
> > freed descriptor into another list for freeing in later
> > terminate_worker instead of freeing directly all in terminate_worker
> > by vchan_get_all_descriptors which may break next descriptor coming
> > soon
>
> The idea sounds good, but with this attempt we are still not sure that the 1ms
> (the ultimate reason why this is a problem) is awaited between DMA disabling
> and re-enabling.
The original 1ms delay is there to ensure the sdma channel has really stopped; otherwise
sdma may still access an IP's fifo (uart/sai...) during the last Water-Mark-Level size
transfer. In the worst case, some IPs such as uart may cause sdma to hang after
UCR2_RXEN/UCR2_TXEN are disabled ("Timeout waiting for CH0 ready" would be caught).
So I would suggest synchronizing dma after the channel is terminated. But given the PCM
system's limitation, there seems to be no choice but to terminate async. If sdma can
access the audio fifo without hanging after the PCM driver terminates the dma channel,
and the rx/tx data buffers are still valid, maybe the 1ms is not a must, because sdma
would only touch garbage data harmlessly and the PCM driver would ignore it.
The current sdma driver with my patches ensures the following:
-- The last terminated transfer is stopped before the next quick transfer starts,
because the context load (sdma_load_context) is done by channel0, which has the
lowest priority. In other words, a successful dmaengine_prep_* call in the
next transfer means a new normal transfer without any impact from the last
terminated transfer.
-- No interrupt arriving after terminate can be handled before the next transfer
starts, because 'sdmac->desc' has been set to NULL in sdma_terminate_all.
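
To make the two points above concrete, here is a small user-space model of the idea. It is only a sketch and not the imx-sdma code: the model_* types and functions are hypothetical names made up for illustration, and locking and the real virt-dma lists are left out. Terminate parks the active descriptor on a separate list and clears the active pointer, a late interrupt is ignored because no descriptor is active, and the deferred worker frees only what was parked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space model only: these are NOT the imx-sdma structures. */
struct model_desc {
	struct model_desc *next;	/* link in the terminated list */
	int id;
};

struct model_chan {
	struct model_desc *desc;	/* active descriptor, NULL once terminated */
	struct model_desc *terminated;	/* parked until the deferred free */
	int handled_irqs;		/* interrupts that reached a descriptor */
};

/* Start a new transfer: allocate a descriptor and make it the active one. */
static bool model_start(struct model_chan *c, int id)
{
	struct model_desc *d;

	assert(c != NULL);
	d = calloc(1, sizeof(*d));
	if (!d)
		return false;
	d->id = id;
	c->desc = d;
	return true;
}

/* Atomic-context terminate: nothing is freed here, the descriptor is parked. */
static void model_terminate(struct model_chan *c)
{
	struct model_desc *d = c->desc;

	if (!d)
		return;
	d->next = c->terminated;
	c->terminated = d;
	c->desc = NULL;		/* later interrupts for this transfer are ignored */
}

/* Interrupt handler: returns true only if an active descriptor was served. */
static bool model_irq(struct model_chan *c)
{
	if (!c->desc)
		return false;
	c->handled_irqs++;
	return true;
}

/* Deferred worker: frees only parked descriptors and returns how many. */
static int model_terminate_worker(struct model_chan *c)
{
	int freed = 0;

	while (c->terminated) {
		struct model_desc *d = c->terminated;

		c->terminated = d->next;
		free(d);
		freed++;
	}
	return freed;
}
```

In this model a new transfer can be started between model_terminate() and model_terminate_worker() without the worker touching the new descriptor, which is the property the separate list is meant to give.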

>
> If we are allowed to leave the atomic PCM context on each trigger, synchronize
> the DMA and then enter it back again, everything is fine.
> This might be the most performant and elegant solution.
> However, since we are in an atomic context for a reason, it might not be
> wanted by the PCM system that the DMA termination completion of the
> previous context happens within the next call, but we are not sure about that.
> In this case, a busy wait is not a good solution, but a necessary one, or at least
> the only valid solution we are aware of.
>
> Anyhow, based on my understanding, either the start or the stop trigger has to
> wait the 1ms (or what's left of it).
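
As a rough illustration of "what's left of it": if the time of the channel disable is recorded, the next enable only has to wait for the remainder of the 1ms. The helper below is a hypothetical user-space sketch, not an existing kernel function; MODEL_SETTLE_US and model_remaining_wait_us are assumed names, and both timestamps are assumed to come from the same monotonic clock in microseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the 1ms settle time from the thread, in us. */
#define MODEL_SETTLE_US 1000u

/*
 * Hypothetical helper, not a kernel API: given the time the channel was
 * disabled and the current time, return how long the next enable still
 * has to wait. Returns 0 once the settle time has already passed.
 */
static uint64_t model_remaining_wait_us(uint64_t stop_us, uint64_t now_us)
{
	uint64_t elapsed;

	assert(now_us >= stop_us);
	elapsed = now_us - stop_us;
	return elapsed >= MODEL_SETTLE_US ? 0 : MODEL_SETTLE_US - elapsed;
}
```

With this, a start trigger that comes 400us after the stop would only need to wait another 600us, and one that comes later than 1ms would not wait at all. Whether such a wait is acceptable inside the atomic trigger is exactly the open question above.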