Re: [PATCH v6 5/8] fsl-dma: change release process of dmadescriptor for supporting async_tx

From: Ira W. Snyder
Date: Mon Aug 06 2012 - 13:51:27 EST


On Mon, Aug 06, 2012 at 06:14:33PM +0800, qiang.liu@xxxxxxxxxxxxx wrote:
> From: Qiang Liu <qiang.liu@xxxxxxxxxxxxx>
>
> Fix the potential risk when enable config NET_DMA and ASYNC_TX.
> Async_tx is lack of support in current release process of dma descriptor,
> all descriptors will be released whatever is acked or no-acked by async_tx,
> so there is a potential race condition when dma engine is uesd by others
> clients (e.g. when enable NET_DMA to offload TCP).
>
> In our case, a race condition which is raised when use both of talitos
> and dmaengine to offload xor is because napi scheduler will sync all
> pending requests in dma channels, it affects the process of raid operations
> due to ack_tx is not checked in fsl dma. The no-acked descriptor is freed
> which is submitted just now, as a dependent tx, this freed descriptor trigger
> BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().
>
> TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
> GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4 00000000 00000001
> GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4 ed576d98 00000000
> GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000 ed3015e8 c15a7aa0
> GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0 ef640c30 ecf41ca0
> NIP [c02b048c] async_tx_submit+0x6c/0x2b4
> LR [c02b068c] async_tx_submit+0x26c/0x2b4
> Call Trace:
> [ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable)
> [ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
> [ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
> [ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
> [ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
> [ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
> [ecf41f40] [c04329b8] md_thread+0x138/0x16c
> [ecf41f90] [c008277c] kthread+0x8c/0x90
> [ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
>
> Another modification in this patch is the change of completed descriptors,
> there is a potential risk which caused by exception interrupt, all descriptors
> in ld_running list are seemed completed when an interrupt raised, it works fine
> under normal condition, but if there is an exception occured, it cannot work
> as our excepted. Hardware should not be depend on s/w list, the right way is
> to read current descriptor address register to find the last completed
> descriptor. If an interrupt is raised by an error, all descriptors in ld_running
> should not be seemed finished, or these unfinished descriptors in ld_running
> will be released wrongly.
>
> A simple way to reproduce,
> Enable dmatest first, then insert some bad descriptors which can trigger
> Programming Error interrupts before the good descriptors. Last, the good
> descriptors will be freed before they are processsed because of the exception
> intrerrupt.
>
> Note: the bad descriptors are only for simulating an exception interrupt.
> This case can illustrate the potential risk in current fsl-dma very well.
>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Vinod Koul <vinod.koul@xxxxxxxxx>
> Cc: Li Yang <leoli@xxxxxxxxxxxxx>
> Signed-off-by: Qiang Liu <qiang.liu@xxxxxxxxxxxxx>
> Signed-off-by: Ira W. Snyder <iws@xxxxxxxxxxxxxxxx>

There are two minor nitpicks below. Other than that, the patch looks
excellent to me.

Ira

> ---
> drivers/dma/fsldma.c | 232 ++++++++++++++++++++++++++++++++++----------------
> drivers/dma/fsldma.h | 17 +++-
> 2 files changed, 174 insertions(+), 75 deletions(-)
>
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> index 36490a3..938d8c1 100644
> --- a/drivers/dma/fsldma.c
> +++ b/drivers/dma/fsldma.c
> @@ -472,6 +472,110 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct fsldma_chan *chan)
> }
>
> /**
> + * fsldma_clean_completed_descriptor - free all descriptors which
> + * has been completed and acked
> + * @chan: Freescale DMA channel
> + *
> + * This function is used on all completed and acked descriptors.
> + * All descriptors should only be freed in this function.
> + */
> +static void
> +fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
> +{
> + struct fsl_desc_sw *desc, *_desc;
> +
> + /* Run the callback for each descriptor, in order */
> + list_for_each_entry_safe(desc, _desc, &chan->ld_completed, node)
> + if (async_tx_test_ack(&desc->async_tx))
> + fsl_dma_free_descriptor(chan, desc);
> +}
> +
> +/**
> + * fsldma_run_tx_complete_actions - cleanup a single link descriptor
> + * @chan: Freescale DMA channel
> + * @desc: descriptor to cleanup and free
> + * @cookie: Freescale DMA transaction identifier
> + *
> + * This function is used on a descriptor which has been executed by the DMA
> + * controller. It will run any callbacks, submit any dependencies.
> + */
> +static dma_cookie_t fsldma_run_tx_complete_actions(struct fsldma_chan *chan,
> + struct fsl_desc_sw *desc, dma_cookie_t cookie)
> +{
> + struct dma_async_tx_descriptor *txd = &desc->async_tx;
> + struct device *dev = chan->common.device->dev;
> + dma_addr_t src = get_desc_src(chan, desc);
> + dma_addr_t dst = get_desc_dst(chan, desc);
> + u32 len = get_desc_cnt(chan, desc);
> +
> + BUG_ON(txd->cookie < 0);
> +
> + if (txd->cookie > 0) {
> + cookie = txd->cookie;
> +
> + /* Run the link descriptor callback function */
> + if (txd->callback) {
> +#ifdef FSL_DMA_LD_DEBUG
> + chan_dbg(chan, "LD %p callback\n", desc);
> +#endif
> + txd->callback(txd->callback_param);
> + }
> +
> + /* Unmap the dst buffer, if requested */
> + if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> + if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> + dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
> + else
> + dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
> + }
> +
> + /* Unmap the src buffer, if requested */
> + if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> + if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> + dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
> + else
> + dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
> + }
> + }
> +
> + /* Run any dependencies */
> + dma_run_dependencies(txd);
> +
> + return cookie;
> +}
> +
> +/**
> + * fsldma_clean_running_descriptor - move the completed descriptor from
> + * ld_running to ld_completed
> + * @chan: Freescale DMA channel
> + * @desc: the descriptor which is completed
> + *
> + * Free the descriptor directly if acked by async_tx api, or move it to
> + * queue ld_completed.
> + */
> +static void
> +fsldma_clean_running_descriptor(struct fsldma_chan *chan,
> + struct fsl_desc_sw *desc)
> +{
> + /* Remove from the list of transactions */
> + list_del(&desc->node);

Minor nitpick. Add a blank line here.

> + /*
> + * the client is allowed to attach dependent operations
> + * until 'ack' is set
> + */
> + if (!async_tx_test_ack(&desc->async_tx)) {
> + /*
> + * Move this descriptor to the list of descriptors which is
> + * completed, but still awaiting the 'ack' bit to be set.
> + */
> + list_add_tail(&desc->node, &chan->ld_completed);
> + return;
> + }
> +
> + dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
> +}
> +
> +/**
> * fsl_chan_xfer_ld_queue - transfer any pending transactions
> * @chan : Freescale DMA channel
> *
> @@ -539,51 +643,58 @@ static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
> }
>
> /**
> - * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
> + * fsldma_cleanup_descriptors - cleanup link descriptors which are completed
> + * and move them to ld_completed to free until flag 'ack' is set
> * @chan: Freescale DMA channel
> - * @desc: descriptor to cleanup and free
> *
> - * This function is used on a descriptor which has been executed by the DMA
> - * controller. It will run any callbacks, submit any dependencies, and then
> - * free the descriptor.
> + * This function is used on descriptors which have been executed by the DMA
> + * controller. It will run any callbacks, submit any dependencies, then
> + * free these descriptors if flag 'ack' is set.
> */
> -static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
> - struct fsl_desc_sw *desc)
> +static void fsldma_cleanup_descriptors(struct fsldma_chan *chan)
> {
> - struct dma_async_tx_descriptor *txd = &desc->async_tx;
> - struct device *dev = chan->common.device->dev;
> - dma_addr_t src = get_desc_src(chan, desc);
> - dma_addr_t dst = get_desc_dst(chan, desc);
> - u32 len = get_desc_cnt(chan, desc);
> + struct fsl_desc_sw *desc, *_desc;
> + dma_cookie_t cookie = 0;
> + dma_addr_t curr_phys = get_cdar(chan);
> + int seen_current = 0;
>
> - /* Run the link descriptor callback function */
> - if (txd->callback) {
> -#ifdef FSL_DMA_LD_DEBUG
> - chan_dbg(chan, "LD %p callback\n", desc);
> -#endif
> - txd->callback(txd->callback_param);
> - }
> + fsldma_clean_completed_descriptor(chan);
>
> - /* Run any dependencies */
> - dma_run_dependencies(txd);
> + /* Run the callback for each descriptor, in order */
> + list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
> + /*
> + * do not advance past the current descriptor loaded into the
> + * hardware channel, subsequent descriptors are either in
> + * process or have not been submitted
> + */
> + if (seen_current)
> + break;
>
> - /* Unmap the dst buffer, if requested */
> - if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
> - if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
> - dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
> - else
> - dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
> - }
> + /*
> + * stop the search if we reach the current descriptor and the
> + * channel is busy
> + */
> + if (desc->async_tx.phys == curr_phys) {
> + seen_current = 1;
> + if (!dma_is_idle(chan))
> + break;
> + }

I wonder if this is better:

if (desc->async_tx.phys == get_cdar(chan)) {
seen_current = 1;
if (!dma_is_idle(chan))
break;
}

Your version works fine, it just might stop earlier than necessary. This
is just a nitpick.

>
> - /* Unmap the src buffer, if requested */
> - if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
> - if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
> - dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
> - else
> - dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
> + cookie = fsldma_run_tx_complete_actions(chan, desc, cookie);
> +
> + fsldma_clean_running_descriptor(chan, desc);
> }
>
> - fsl_dma_free_descriptor(chan, desc);
> + /*
> + * Start any pending transactions automatically
> + *
> + * In the ideal case, we keep the DMA controller busy while we go
> + * ahead and free the descriptors below.
> + */
> + fsl_chan_xfer_ld_queue(chan);
> +
> + if (cookie > 0)
> + chan->common.completed_cookie = cookie;
> }
>
> /**
> @@ -654,8 +765,10 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
>
> chan_dbg(chan, "free all channel resources\n");
> spin_lock_irqsave(&chan->desc_lock, flags);
> + fsldma_cleanup_descriptors(chan);
> fsldma_free_desc_list(chan, &chan->ld_pending);
> fsldma_free_desc_list(chan, &chan->ld_running);
> + fsldma_free_desc_list(chan, &chan->ld_completed);
> spin_unlock_irqrestore(&chan->desc_lock, flags);
>
> dma_pool_destroy(chan->desc_pool);
> @@ -893,6 +1006,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
> /* Remove and free all of the descriptors in the LD queue */
> fsldma_free_desc_list(chan, &chan->ld_pending);
> fsldma_free_desc_list(chan, &chan->ld_running);
> + fsldma_free_desc_list(chan, &chan->ld_completed);
> chan->idle = true;
>
> spin_unlock_irqrestore(&chan->desc_lock, flags);
> @@ -956,11 +1070,15 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
> enum dma_status ret;
> unsigned long flags;
>
> - spin_lock_irqsave(&chan->desc_lock, flags);
> ret = dma_cookie_status(dchan, cookie, txstate);
> + if (ret == DMA_SUCCESS)
> + return ret;
> +
> + spin_lock_irqsave(&chan->desc_lock, flags);
> + fsldma_cleanup_descriptors(chan);
> spin_unlock_irqrestore(&chan->desc_lock, flags);
>
> - return ret;
> + return dma_cookie_status(dchan, cookie, txstate);
> }
>
> /*----------------------------------------------------------------------------*/
> @@ -1037,52 +1155,19 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
> static void dma_do_tasklet(unsigned long data)
> {
> struct fsldma_chan *chan = (struct fsldma_chan *)data;
> - struct fsl_desc_sw *desc, *_desc;
> - LIST_HEAD(ld_cleanup);
> unsigned long flags;
>
> chan_dbg(chan, "tasklet entry\n");
>
> spin_lock_irqsave(&chan->desc_lock, flags);
>
> - /* update the cookie if we have some descriptors to cleanup */
> - if (!list_empty(&chan->ld_running)) {
> - dma_cookie_t cookie;
> -
> - desc = to_fsl_desc(chan->ld_running.prev);
> - cookie = desc->async_tx.cookie;
> - dma_cookie_complete(&desc->async_tx);
> -
> - chan_dbg(chan, "completed_cookie=%d\n", cookie);
> - }
> -
> - /*
> - * move the descriptors to a temporary list so we can drop the lock
> - * during the entire cleanup operation
> - */
> - list_splice_tail_init(&chan->ld_running, &ld_cleanup);
> -
> /* the hardware is now idle and ready for more */
> chan->idle = true;
>
> - /*
> - * Start any pending transactions automatically
> - *
> - * In the ideal case, we keep the DMA controller busy while we go
> - * ahead and free the descriptors below.
> - */
> - fsl_chan_xfer_ld_queue(chan);
> - spin_unlock_irqrestore(&chan->desc_lock, flags);
> -
> - /* Run the callback for each descriptor, in order */
> - list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
> + /* Run all cleanup for descriptors which have been completed */
> + fsldma_cleanup_descriptors(chan);
>
> - /* Remove from the list of transactions */
> - list_del(&desc->node);
> -
> - /* Run all cleanup for this descriptor */
> - fsldma_cleanup_descriptor(chan, desc);
> - }
> + spin_unlock_irqrestore(&chan->desc_lock, flags);
>
> chan_dbg(chan, "tasklet exit\n");
> }
> @@ -1264,6 +1349,7 @@ static int __devinit fsl_dma_chan_probe(struct fsldma_device *fdev,
> spin_lock_init(&chan->desc_lock);
> INIT_LIST_HEAD(&chan->ld_pending);
> INIT_LIST_HEAD(&chan->ld_running);
> + INIT_LIST_HEAD(&chan->ld_completed);
> chan->idle = true;
>
> chan->common.device = &fdev->common;
> diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
> index f5c3879..a58275a 100644
> --- a/drivers/dma/fsldma.h
> +++ b/drivers/dma/fsldma.h
> @@ -138,8 +138,21 @@ struct fsldma_chan {
> char name[8]; /* Channel name */
> struct fsldma_chan_regs __iomem *regs;
> spinlock_t desc_lock; /* Descriptor operation lock */
> - struct list_head ld_pending; /* Link descriptors queue */
> - struct list_head ld_running; /* Link descriptors queue */
> + /*
> + * Descriptors which are queued to run, but have not yet been
> + * submitted to the hardware for execution
> + */
> + struct list_head ld_pending;
> + /*
> + * Descriptors which are currently being executed by the hardware
> + */
> + struct list_head ld_running;
> + /*
> + * Descriptors which have finished execution by the hardware. These
> + * descriptors have already had their cleanup actions run. They are
> + * waiting for the ACK bit to be set by the async_tx API.
> + */
> + struct list_head ld_completed; /* Link descriptors queue */
> struct dma_chan common; /* DMA common channel */
> struct dma_pool *desc_pool; /* Descriptors pool */
> struct device *dev; /* Channel device */
> --
> 1.7.5.1
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@xxxxxxxxxxxxxxxx
> https://lists.ozlabs.org/listinfo/linuxppc-dev
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/