Re: [PATCH v4 0/3] dma-mapping, powerpc, nvme: introduce the DMA_ATTR_NO_WARN attribute

From: Mauricio Faria de Oliveira
Date: Thu Aug 04 2016 - 20:18:03 EST


Andrew,

On 08/04/2016 07:01 PM, Andrew Morton wrote:
It would help to have seen an example of the error message - please
always quote such things when fixing bugs.

Indeed; okay.

The error messages are several blocks like this one:

ppc_iommu_map_sg: 11784 callbacks suppressed
nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr c000018faa7b0000 npages 16
nvme 0001:01:00.0: iommu_alloc failed, tbl c00001965c5ca400 vaddr c000018faa9b0000 npages 16
<repeat>

I assume the warnings are coming via nvme_map_data()'s call to
blk_rq_map_sg()? [snip]

If I understand the point in the question correctly -- actually not,
the warnings are coming via:

nvme_map_data()
-> dma_map_sg[_attrs]()
-> dma_map_ops.map_sg()
(dma_map_ops = dma_iommu_ops @ arch/powerpc/kernel/iommu.c)
-> dma_iommu_map_sg()
-> ppc_iommu_map_sg() /* as seen above */

And from what I could observe, the blk_rq_map_sg() path doesn't end
up in there.

[snip] An alternative (and more idiomatic) fix would be to
change the blk_rq_map_sg() interface to permit passing down some
foo_NOWARN flag and propagating that down the stack into
ppc_iommu_map_sg(). Was this approach evaluated? I suspect it might
be messy.

I see; I haven't evaluated that, but agree with you it might be messy.

As far as I can see, in order to pass something to blk_rq_map_sg() and
have it eventually make into ppc_iommu_map_sg(), that something should
be present in the scatterlist -- which seems to be what's common/passed
to both blk_rq_map_sg() (the interface point proposed) and dma_map_sg() (which is the function which reaches ppc_iommu_map_sg() down the chain).

It seems a bit hidden, and (if I got the suggestion right), it doesn't
seem to be in the scope of scatterlist to contain such a flag.

One point of the patches is make the attribute visible/explicit; I see
it can be inconvenient sometimes, but it allows for a clear / evident
difference between dma_map_sg() calls which are (not) OK with failures.

(for example, the 2 calls in nvme_map_data() - they can return either
BLK_MQ_RQ_QUEUE_BUSY or BLK_MQ_RQ_QUEUE_ERROR - so the former is OK.)

Does that make sense?

Thanks for the review.

--
Mauricio Faria de Oliveira
IBM Linux Technology Center