Re: BUG at drivers/iommu/amd_iommu.c:1436!

From: Mark Hounschell
Date: Tue Nov 22 2016 - 14:54:13 EST


On 11/22/2016 03:45 AM, Joerg Roedel wrote:
> On Mon, Nov 21, 2016 at 04:47:59PM -0500, Mark Hounschell wrote:
>> OK, I did get this message before the reported BUG message.
>>
>> gpiohsd gpiohsd: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0xffffffffffee8000] [size=8192 bytes]
>>
>> But I've verified that the dma_addr_t that I get for the alloc, and
>> also use for the free is 0x00000000ffee8000 in this case? Is device
>> "address=0xffffffffffee8000" in that message a bug in the message or
>> do we have a sign extended address problem? It seems strange to me,
>> I've never seen a dma_addr_t given, when using the iommu, that
>> high. In the past I've seen them as usually 0x00xxxxxx?
>>
>> I have also verified that simply changing from
>> pci_alloc/free_consistent to the newer DMA API fixes my issue and I
>> get no such messages.

> Yes, this looks like a sign-extension bug somewhere. But it's not in the
> amd-iommu driver, because dma-debug also sees it. And from what I can
> tell the dma-api interface seems to be fine. It consistently uses
> dma_addr_t to pass these values around.
>
> Where can I find the source of the failing code? I need exactly the code
> version that triggers the problem.
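
For reference, here is a minimal user-space sketch of the kind of sign
extension suspected above. It only illustrates how 0xffee8000 can turn
into 0xffffffffffee8000; where (or whether) such a conversion happens in
the failing code is not known. The type dma_addr64_t and both helper
names are hypothetical stand-ins for illustration, not kernel API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit dma_addr_t; not the kernel type. */
typedef uint64_t dma_addr64_t;

/*
 * Suspected failure mode: the 32-bit handle passes through a signed
 * 32-bit type before being widened. Any value with bit 31 set becomes
 * negative, and widening a negative value fills the upper 32 bits
 * with ones. (The unsigned-to-signed cast is implementation-defined
 * in C; gcc and clang wrap modulo 2^32.)
 */
dma_addr64_t widen_via_signed(uint32_t handle)
{
    int32_t as_signed = (int32_t)handle;
    return (dma_addr64_t)as_signed;
}

/*
 * Expected behaviour: widening an unsigned 32-bit value zero-extends,
 * so the upper 32 bits stay clear.
 */
dma_addr64_t widen_via_unsigned(uint32_t handle)
{
    return (dma_addr64_t)handle;
}
```

With the handle from the report, widen_via_signed(0xffee8000) gives
0xffffffffffee8000 while widen_via_unsigned(0xffee8000) gives
0x00000000ffee8000. Handles below 0x80000000, such as the 0x00xxxxxx
values seen in the past, come out identical either way, which would
explain why a problem like this can stay hidden for a long time.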



I certainly don't have a problem sending you the code, but I'm sure you
have better things to do than scour some out-of-kernel GPL driver code.
I see many, many in-kernel users of pci_alloc/free_consistent, so it
can't be broken as badly as it appears to me here. I'm going to just go
ahead and convert this section of code to use the newer DMA API and be
done with it.

It appears pci_alloc/free_consistent is going to be removed completely soon anyway.

Thanks
Mark