Re: swiotlb_alloc_coherent: allocated memory is out of range for device

From: Takashi Iwai
Date: Wed Oct 22 2008 - 09:33:06 EST


At Wed, 22 Oct 2008 22:13:39 +0900,
FUJITA Tomonori wrote:
>
> On Wed, 22 Oct 2008 14:06:48 +0200
> Takashi Iwai <tiwai@xxxxxxx> wrote:
>
> > At Wed, 22 Oct 2008 20:29:24 +0900,
> > FUJITA Tomonori wrote:
> > >
> > > On Wed, 22 Oct 2008 12:53:58 +0200
> > > Takashi Iwai <tiwai@xxxxxxx> wrote:
> > >
> > > > At Sun, 19 Oct 2008 12:09:32 +0200,
> > > > Sven Schnelle wrote:
> > > > >
> > > > > Hi List,
> > > > >
> > > > > my kernel dies while probing parport with the following last words:
> > > > >
> > > > > [ 3.672199] parport_pc 00:0b: reported by Plug and Play ACPI
> > > > > [ 3.677969] parport0: PC-style at 0x378 (0x778), irq 7, dma 3 [PCSPP,TRISTATE,COMPAT,EPP,ECP,DMA]
> > > > > [ 3.687691] hwdev DMA mask = 0x0000000000ffffff, dev_addr = 0x0000000020000000
> > > > > [ 3.694916] Kernel panic - not syncing: swiotlb_alloc_coherent: allocated memory is out of range for device
> > > > >
> > > > > I haven't started a bisection yet, but this seems to be introduced
> > > > > somewhere between 2.6.26 and 2.6.27, at least 2.6.26 was working without
> > > > > problems. The dmesg log + config was obtained from a kernel compiled
> > > > > from git on 10/16/2008.
> > > >
> > > > This bug hits me, too. Looks like swiotlb assumes that the alloc caller
> > > > must set GFP_DMA appropriately by itself since GFP_DMA hack was
> > > > removed. The patch below should fix this particular case.
> > >
> > > This happens with 2.6.27, right? GFP_DMA hack was removed post
> > > 2.6.27. What kernel version do you hit this problem?
> >
> > 2.6.27 works fine, at least on my machine.
> > Likely a post-2.6.27 regression.
>
> Ok, it makes sense because I don't see any major changes to swiotlb
> between 2.6.26 and 2.6.27.
>
>
> > > Post 2.6.27, x86's alloc_coherent works a bit differently, but neither
> > > require the caller set to GFP flag. arch/x86/kernel/pci-dma.c does
> > > with 2.6.27 and asm-x86/dma-mapping.h does with post 2.6.27.
> > >
> > >
> > > > HOWEVER: the fundamental problem appears to be in swiotlb itself.
> > > > It assumes that iotlb pages are in DMA area. But, in this case, the
> > > > driver sets 24bit DMA (as of PnP) while iotlb pages are allocated
> > > > under 32bit DMA via alloc_bootmem_low_pages(). This doesn't work, of
> > > > course.
> > >
> > > If a device has 24bit dma mask, alloc_coherent is supposed to use
> > > GFP_DMA.
> >
> > Yes. But what happens if __get_free_pages() fails? Then you get the
> > same problem.
>
> Yeah, but __get_free_pages() with GFP_DMA fails, what can we do in
> such case?

Just return NULL instead of a kernel panic.

> You think that it's a good idea to change swiotlb to
> allocates < 16MB iotlb pages? I'm not sure it's worth to do
> that. 24bit dma mask devices are disappearing.

Seconded, I don't think it's worth to allocate iotlb pages in <16MB,
too. The case with GFP_DMA is somewhat special and mostly for
obsolete devices.

> About the bug that you hit, I suspect that dma_map_coherent() in
> asm-x86/dma-mapping.h doesn't set gfp flags correctly.
>
> dma_map_coherent() calls swiotlb_alloc_coherent with the flags GFP_DMA
> set? parport driver set dev->coherent_dma_mask properly?

The parport driver itself passes always GFP_KERNEL. So I added
GFP_DMA in my initial patch as a workaround.

The device of the parport driver is passed by another, and the driver
doesn't care about DMA mask by itself. In the panic case, it seems
from PnP, and PnP bus has set it to 24bit beforehand, I guess.

> > > > So, even adding GFP_DMA works mostly, it has still potentially
> > > > breakage when you can't get the page and fall back to iotlb pages,
> > > > just like the panic above.
> > > >
> > > > Also, the removal of GFP_DMA hack is a bad idea. For example, if a
> > > > device requires 28bit DMA mask, it doesn't set always GFP_DMA for
> > > > allocation because pages in ZONE_NORMAL may be inside that DMA mask.
> > > > Normal allocators allow this behavior but swiotlb allocator doesn't.
> > > > (Correct me if I'm wrong here -- I haven't followed much the recent
> > > > changes.)
> > >
> > > 28bit DMA mask is supposed to be handled properly. Firstly, we try
> > > with DMA_32BIT_MASK and if an allocated address is not fit for 28bit
> > > mask, we try GFP_DMA again.
> >
> > Yep, dma_generic_alloc_coherent() works like that for ages.
> > My point is about swiotlb_alloc_coherent(), and I don't see the
> > relevant code there...
>
> Oops, you are right. swiotlb doesn't try again with GFP_DMA now. Joerg
> changed the GFP_DMA retry mechanism work only for pci-nommu.c It broke
> GART IOMMU and x86's swiotlb. I modified dma_generic_alloc_coherent to
> work with pci-nommu and GART. I promised Ingo to fix swiotlb too but I
> forgot about it.
>
> Sorry, I'll fix this soon but your case (28bit mask) is supposed to
> work without the GFP_DMA retry mechanism. As I wrote above, I suspect
> that dma flag is not set correctly.

Yes, the parport case should be fixed by passing GFP_DMA.
But, the unconditional panic should be still fixed.

> > > > Last but not least, I think panic() in allocation error path is too
> > > > strict. Usually returning NULL isn't always fatal error, so give some
> > > > more chance to debug, e.g. by calling WARN() (or whatever) instead of
> > > > panic().
> > >
> > > Yeah, this was discussed several times. The problem is that many
> > > drivers assume that dma mapping operations, map_single, map_sg, and
> > > map_coherent, always succeed and doesn't even check the errors. So we
> > > have some panic() in IOMMU drivers to prevent really bad events like
> > > data corruption.
> >
> > Well, but also for alloc_coherent()? Returning NULL from
> > dma_alloc_coherent() is really no fatal error. If the caller doesn't
> > check the return value, then it's a more serious bug, I'd say.
>
> Probably, the majority check dma_alloc_coherent failure. I'll check
> this later to remove panic() in alloc_coherent in IOMMUs.

Thanks!


Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/