Re: [git pull] drm patches for 2.6.27-rc1
From: Jon Smirl
Date: Sat Oct 18 2008 - 18:47:40 EST
On Sat, Oct 18, 2008 at 6:32 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> * Keith Packard <keithp@xxxxxxxxxx> wrote:
>> On Sat, 2008-10-18 at 22:37 +0200, Ingo Molnar wrote:
>> > But i think the direction of the new GEM code is subtly wrong here,
>> > because it tries to manage memory even on 64-bit systems. IMO it
>> > should just map the _whole_ graphics aperture (non-cached) and be
>> > done with it. There's no faster method at managing pages than the
>> > CPU doing a TLB fill from pagetables.
>> Yeah, we're stuck thinking that we "can't" map the aperture because
>> it's too large, but with a 64-bit kernel, we should be able to keep it
>> mapped permanently.
>> Of course, the io_reserve_pci_resource and io_map_atomic functions
>> could do precisely that, as kmap_atomic does on non-HIGHMEM systems.
> okay, so basically what we need is a shared API that does per page
> kmap_atomic on 32-bit, and just an ioremap() on 64-bit. I had the
> impression that you were suggesting to extend kmap_atomic() to 64-bit -
> which would be wrong.
Is it possible to use a segment register to map the whole aperture on
32b? A segment register might allow common code on 64b/32b by
eliminating the need to move the mapping window around.
> So, in terms of the 4 APIs you suggest:
> struct io_mapping *io_reserve_pci_resource(struct pci_dev *dev,
> int bar,
> int prot);
> void io_mapping_free(struct io_mapping *mapping);
> void *io_map_atomic(struct io_mapping *mapping, unsigned long pfn);
> void io_unmap_atomic(struct io_mapping *mapping, unsigned long pfn);
> here is what we'd do on 64-bit:
> - io_reserve_pci_resource() would just do an ioremap(), and would save
> the ioremap-ed memory into struct io_mapping.
> - io_mapping_free() does the iounmap()
> - io_map_atomic(): just arithmetic, returns mapping->base + pfn - no
> TLB activities at all.
> - io_unmap_atomic(): NOP.
> it's as fast as it gets: zero overhead in essence. Note that it's also
> shared between all CPUs and there's no aliasing trouble.
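To make the 64-bit case concrete, here is a rough userspace model of the four calls as described above. It is only a sketch: the struct layout, the model_ prefix, the 4 KiB page size, and the use of calloc() as a stand-in for ioremap() are all assumptions made for illustration, and the pfn is treated as a page offset into the mapping that gets scaled by the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical layout; the thread does not define struct io_mapping. */
struct io_mapping {
	unsigned char *base;	/* start of the permanent linear mapping */
	unsigned long nr_pages;	/* size of the mapped region, in pages */
};

/* Stand-in for io_reserve_pci_resource(): calloc() plays ioremap(). */
struct io_mapping *model_io_reserve(unsigned long nr_pages)
{
	struct io_mapping *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	m->base = calloc(nr_pages, MODEL_PAGE_SIZE);
	if (!m->base) {
		free(m);
		return NULL;
	}
	m->nr_pages = nr_pages;
	return m;
}

/* Stand-in for io_mapping_free(): free() plays iounmap(). */
void model_io_mapping_free(struct io_mapping *m)
{
	if (m) {
		free(m->base);
		free(m);
	}
}

/* Pure pointer arithmetic: no per-call mapping work at all. */
void *model_io_map_atomic(struct io_mapping *m, unsigned long pfn)
{
	assert(pfn < m->nr_pages);
	return m->base + ((size_t)pfn << MODEL_PAGE_SHIFT);
}

/* Nothing to undo, because the whole region stays mapped. */
void model_io_unmap_atomic(struct io_mapping *m, unsigned long pfn)
{
	(void)m;
	(void)pfn;
}
```

The point of the model is that map/unmap never touch shared state, so any number of callers can use the same struct io_mapping concurrently.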
> And we could make it even faster: i think we could even use 2MB
> TLBs for the _linear_ ioremap()s here, hm? There's plenty of address
> space on 64-bit so we can align to 2MB just fine - and aperture sizes
> are 2MB sized anyway.
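The alignment condition for 2MB mappings is simple enough to write down. The helpers below are hypothetical names, not existing kernel functions; they only show the arithmetic, assuming that the virtual address, the physical address, and the size all have to be multiples of 2 MiB before large pages can cover the range.

```c
#include <stdbool.h>
#include <stdint.h>

#define LARGE_PAGE_SIZE ((uint64_t)2 << 20)	/* 2 MiB */
#define LARGE_PAGE_MASK (LARGE_PAGE_SIZE - 1)

/* Round an address up to the next 2 MiB boundary. */
uint64_t align_up_2mb(uint64_t addr)
{
	return (addr + LARGE_PAGE_MASK) & ~LARGE_PAGE_MASK;
}

/* True if the whole range can be covered by 2 MiB pages only. */
bool can_use_2mb_pages(uint64_t virt, uint64_t phys, uint64_t size)
{
	return size != 0 && ((virt | phys | size) & LARGE_PAGE_MASK) == 0;
}

/* Number of 2 MiB entries needed for an aligned range. */
uint64_t nr_2mb_pages(uint64_t size)
{
	return size / LARGE_PAGE_SIZE;
}
```

For a 256 MiB aperture this comes to 128 large-page entries instead of 65536 4 KiB entries.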
> Or we could go one step further and install these aperture mappings into
> the _kernel linear_ address space. That would be even faster, because
> we'd have a constant offset. We have the (2MB mappings aware) mechanism
> for that already. (Yinghai Cc:-ed - he did a lot of great work to
> generalize this area.)
> (In fact if we installed it into the linear kernel address space, and if
> the aperture is 1GB aligned, we will automatically use gbpages for it.
> Were Intel to support gbpages in the future ;-) )
> the _real_ remapping in a graphics aperture happens on the GPU level
> anyway, you manage an in-RAM GPU pagetable that just works like an
> IOMMU, correct?
> on 32-bit we'd have what you use in the GEM code today:
> - io_reserve_pci_resource(): a NOP in essence
> - io_mapping_free(): a NOP
> - io_map_atomic(): does a kmap_atomic(pfn)
> - io_unmap_atomic(): does a kunmap_atomic(pfn)
> so on 32-bit we have the INVLPG TLB overhead and preemption restrictions
> - but we knew that. We'd have to allow kmap_atomic() on non-highmem as
> well but that's fair.
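For contrast with the 64-bit case, here is a small userspace model of the 32-bit behaviour: one mapping slot that has to be pointed at a page and torn down again on every access. Everything in it is an assumption for illustration (the slot struct, the slot_ function names, the static array standing in for the aperture, the counter standing in for the INVLPG cost); it is not how kmap_atomic() is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SLOT_NR_PAGES  8u	/* tiny stand-in aperture */

/* Stand-in for the aperture; zero-initialised static storage. */
static unsigned char slot_aperture[SLOT_NR_PAGES][SLOT_PAGE_SIZE];

/* Hypothetical model of a single atomic mapping slot. */
struct atomic_slot {
	unsigned char *window;		/* page visible through the slot, or NULL */
	unsigned long invalidations;	/* teardown count, models INVLPG */
};

/* Point the slot at one page; only one mapping may be live at a time. */
void *slot_map_atomic(struct atomic_slot *slot, unsigned long pfn)
{
	assert(slot->window == NULL);
	assert(pfn < SLOT_NR_PAGES);
	slot->window = slot_aperture[pfn];
	return slot->window;
}

/* Tear the mapping down again; every call pays one invalidation. */
void slot_unmap_atomic(struct atomic_slot *slot, void *vaddr)
{
	assert(slot->window == vaddr);
	slot->window = NULL;
	slot->invalidations++;
}
```

Each map/unmap pair costs one invalidation in this model, and the caller must not be interrupted between the two calls, which is the overhead and the preemption restriction mentioned above.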
> Mind sending patches for this? :-)
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/