Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Linus Torvalds
Date: Sun Nov 06 2005 - 11:15:30 EST




On Sun, 6 Nov 2005, Kyle Moffett wrote:
>
> Hmm, this brings up something that I haven't seen discussed on this list
> (maybe a long time ago, but perhaps it should be brought up again?). What are
> the pros/cons to having a non-physically-linear kernel virtual memory space?

Well, we _do_ actually have that, and we use it quite a bit. Both
vmalloc() and HIGHMEM work that way.

The biggest problem with vmalloc() is that the virtual space is often as
constrained as the physical one (ie on old x86-32, the virtual address
space is the bigger problem - you may have 36 bits of physical memory, but
the kernel has only 30 bits of virtual). But it's quite commonly used for
stuff that wants big linear areas.

The HIGHMEM approach works fine, but the overhead of essentially doing a
software TLB is quite high, and if we never ever have to do it again on
any architecture, I suspect everybody will be pretty happy.

> Would it be theoretically possible to allow some kind of dynamic kernel page
> swapping, such that the _same_ kernel-virtual pointer goes to a different
> physical memory page? That would definitely satisfy the memory hotplug
> people, but I don't know what the tradeoffs would be for normal boxen.

Any virtualization will try to do that, but they _all_ prefer huge pages
if they care at all about performance.

If you thought the database people wanted big pages, the kernel is worse.
Unlike databases or HPC, the kernel actually wants to use the physical
page address quite often, notably for IO (but also for just mapping them
into some other virtual address - the users).

And no standard hardware allows you to do that in hw, so we'd end up doing
a software page table walk for it (or, more likely, we'd have to make
"struct page" bigger).

You could do it today, although at a pretty high cost. And you'd have to
forget about supporting any hardware that really wants contiguous memory
for DMA (sound cards etc). It just isn't worth it.

Real memory hotplug needs hardware support anyway (if only buffering the
memory at least electrically). At which point you're much better off
supporting some remapping in the buffering too, I'm convinced. There's no
_need_ to do these things in software.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/