Re: Large stack usage in fs code (especially for PPC64)

From: Linus Torvalds
Date: Mon Nov 17 2008 - 21:08:50 EST

On Tue, 18 Nov 2008, Paul Mackerras wrote:
>
> Also, you didn't respond to my comments about the purely software
> benefits of a larger page size.

I realize that there are benefits. It's just that the downsides tend to
swamp the upsides.

The fact is, Intel (and to a lesser degree, AMD) has shown how hardware
can do good TLB's with essentially gang lookups, giving effective page
sizes of almost 32kB with hardly any of the downsides. Couple that with
low-latency fault handling (not for when you miss in the TLB, but for when
something really isn't in the page tables), and it seems to seldom be the
biggest issue.

(Don't get me wrong - TLB's are not unimportant on x86 either. But on x86,
things are generally much better).

Yes, we could prefill the page tables and do other things, but ultimately,
if you don't need to do any of that by virtue of having big pages, some
loads will always benefit from just making the page size larger.

But the people who advocate large pages seem to never really face the
downsides. They talk about their single loads, and optimize for that and
nothing else. They don't seem to even acknowledge the fact that a 64kB
page size is simply NOT EVEN REMOTELY ACCEPTABLE for other loads!

That's what gets to me. These absolute -idiots- talk about how they win 5%
on some (important, for them) benchmark by doing large pages, but then
ignore the fact that on other real-world loads they lose by several
HUNDRED percent because of the memory fragmentation costs.

(And btw, if they win more than 5%, it's because the hardware sucks really
badly).

THAT is what irritates me.

What also irritates me is the ".. but AIX" argument. The fact is, the AIX
memory management is very tightly tied to one particular broken MMU model.
Linux supports something like thirty architectures, and while PPC may be
one of the top ones, it is NOT EVEN CLOSE to being really relevant.

So ".. but AIX" simply doesn't matter. The Linux VM has other priorities.

And I _guarantee_ that in general, in the high-volume market (which is
what drives things, like it or not), page sizes will not be growing. In
that market, terabytes of RAM is not the primary case, and small files
that want mmap are one _very_ common case.
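
A rough sketch of the arithmetic behind the small-file case, assuming each
cached or mmap'ed file occupies a whole number of pages. The file sizes
below are invented for illustration, not measurements, and the helper
names are hypothetical:

```python
def cached_bytes(file_size, page_size):
    """Memory needed to hold one file when it is kept in whole pages."""
    pages = -(-file_size // page_size)  # ceiling division
    return pages * page_size


def overhead(file_sizes, page_size):
    """Ratio of memory consumed to bytes of actual file data."""
    used = sum(cached_bytes(size, page_size) for size in file_sizes)
    return used / sum(file_sizes)


# Hypothetical mix of small files, in bytes (say, part of a source tree).
sizes = [300, 1200, 2500, 4000, 9000, 15000]

for page in (4096, 65536):
    print(f"{page // 1024:>2} kB pages: {overhead(sizes, page):.1f}x the data size")
```

With this made-up mix, 4kB pages use about 1.4 times the size of the data
and 64kB pages about 12.3 times. The exact ratio depends entirely on the
file-size distribution, so treat the numbers as an illustration only.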

To make things worse, the biggest performance market has another vendor
that hasn't been saying ".. but AIX" for the last decade, and that
actually listens to input. And, perhaps not incidentally, outperforms the
highest-performance ppc64 chips mostly by a huge margin - while selling
their chips for a fraction of the price.

I realize that this may be hard to accept for some people. But somebody
who says "... but AIX" should be taking a damn hard look in the mirror,
and asking themselves some really tough questions. Because quite frankly,
the ".. but AIX" market isn't the most interesting one.

Linus