Re: [RESEND] IOZone with transparent huge page cache

From: Dave Hansen
Date: Mon Apr 15 2013 - 19:19:39 EST


On 04/15/2013 11:17 AM, Kirill A. Shutemov wrote:
> I run iozone using mmap files (-B) with different number of threads.
> The test machine is 4s Westmere - 4x10 cores + HT.

How exactly did you run this, and with which iozone arguments? I assume
it was run on ramfs, since that's the only filesystem that transparent
huge page cache supports right now?

> ** Initial writers **
> threads:                1       2       4       8      16      32      64     128     256
> baseline:         1103360  912585  500065  260503  128918   62039   34799   18718    9376
> patched:          2127476 2155029 2345079 1942158 1127109  571899  127090   52939   25950
> speed-up(times):     1.93    2.36    4.69    7.46    8.74    9.22    3.65    2.83    2.77

I'm a _bit_ surprised that iozone scales _that_ badly, especially while
threads < nr_cpus. Is this normal for iozone? What are the units and
metric there, btw?
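
To put a number on the scaling, here is a minimal sketch that recomputes
the ratios from the quoted "Initial writers" table. The units are not
stated, so the figures are used only as ratios; the variable and function
names are made up for illustration.

```python
# Figures copied from the quoted "Initial writers" table.
# Units are unknown, so only ratios are computed.
threads = [1, 2, 4, 8, 16, 32, 64, 128, 256]
baseline = [1103360, 912585, 500065, 260503, 128918, 62039, 34799, 18718, 9376]
patched = [2127476, 2155029, 2345079, 1942158, 1127109, 571899, 127090, 52939, 25950]


def ratios(values):
    """Return each entry relative to the 1-thread entry."""
    return [v / values[0] for v in values]


# Patched vs. baseline at each thread count (the "speed-up" row).
speedup = [p / b for p, b in zip(patched, baseline)]

# How each kernel's figure changes as threads are added.
base_rel = ratios(baseline)
patched_rel = ratios(patched)

for t, b, p, s in zip(threads, base_rel, patched_rel, speedup):
    print(f"{t:>4} threads: baseline x{b:.2f}, patched x{p:.2f}, speed-up {s:.2f}")
```

From 2 threads upward the baseline figure roughly halves with every
doubling of the thread count. That could mean the figure is per-thread
rather than aggregate, but that depends on the units.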

> Minimal speed up is in 1-thread reverse readers - 23%.
> Maximal is 9.2 times in 32-thread initial writers. It's probably due
> batched radix tree insert - we insert 512 pages a time. It reduces
> mapping->tree_lock contention.

It might be interesting to see this at 10, 20, 40, 80, etc...
since that'll actually match iozone threads to CPU cores on your
particular system.
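
A rough sketch of where those thread counts come from, assuming the
quoted "4x10 cores + HT" means 4 sockets, 10 cores per socket, and 2
hardware threads per core. The function name and its parameters are
illustrative only.

```python
def thread_counts(sockets=4, cores_per_socket=10, threads_per_core=2):
    """Thread counts that line up with the CPU topology.

    Starts at one socket's worth of cores and doubles until every
    hardware thread is covered. The defaults are an assumed reading of
    "4s Westmere - 4x10 cores + HT".
    """
    total_hw_threads = sockets * cores_per_socket * threads_per_core
    counts = []
    n = cores_per_socket
    while n <= total_hw_threads:
        counts.append(n)
        n *= 2
    return counts


print(thread_counts())  # [10, 20, 40, 80]
```

With these assumptions, 10 is one socket's cores, 40 is every physical
core, and 80 is every hardware thread.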