From: Eric Dumazet
Date: Thu May 20 2010 - 23:26:34 EST

On Friday, 21 May 2010 at 02:05 +0200, Andrea Arcangeli wrote:
> If you're running scientific applications, JVM or large gcc builds
> (see attached patch for gcc), and you want to run from 2.5% faster for
> kernel build (on bare metal), or 8% faster in translate.o of qemu (on
> bare metal), 15% faster or more with virt and Intel EPT/ AMD NPT
> (depending on the workload), you should apply and run the transparent
> hugepage support on your systems.
> Awesome results have already been posted on lkml, if you test and
> benchmark it, please provide any positive/negative real-life result on
> lkml (or privately to me if you prefer). The more testing the better.

Interesting!

Did you try changing alloc_large_system_hash() to use hugepages for
very large allocations? We currently use vmalloc() on NUMA machines...

Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
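
As a sanity check on the boot messages above, the short sketch below verifies that each reported size equals the base page size shifted by the reported order, and works out how many huge pages each table would span. It assumes 4 KiB base pages and 2 MiB huge pages (typical for x86-64, but not stated in the message); the `summarize` helper and the `TABLES` dictionary are illustrative names, not kernel interfaces.

```python
PAGE_SIZE = 4096                    # assumption: 4 KiB base pages (x86-64)
HUGE_PAGE_SIZE = 2 * 1024 * 1024    # assumption: 2 MiB huge pages

# (entries, order, bytes) copied from the boot messages above
TABLES = {
    "Dentry cache": (2097152, 12, 16777216),
    "Inode-cache": (1048576, 11, 8388608),
    "IP route cache": (524288, 10, 4194304),
    "TCP established": (524288, 11, 8388608),
}

def summarize(entries, order, size):
    """Cross-check one boot line and express its size in pages."""
    if size != PAGE_SIZE << order:
        raise ValueError("size does not match order")
    return {
        "bytes_per_entry": size // entries,
        "base_pages": size // PAGE_SIZE,
        "huge_pages": size // HUGE_PAGE_SIZE,
    }

if __name__ == "__main__":
    for name, fields in TABLES.items():
        print(name, summarize(*fields))
```

Under those assumptions the 16 MiB dentry table covers 4096 base pages, which would fit in 8 huge pages.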

0xffffc90000003000-0xffffc90001004000 16781312 alloc_large_system_hash+0x1d8/0x280 pages=4096 vmalloc vpages N0=2048 N1=2048
0xffffc9000100f000-0xffffc90001810000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024
0xffffc90005882000-0xffffc90005c83000 4198400 alloc_large_system_hash+0x1d8/0x280 pages=1024 vmalloc vpages N0=512 N1=512
0xffffc90005c84000-0xffffc90006485000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024
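
The vmalloc listing above can be read mechanically. The following is a minimal sketch that parses one such line, checks that the address range matches the reported size, and sums the per-node page counts. The regular expression is written against the lines shown here only and is not a general parser for this output; `parse_vmalloc_line` is a hypothetical helper name, and the 4 KiB page size is an assumption.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages (x86-64)

# Matches the layout of the lines quoted above, nothing more general.
LINE_RE = re.compile(
    r"(0x[0-9a-f]+)-(0x[0-9a-f]+)\s+(\d+)\s+(\S+)\s+pages=(\d+)\s+(.*)"
)

SAMPLE = (
    "0xffffc90000003000-0xffffc90001004000 16781312 "
    "alloc_large_system_hash+0x1d8/0x280 pages=4096 vmalloc vpages "
    "N0=2048 N1=2048"
)

def parse_vmalloc_line(line):
    """Split one listing line into its fields and derived values."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized line")
    start, end = int(m.group(1), 16), int(m.group(2), 16)
    size, pages = int(m.group(3)), int(m.group(5))
    per_node = {int(n): int(c)
                for n, c in re.findall(r"\bN(\d+)=(\d+)", m.group(6))}
    return {
        "caller": m.group(4),
        "size": size,
        "pages": pages,
        "per_node": per_node,
        "range_matches_size": end - start == size,
        # bytes in the range not backed by the listed pages
        "extra_bytes": size - pages * PAGE_SIZE,
    }

if __name__ == "__main__":
    print(parse_vmalloc_line(SAMPLE))
```

For the first line this gives 4096 pages split evenly across nodes 0 and 1, plus 4096 bytes beyond the backed pages, which is consistent with vmalloc adding one guard page to each area.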

