Re: [pagevec] resize pagevec to O(lg(NR_CPUS))

From: Andrew Morton
Date: Sun Sep 12 2004 - 02:48:01 EST


William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
>
> A large stream of faults to map in a file will blow L1 caches of the
> sizes you've mentioned at every kernel/user context switch. 256 distinct
> cachelines will very easily be referenced between faults. MAP_POPULATE
> and mlock() don't implement batching for either ->page_table_lock or
> ->tree_lock, so the pagevec point is moot in pagetable instantiation
> codepaths (though it probably shouldn't be).

Instantiation via normal fault-in becomes lock-intensive once you have
enough CPUs. At low CPU counts the page zeroing probably dominates.

> O_DIRECT writes and msync(..., ..., MS_SYNC) will use pagevecs on
> ->tree_lock in a rapid-fire process-triggerable manner. Almost all
> uses of pagevecs for ->lru_lock outside the scanner that I'm aware
> of are not rapid-fire in nature (though there probably should be some).

Pagetable teardown (munmap, mremap, exit) is the place where pagevecs help
->lru_lock. Truncate is the other one.
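
To illustrate why batching helps a contended lock, here is a minimal
userspace sketch. It is not the kernel's pagevec code: struct batch,
struct lru_stats, batch_add(), batch_flush() and release_batched() are
hypothetical names, BATCH_SIZE of 16 is an arbitrary illustrative value,
and a counter stands in for taking ->lru_lock.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 16   /* illustrative only; the thread debates this value */

/* Counters standing in for the locked LRU state. */
struct lru_stats {
    unsigned long lock_acquisitions;
    unsigned long pages_released;
};

/* A pagevec-like container: a small fixed array plus a fill count. */
struct batch {
    size_t nr;
    void *pages[BATCH_SIZE];
};

/* Drain the batch under one (simulated) lock acquisition. */
static void batch_flush(struct batch *b, struct lru_stats *lru)
{
    if (b->nr == 0)
        return;
    lru->lock_acquisitions++;       /* where the real lock would be taken */
    lru->pages_released += b->nr;   /* per-page work done under the lock */
    b->nr = 0;                      /* where the real lock would be dropped */
}

/* Queue one page; flush only when the batch fills up. */
static void batch_add(struct batch *b, struct lru_stats *lru, void *page)
{
    assert(b->nr < BATCH_SIZE);
    b->pages[b->nr++] = page;
    if (b->nr == BATCH_SIZE)
        batch_flush(b, lru);
}

/* Release npages dummy pages through the batch, as a teardown loop might. */
static void release_batched(struct lru_stats *lru, size_t npages)
{
    struct batch b = { 0 };
    static char dummy;
    size_t i;

    for (i = 0; i < npages; i++)
        batch_add(&b, lru, &dummy);
    batch_flush(&b, lru);           /* drain the partial final batch */
}
```

With these assumptions, releasing 100 pages costs 7 simulated lock
acquisitions (six full batches plus one partial) instead of 100.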

> IMHO pagevecs are somewhat underutilized.
>

Possibly. I wouldn't bother converting anything unless a profiler tells
you to, though.

> Sorry, 4*lg(NR_CPUS) is 64 when lg(NR_CPUS) = 16, or 65536 cpus. 512x
> Altixen would have 4*lg(512) = 4*9 = 36. The 4*lg(NR_CPUS) sizing was
> rather conservative on behalf of users of stack-allocated pagevecs.

It's pretty simple to diddle PAGEVEC_SIZE and run a few benchmarks. If that
makes no difference then the discussion is moot. If it makes a significant
difference then more investigation is warranted.
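
For reference, the 4*lg(NR_CPUS) arithmetic quoted above can be checked
in userspace. This is a sketch only, assuming lg() means the ceiling of
log2; lg_ceil() and pagevec_size_for() are hypothetical names and not
kernel functions.

```c
#include <assert.h>

/*
 * Ceiling of log2(n) for n >= 1.
 * A userspace stand-in, not the kernel's own log2 helper.
 */
static unsigned int lg_ceil(unsigned int n)
{
    unsigned int bits = 0;

    assert(n >= 1);
    /* Stop at 31 so the shift never reaches the width of unsigned int. */
    while (bits < 31 && (1u << bits) < n)
        bits++;
    return bits;
}

/* The sizing rule discussed in the thread: 4 * lg(NR_CPUS). */
static unsigned int pagevec_size_for(unsigned int nr_cpus)
{
    return 4 * lg_ceil(nr_cpus);
}
```

pagevec_size_for(512) gives 36 and pagevec_size_for(65536) gives 64,
matching the figures quoted above. Note the bare formula gives 0 for a
single CPU, so any real sizing would need a lower bound.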
