Re: [pagevec] resize pagevec to O(lg(NR_CPUS))

From: William Lee Irwin III
Date: Mon Sep 13 2004 - 22:22:14 EST

Next message: William Lee Irwin III: "Re: [RFT 2.6.9-rc1 alpha sys_alcor.c] [1/2] convert pci_find_device to pci_get_device"
Previous message: Gene Heskett: "2.6.9-rc1-mm5, ehci stuff gone"
In reply to: Andrew Morton: "Re: [pagevec] resize pagevec to O(lg(NR_CPUS))"
Next in thread: William Lee Irwin III: "Re: [pagevec] resize pagevec to O(lg(NR_CPUS))"

On Sun, Sep 12, 2004 at 12:42:56AM -0700, Andrew Morton wrote:
>>> Instantiation via normal fault-in becomes lock-intensive once you have
>>> enough CPUs. At low CPU count the page zeroing probably preponderates.

William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
>> But that's mm->page_table_lock, for which pagevecs aren't used,

On Mon, Sep 13, 2004 at 07:57:31PM -0700, Andrew Morton wrote:
> It is zone->lru_lock and pagevecs are indeed used. See
> do_anonymous_page->lru_cache_add_active.

zone->lru_lock is acquired there, but I believe locks in the mm are the
dominant overhead in such scenarios.

William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
>> mlock() is the case I have in hand, though I've only heard of it being
>> problematic on vendor kernels. MAP_POPULATE is underutilized in
>> userspace thus far, so I've not heard anything about it good or bad.

On Mon, Sep 13, 2004 at 07:57:31PM -0700, Andrew Morton wrote:
> If you're referring to mlock() of an anonymous vma then that should all go
> through do_anonymous_page->lru_cache_add_active anyway?

It's a shared memory segment. It's not clear that it was lock
contention per se that was the issue in this case, merely a lot of
(unexpected) cpu overhead. I'm going to check in with the reporter of
the issue to see whether this is still an issue in 2.6.x. The vendor
solution was to ask the app politely not to mlock the shm segments; in
2.6.x this can be addressed if it's still an issue. I believe a fair
amount of the computational expense may be attributed to atomic
operations (not lock contention per se) that may be reduced by batching;
i.e. the pagevecs would merely be used to reduce the number of slow
atomic operations and to avoid walking trees from the top down for each
element, not to address lock contention.
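
To illustrate the batching argument, here is a minimal userspace sketch,
not kernel code. All names in it (toy_lru, toy_batch, BATCH_SIZE and the
helper functions) are hypothetical stand-ins and do not correspond to the
actual pagevec API; the spinlock merely plays the role of a lock such as
zone->lru_lock. It counts how many atomic lock acquisitions are needed
to add pages one at a time versus in batches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BATCH_SIZE 16   /* hypothetical batch size, chosen for illustration */

/* Toy stand-in for a lock-protected LRU list. */
struct toy_lru {
    atomic_flag lock;                 /* plays the role of the list lock */
    unsigned long nr_pages;           /* pages on the list, protected by lock */
    unsigned long lock_acquisitions;  /* instrumentation: atomic lock round trips */
};

/* Toy stand-in for a pagevec: a small fixed-size buffer of pending pages. */
struct toy_batch {
    size_t nr;
    int pages[BATCH_SIZE];
};

static void toy_lru_init(struct toy_lru *lru)
{
    atomic_flag_clear(&lru->lock);
    lru->nr_pages = 0;
    lru->lock_acquisitions = 0;
}

static void toy_lock(struct toy_lru *lru)
{
    /* Each acquisition costs at least one atomic read-modify-write. */
    while (atomic_flag_test_and_set_explicit(&lru->lock, memory_order_acquire))
        ;   /* spin */
    lru->lock_acquisitions++;
}

static void toy_unlock(struct toy_lru *lru)
{
    atomic_flag_clear_explicit(&lru->lock, memory_order_release);
}

/* Unbatched path: one lock round trip per page. */
static void toy_add_one(struct toy_lru *lru)
{
    toy_lock(lru);
    lru->nr_pages++;
    toy_unlock(lru);
}

/* Drain the batch under a single lock acquisition. */
static void toy_batch_flush(struct toy_lru *lru, struct toy_batch *b)
{
    if (b->nr == 0)
        return;
    toy_lock(lru);
    /* A real implementation would link each buffered page into the list here. */
    lru->nr_pages += b->nr;
    toy_unlock(lru);
    b->nr = 0;
}

/* Batched path: buffer the page locally, flush only when the batch is full. */
static void toy_batch_add(struct toy_lru *lru, struct toy_batch *b, int page)
{
    assert(b->nr < BATCH_SIZE);
    b->pages[b->nr++] = page;
    if (b->nr == BATCH_SIZE)
        toy_batch_flush(lru, b);
}
```

With this sketch, adding 40 pages through toy_add_one() takes the lock 40
times, while adding them through toy_batch_add() followed by a final
toy_batch_flush() takes it 3 times (two full batches of 16 plus one
partial batch of 8). The saving is in the number of atomic operations
and is independent of whether the lock is contended.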

-- wli
