Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY

From: Nick Piggin
Date: Thu Nov 27 2008 - 08:08:28 EST


On Thu, Nov 27, 2008 at 01:28:41AM -0800, Mike Waychison wrote:
> >Hmm. How quantifiable is the benefit? Does it actually matter that you
> >can read the proc file much faster? (this is for some automated workload
> >management daemon or something, right?)
>
> Correct. I don't recall the numbers from the pathological cases we were
> seeing, but iirc, it was on the order of tens of seconds, likely
> exacerbated by slower than usual disks. I've been digging through my
> inbox to find numbers without much success -- we've been using a variant
> of this patch since 2.6.11.
>
> Török however identified mmap taking on the order of several
> milliseconds due to this exact problem:
>
> http://lkml.org/lkml/2008/9/12/185

That turned out to be a different problem.


> >Would it be possible to reduce mmap()/munmap() activity? eg. if it is
> >due to a heap memory allocator, then perhaps do more batching or set
> >some hysteresis.
>
> I know our tcmalloc team has made great strides to reduce mmap_sem
> contention for the heap, but there are various other bits of the stack
> that really want to mmap files.
>
> We generally try to avoid such things, but sometimes it a) can't be
> easily avoided (third party libraries for instance) and b) when it hits
> us, it affects the overall health of the machine/cluster (the monitoring
> daemons get blocked, which isn't very healthy).

Are you doing appropriate posix_fadvise to prefetch in the files before
faulting, and madvise hints if appropriate?
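
For reference, a minimal sketch of that kind of hinting, assuming a
read-only file mapping on Linux. The helper name map_with_prefetch() is
hypothetical and is not taken from any code discussed in this thread;
posix_fadvise(POSIX_FADV_WILLNEED) and madvise(MADV_WILLNEED) are the
standard calls. Both are advisory, so the sketch ignores their return
values. The idea is that readahead is already under way before the first
fault, so the fault is less likely to sit waiting on disk I/O.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map a file read-only after
 * asking the kernel to start pulling it into the page cache.
 * Returns the mapping and stores its length in *len_out, or returns NULL
 * on failure (including an empty file, which cannot be mapped).
 */
void *map_with_prefetch(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* Hint: start readahead for the whole file (len 0 means "to EOF"). */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    /* Hint on the mapping itself: these pages will be touched soon. */
    (void)madvise(p, (size_t)st.st_size, MADV_WILLNEED);

    *len_out = (size_t)st.st_size;
    return p;
}
```

The caller releases the mapping with munmap(p, len) when done. Whether
the hints help depends on how much of the file is actually touched and
on memory pressure, so this is something to measure rather than assume.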
