Re: Major page faults counter is zero with MADV_RANDOM

From: wli
Date: Wed Feb 04 2009 - 17:38:48 EST


On Wed, Feb 04, 2009 at 03:09:23PM +0300, Pavel Levshin wrote:
> There is something wrong with the page fault counter in mm/filemap.c when
> using madvise() with the MADV_RANDOM flag set. Every page fault in this case
> should be considered major, as it causes disk I/O. But in reality,
> majflt/s in "sar -B" stays zero. I'm using 2.6.24 right now, but this
> piece of code is identical in the current kernel.
> The page is read into the cache, but no counter is updated. The same applies
> when the kernel decides to disable read-ahead due to excessive misses.
> Furthermore, the code makes it possible to read ahead even with
> MADV_RANDOM flag in some cases, as it loops from no_cached_page to
> retry_find. And I am not sure, but does "no read ahead" really mean "no
> page caching"?
> Please CC me if you want me to read your answer.

	/* If we don't want any read-ahead, don't bother */
	if (VM_RandomReadHint(vma))
		goto no_cached_page;

This blows past the only place in filemap_fault() where major faults
are accounted. You have spotted a bug, possibly even several. Major
fault accounting is missing from filemap_fault() in many cases beyond
just MADV_RANDOM, such as the excessive-miss path you mention. The
gotos there are involved enough that it might take me a few hours to
come up with a patch. Maybe someone else who works on the readahead
code in there will finish it first.
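
To make the skipped accounting concrete, here is a toy user-space model
of the control flow described above. It is only a sketch, not the kernel
source: model_fault(), cold_fault() and struct fault_stats are made-up
names, and the body is a heavily simplified assumption about how the
labels relate to each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters for the toy model; these are not kernel data structures. */
struct fault_stats {
	unsigned long disk_reads;	/* pages that had to be read in */
	unsigned long majflt;		/* major faults actually accounted */
};

/*
 * Hypothetical, simplified model of the fault path. The label names
 * mirror the ones mentioned in this thread; everything else is an
 * assumption made for illustration.
 */
static void model_fault(bool random_hint, bool *cached, struct fault_stats *st)
{
	assert(cached != NULL && st != NULL);

	/* If we don't want any read-ahead, don't bother */
	if (random_hint)
		goto no_cached_page;

retry_find:
	if (*cached)
		return;			/* page is in the cache, fault done */

	/* The only place where a major fault is accounted. */
	st->majflt++;
	st->disk_reads++;		/* read-around brings the page in */
	*cached = true;
	return;

no_cached_page:
	if (!*cached) {
		st->disk_reads++;	/* I/O happens here ... */
		*cached = true;		/* ... but majflt is never touched */
	}
	goto retry_find;
}

/* Run a single fault on a cold (uncached) page and return the counters. */
static struct fault_stats cold_fault(bool random_hint)
{
	struct fault_stats st = { 0, 0 };
	bool cached = false;

	model_fault(random_hint, &cached, &st);
	return st;
}
```

In this model cold_fault(false) ends with disk_reads == 1 and
majflt == 1, while cold_fault(true) ends with disk_reads == 1 and
majflt == 0: the read still happens, but the fault is never counted
as major.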


-- wli