Re: Random file I/O regressions in 2.6

From: Andrew Morton
Date: Tue May 04 2004 - 01:34:54 EST


Ram Pai <linuxram@xxxxxxxxxx> wrote:
>
> Sorry, If I am saying this again. I have checked the behaviour of the
> readahead code using my user level simulator as well as running some
> DSS benchmark and iozone benchmark. It generates a steady stream of
> large i/o for large-random-reads and should not exhibit the bad behavior
> that we are seeing. I feel this bad behavior is because of interleaved
> access by multiple thread.

you're right - the benchmark has multiple threads issuing concurrent
pread()s against the same fd. For some reason this mucks up the 2.6
readahead state more than 2.4's.

Putting a semaphore around do_generic_file_read() or maintaining the state
as below fixes it up.

I wonder if we should bother fixing this? I guess as long as the app is
using pread() it is a legitimate thing to be doing, so I guess we should...



--- 25/mm/filemap.c~readahead-seralisation 2004-05-03 23:14:43.399947720 -0700
+++ 25-akpm/mm/filemap.c 2004-05-03 23:14:43.404946960 -0700
@@ -612,7 +612,7 @@ EXPORT_SYMBOL(grab_cache_page_nowait);
* - note the struct file * is only passed for the use of readpage
*/
void do_generic_mapping_read(struct address_space *mapping,
- struct file_ra_state *ra,
+ struct file_ra_state *_ra,
struct file * filp,
loff_t *ppos,
read_descriptor_t * desc,
@@ -622,6 +622,7 @@ void do_generic_mapping_read(struct addr
unsigned long index, offset;
struct page *cached_page;
int error;
+ struct file_ra_state ra = *_ra;

cached_page = NULL;
index = *ppos >> PAGE_CACHE_SHIFT;
@@ -644,13 +645,13 @@ void do_generic_mapping_read(struct addr
}

cond_resched();
- page_cache_readahead(mapping, ra, filp, index);
+ page_cache_readahead(mapping, &ra, filp, index);

nr = nr - offset;
find_page:
page = find_get_page(mapping, index);
if (unlikely(page == NULL)) {
- handle_ra_miss(mapping, ra, index);
+ handle_ra_miss(mapping, &ra, index);
goto no_cached_page;
}
if (!PageUptodate(page))
@@ -752,6 +753,8 @@ no_cached_page:
goto readpage;
}

+ *_ra = ra;
+
*ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset;
if (cached_page)
page_cache_release(cached_page);

_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/