Yes - and 5000 is as high as that parameter will go, so it won't keep
up with a heavier load. I'm a little worried about your vmstat
numbers still, the repeated 2500 figure indicates that the per-cycle
write limit is getting halved somewhere.
There was an age limit (35 secs, tunable) in the previous (2.2.7)
version of this code. I removed it accidentally. Your self-tuning
ideas are interesting, but for now let's just put back the age limit.
The appended patch *should* correct the problem. I have not even
compiled it; if I get a chance, I'll try later tonight. Linus, since
you put my patch into 2.2.8, this needs to be considered for 2.2.9 and
2.3.1. And could you please look at what I've done and make sure I
haven't missed something else? I don't know the buffer cache all that
well, and I was not expecting the patch to get incorporated without a
few rounds of testing first.
zw
--- buffer.c.228 Wed May 12 18:23:34 1999
+++ buffer.c Wed May 12 18:39:03 1999
@@ -1580,7 +1580,7 @@
* with running out of free buffers for loop's "real" device).
*/
-static inline void sync_old_buffers(void)
+static inline void sync_old_buffers(int all)
{
int i;
int ndirty = 0;
@@ -1618,7 +1618,7 @@
bh = lru_list[BUF_DIRTY];
if(bh)
for (i = nr_buffers_type[BUF_DIRTY];
- i-- > 0 && ndirty < bdf_prm.b_un.ndirty;
+ i-- > 0 && (all || ndirty < bdf_prm.b_un.ndirty);
bh = next) {
/* We may have stalled while waiting for
I/O to complete. */
@@ -1644,22 +1644,31 @@
next->b_count++;
bh->b_count++;
ndirty++;
+ /* If not short of memory, flush only older buffers */
+ if (all && time_before(jiffies, bh->b_flushtime))
+ continue;
+#ifdef DEBUG
+ nwritten++;
+#endif
bh->b_flushtime = 0;
if (MAJOR(bh->b_dev) == LOOP_MAJOR) {
ll_rw_block(wrta_cmd,1, &bh);
wrta_cmd = WRITEA;
- if (buffer_dirty(bh))
+ if (buffer_dirty(bh)) {
--ndirty;
+#ifdef DEBUG
+ --nwritten;
+#endif
+ }
}
else
ll_rw_block(WRITE, 1, &bh);
bh->b_count--;
next->b_count--;
}
- /* If we didn't write anything, but there are still
- * dirty buffers, then make the next write to a
- * loop device to be a blocking write.
- * This lets us block--which we _must_ do! */
+ /* If we didn't write anything and there are dirty
+ * buffers, make the next write to a loop device be a
+ * blocking write. This lets us block--which we _must_ do! */
if (ndirty == 0
&& nr_buffers_type[BUF_DIRTY] > 0 && wrta_cmd != WRITE) {
wrta_cmd = WRITE;
@@ -1732,6 +1741,7 @@
{
long remaining = HZ * bdf_prm.b_un.interval;
struct task_struct *tsk = current;
+ int all;
/*
* We have a bare-bones task_struct, and really should fill
@@ -1774,11 +1784,13 @@
sync_supers(0);
sync_inodes(0);
remaining = HZ * bdf_prm.b_un.interval;
- }
+ all = 1;
+ } else
+ all = 0;
/* Keep flushing till there aren't very many dirty buffers */
do {
- sync_old_buffers();
+ sync_old_buffers(all);
} while(nr_buffers_type[BUF_DIRTY] > nr_buffers * bdf_prm.b_un.nfract/100);
wake_up(&bdflush_done);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/