More on 2.0.x buffer.c fix (new patch)

Leonard N. Zubkoff (lnz@dandelion.com)
Mon, 17 Feb 1997 01:03:06 -0800


Carlos Fonseca commented to me privately on some potential rounding problems
with the buffer.c patch I sent, which prompted me to investigate this code more
deeply. My conclusion is that this code was incorrect to begin with, and my
recent patch worked primarily by disabling it completely, even though that was
not immediately obvious. I have thought about this further and performed some
experiments, and I believe that the maybe_shrink_lav_buffers code should simply
be avoided. I also note that it has been removed completely in 2.1.x. The
patch below comments it out for 2.0.x as well.

Recall that in 2.0.20 and earlier this code was never invoked at all due to a
bug; 2.0.21 fixed that bug, which led to severe performance problems for raw
I/O because shrink_specific_buffers was then being called far too often.

Looking over the original calculation further, it seems to me that it is quite
probably bogus. My new form is obviously the same as the original, assuming
floating point arithmetic and everything != 0, so let's look at it:

bdf_prm.b_un.lav_const * buffers_lav[nlist] / total_lav <
(nr_buffers_size[nlist] - nr_buffers_st[nlist][BUF_SHARED]) / total_n_buffers

Now, from the calculations above it in the code, the

(nr_buffers_size[nlist] - nr_buffers_st[nlist][BUF_SHARED]) / total_n_buffers

term is guaranteed to be <= 1, and with integer arithmetic it is almost
always 0.
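
To make this concrete, here is a quick standalone check of that term. The
buffer counts are made up, but they are in the right ballpark for the sort of
machine discussed below (a handful of 4KB buffers alongside 20000+ 1K buffers):

	#include <stdio.h>

	int main(void)
	{
		/* Illustrative values only: 64 buffers of the size under
		   test, none of them shared, out of roughly 20000 buffers
		   in total. */
		int nr_buffers_size = 64;
		int nr_buffers_shared = 0;
		int total_n_buffers = 20064;

		int as_int = (nr_buffers_size - nr_buffers_shared) / total_n_buffers;
		double as_fp = (double)(nr_buffers_size - nr_buffers_shared)
			       / total_n_buffers;

		/* Prints 0 and 0.003190: the term can never exceed 1, and
		   under integer division it collapses to 0 unless one size
		   owns essentially every unshared buffer in the system. */
		printf("integer: %d   floating point: %f\n", as_int, as_fp);
		return 0;
	}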

The intent of this code appears to be to compare each size's share of the lav
to its share of the actual buffers, and to shrink a size whose lav is too low
for the number of buffers it actually has (i.e. shrink if the buffers are
allocated but not really being actively used). But with lav_const = 1884
scaling the lav side of the comparison, I'd say the probability of this clause
ever triggering when it should is quite small.
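
Here is the same sort of back-of-the-envelope check, using the illustrative
buffer counts from above, showing just how small a size's share of the lav has
to be before the clause fires:

	#include <stdio.h>

	int main(void)
	{
		/* Same illustrative numbers as before: one size holding 64 of
		   20064 buffers, i.e. about 0.3% of the cache. */
		double share = 64.0 / 20064.0;
		double lav_const = 1884.0;

		/* Rearranging lav_const * buffers_lav[nlist] / total_lav < share,
		   the clause fires only when this size's share of total_lav
		   is below share / lav_const. */
		double threshold = share / lav_const;

		printf("lav share must fall below %g (about 1 part in %.0f)\n",
		       threshold, 1.0 / threshold);
		return 0;
	}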

My guess is that lav_ratio, rather than lav_const, should really have been
used here if this code is to make any sense. With that change the original
calculation is far less likely to overflow, except on machines with a lot of
memory, but that is still not sufficient: using the corrected calculation,
performance is almost as bad as when the overflow occurred. My conclusion is
that the shrink_lav code is quite
badly broken from a performance standpoint, probably because it pays no
attention to the relative number of buffers of different sizes. It's happy to
spend a great deal of time trying to cannibalize one of 64 4KB buffers to add
to the 20000+ 1K buffers that were already in use.
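
For what it's worth, the overflow is easy to quantify as well. The sketch
below assumes the quantities involved are 32-bit ints and reuses the
illustrative buffer count from above; the exact figure on a real machine will
differ:

	#include <stdio.h>
	#include <limits.h>

	int main(void)
	{
		/* Illustrative values: lav_const = 1884 and a cache of roughly
		   20000 buffers, as in the configuration described above. */
		long long lav_const = 1884;
		long long total_n_buffers = 20064;

		/* The original test computes
		   lav_const * buffers_lav[nlist] * total_n_buffers; assuming
		   32-bit int arithmetic, the product stays below INT_MAX only
		   while buffers_lav[nlist] is smaller than: */
		long long limit = (long long)INT_MAX / (lav_const * total_n_buffers);

		printf("32-bit overflow once buffers_lav[nlist] exceeds %lld\n",
		       limit);
		return 0;
	}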

Since this code has already been ripped out of 2.1.x, where the page cache
makes it unnecessary, it seems preferable to disable it in 2.0.x as well (and
restore performance to 2.0.20 levels) rather than try to make it smarter.

Leonard

--- linux/fs/buffer.c- Fri Sep 20 07:00:35 1996
+++ linux/fs/buffer.c Sun Feb 16 13:42:30 1997
@@ -44,8 +44,10 @@
#define MAX_BUF_PER_PAGE (PAGE_SIZE / 512)

static int grow_buffers(int pri, int size);
+#if 0
static int shrink_specific_buffers(unsigned int priority, int size);
static int maybe_shrink_lav_buffers(int);
+#endif

static int nr_hash = 0; /* Size of hash table */
static struct buffer_head ** hash_table;
@@ -587,12 +589,14 @@
	/* See if there are too many buffers of a different size.
	   If so, victimize them */

+#if 0
	while(maybe_shrink_lav_buffers(size))
	{
		if(!grow_buffers(GFP_BUFFER, size)) break;
		needed -= PAGE_SIZE;
		if(needed <= 0) return;
	};
+#endif

	/* OK, we cannot grow the buffer cache, now try to get some
	   from the lru list */
@@ -1465,6 +1469,8 @@
	return !mem_map[MAP_NR(page)].count;
}

+#if 0
+
/* Age buffers on a given page, according to whether they have been
   visited recently or not. */
static inline void age_buffer(struct buffer_head *bh)
@@ -1532,12 +1538,13 @@

	isize = (size ? BUFSIZE_INDEX(size) : -1);

+
	if (n_sizes > 1)
		for(nlist = 0; nlist < NR_SIZES; nlist++)
		{
			if(nlist == isize) continue;
			if(nr_buffers_size[nlist] &&
-			   bdf_prm.b_un.lav_const * buffers_lav[nlist]*total_n_buffers <
+			   bdf_prm.b_un.lav_ratio * buffers_lav[nlist]*total_n_buffers <
			   total_lav * (nr_buffers_size[nlist] - nr_buffers_st[nlist][BUF_SHARED]))
				if(shrink_specific_buffers(6, bufferindex_size[nlist]))
					return 1;
@@ -1634,7 +1641,7 @@
	}
	return 0;
}
-
+#endif

/* ================== Debugging =================== */

@@ -1838,13 +1845,14 @@

	/* See if one size of buffer is over-represented in the buffer cache,
	   if so reduce the numbers of buffers */
+#if 0
	if(maybe_shrink_lav_buffers(size))
	{
		int retval;
		retval = try_to_generate_cluster(dev, b[0], size);
		if(retval) return retval;
	};
-
+#endif
	if (nr_free_pages > min_free_pages*2)
		return try_to_generate_cluster(dev, b[0], size);
	else