Database server slowdown with "buff" ~= physical memory

From: Stephen J. Gowdy (gowdy@mh1.lbl.gov)
Date: Mon Jun 26 2000 - 10:43:13 EST


Hi All,
        I was wondering if anyone could shed any light on this problem
we're having. Our experiment (BaBar - http://www.slac.stanford.edu/BFROOT/
) started to use Linux this year
(http://chep2000.pd.infn.it/paper/pap-e309.pdf ) as one of our supported
platforms. We take a _lot_ of data, we're getting close to 1TB/day. At the
moment most of this is handled using Solaris 2.6 (with various patches to
support large files) at SLAC. We use an object database and allow the
files it uses to grow to ~10GB each. One of the driving reasons for such
large files is that the database can have at most 64k files accessible
from it.
        However, to understand this data we do a lot of simulation. As
this is done all over the world we limit the size of these files to
2GB. This limit was originally due to Solaris 2.6, but it now applies to
Linux as well.
        That isn't the problem we're worried about today. The first site
to attempt to set up a production cluster of Linux machines fails to
utilise the full CPU available because the server machine begins to
swap. The jobs start off running well, but if you watch top on the
server machine (which has an IDE RAID array and runs a database server
process, a page server) you see the buffer cache grow to use most of
the physical memory. The machine has 512MB and "buff" grows to 480MB.
Once this happens the machine begins to swap and the clients' CPU
utilisation drops dramatically.
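(For reference, the growth can be tracked numerically rather than by
eyeballing top. The little helper below is just a sketch of our own, not
part of the database software; it only parses the standard MemTotal and
Buffers fields of /proc/meminfo:)

```python
# Hypothetical monitoring sketch: report the buffer cache's share of
# physical memory by parsing /proc/meminfo. Field names ("MemTotal:",
# "Buffers:") are the standard /proc/meminfo ones; the helper name is ours.

def buffer_fraction(meminfo_text):
    """Return Buffers as a fraction of MemTotal, given /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].endswith(":"):
            fields[parts[0].rstrip(":")] = int(parts[1])  # values are in kB
    return fields["Buffers"] / fields["MemTotal"]

if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        frac = buffer_fraction(f.read())
    print("buffer cache is using %.1f%% of physical memory" % (100 * frac))
```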
        We've found the /proc/sys/vm/buffermem file, which sounds like it
should do what we want (i.e. limit the amount of memory the buffer cache
will use). However, as I'm sure many of you know, it seems these numbers
are not actually used. I hunted through the kernel source (2.4.0-test2)
and failed to find anywhere that uses them. There is one macro that
reads the min_percent field, but that macro doesn't seem to be used
anywhere either.
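(For anyone following along, the file holds three percentages,
min_percent borrow_percent max_percent, as described in
Documentation/sysctl/vm.txt. If the kernel actually honoured them,
capping the cache would presumably look something like this:)

```shell
# Inspect the three buffermem fields: min_percent borrow_percent max_percent
cat /proc/sys/vm/buffermem

# What we *hoped* would cap the buffer cache at 60% of RAM -- but as far
# as we can tell the max_percent value is never consulted by the kernel.
echo "2 10 60" > /proc/sys/vm/buffermem
```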
        The kernel on the server machine is 2.2.13 at the moment. Does
anyone know a way to limit this buffer size? Or is it a function of the
sizes of the files that are opened (which will grow to n*2GB)? Could this
be an application problem, or can it only be addressed in the kernel? Any
other advice?
        Thanks for any help or advice you can offer.

                                                        regards,

                                                        Stephen.

-- 
 /------------------------------+-=-=-=-=-+-------------------------\
|Stephen J. Gowdy               |A4000/040| Mail Stop 50A-2160, LBL, |
|http://www.ph.ed.ac.uk/~gowdy/ | 1GB   HD| 1 Cyclotron Rd, Berkeley,|
|                               |20MB  RAM| CA 94720, USA            |
|InterNet: SGowdy@lbl.gov       |3.4xCDROM| Tel: +1 510 495 2796     |
 \------------------------------+-=-=-=-=-+-------------------------/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel"
  in the body of a message to majordomo@vger.rutgers.edu
  Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jun 26 2000 - 21:00:09 EST