Re: Back in Production mode Again...

Raven Harold (nmraven@yahoo.com)
Tue, 18 Nov 1997 13:16:30 -0800 (PST)


From: Rik van Riel <H.H.vanRiel@fys.ruu.nl>
On Sat, 15 Nov 1997, Larry McVoy wrote:

>> : My question is this: what -is- a high load average? Over most of the

To back up to his post for a second: the more correct question may be
what is affecting performance. A machine whose resources are all being
hit equally hard is acceptable, even under heavy load. You make
capacity plans for the future and plan on distributing services onto a
second machine or whatever... A machine that has a performance problem
because one resource is continually blocked, so that any process takes
a significant time to complete, is a total waste of resources and, in
my opinion, totally unacceptable. I think that is where you two are
splitting hairs. Load average can tell you how many processes are
blocked, but I think the general question is more along the lines of
what is blocking them, and load average can't tell you jack about
that. Like you said, that is not what it is defined to measure. I'm
not saying that there aren't better performance indicators out there
that could be added to the uptime command or something. I have no
comment on that. I am only saying that an apple is not an orange,
figuratively.

>disk, this will give you quite some noise from the disks...
>It will also generate a lot of system load...

On one resource, which is exactly the situation anyone would want to
avoid. The other resources would be virtually unused, but because of
low bandwidth on one resource, the machine is brought to a halt.
Sounds like time to get a few SSA disks. ;)

>> In general, load average as a metric is heavily overrated.
>No, it's just the way loadavg is defined... Just like the
>meter and the mile aren't overrated.
>> Wouldn't it be cool if Linux actually defined a "busy average"?
>Might be quite cool...

Well, then we are no longer talking about LA, are we, and this
discussion is not really linux-kernel related, is it? In fact, we never
really were, since the definition of LA is basically derived from
the # of blocked/waiting processes. What we are really discussing is
whether there are good performance monitors available for linux which
can tell you what is actually blocking processes. I don't know the
answer to that personally. If not, though, I'd love to write one, but
we'd probably better move the discussion of it to a more appropriate
forum. I have done more than my share of system-level programming on
other OSs, but I like linux and would like to get into more
wide-ranging linux projects. I don't want to re-invent the wheel,
though; my guess is that there is a derivative of sar, or of AIX's
monitor (written by Jussi Maki), available for linux already, or at
least portable.
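
As a rough illustration of the kind of breakdown such a monitor could
start from, here is a minimal sketch in Python. It assumes a Linux
/proc filesystem where each /proc/<pid>/stat line has the form
"pid (comm) state ...", and it only separates runnable tasks (state R)
from tasks in uninterruptible sleep (state D, usually waiting on I/O).
The function names are made up for this example; it does not say which
device a task is waiting on, only whether the load is mostly CPU or
mostly waiting.

```python
import glob
from collections import Counter


def state_of(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def summarize(stat_lines):
    """Count runnable (R) and uninterruptible (D) tasks."""
    counts = Counter(state_of(line) for line in stat_lines)
    return {"runnable": counts["R"], "uninterruptible": counts["D"]}


def read_stat_lines():
    """Collect stat lines for all current processes (Linux only)."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    return lines
```

Calling summarize(read_stat_lines()) on a Linux box gives a snapshot;
a high "uninterruptible" count next to a low "runnable" count would
point at I/O rather than CPU.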

>>DISK: current load average (modified for just disk if need be)
>How do you know when the disk is saturated... Do you just measure
>what time there are requests outstanding and when there aren't...

Hmmm... that was probably rhetorical, so this is probably a smart-ass
answer, but... I'm not at my linux machine; I'm at an AIX machine at
the moment, unfortunately. linux's VFS layer changes the constructs
somewhat, but thanks to posix, the concept is basically the same.
In iostat.h there is a struct called dkstat. dkstat->dk_xrate
would be the saturation point. I believe that if (dkstat->dk_rblks +
dkstat->dk_wblks) over the time period exceeds the saturation rate,
then the machine would be blocking reads/writes for that period. Also,
if dkstat->dk_time is at 100% of the time interval, then you probably
have a problem. Either way, the important concept is for a performance
indicator to show whether or not the machine MAY be blocking
reads/writes.
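
The two checks above can be sketched roughly as follows. This is only
an illustration in Python, not the AIX interface: DiskSample is a
hypothetical record whose fields borrow the dkstat names, and the
units (blocks per interval, busy seconds, a rated blocks-per-second
figure for dk_xrate) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Hypothetical per-interval deltas, named after the dkstat fields."""
    dk_rblks: int    # blocks read during the interval
    dk_wblks: int    # blocks written during the interval
    dk_time: float   # seconds the disk was busy during the interval
    dk_xrate: float  # assumed: rated transfer rate, blocks per second


def disk_flags(s: DiskSample, interval: float) -> dict:
    """Report throughput, busy fraction, and a 'may be blocking' flag."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    throughput = (s.dk_rblks + s.dk_wblks) / interval  # blocks per second
    busy = s.dk_time / interval                        # 1.0 means 100% busy
    return {
        "throughput": throughput,
        "busy_fraction": busy,
        # Either condition only says the disk MAY be blocking I/O.
        "may_be_blocking": throughput > s.dk_xrate or busy >= 1.0,
    }
```

For example, 300 blocks read and 100 written over a 5 second interval
is 80 blocks per second; against an assumed dk_xrate of 100 with the
disk busy half the time, the flag stays off.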

>>NET: %network "full". If IP accounting is on, then full is
>So, when the network is overloaded and you hardly get any packets,
>your system reports a low busy average... Hmmm.

As it should; the network would be busy, not the machine. The machine
would only be busy if processes were blocked due to an inability
to write to the network, and that would show up in the LA. A busy
network is usually characterized by overflowing receive buffers. You
would probably get too many packets, if anything.

Raven