On Wed, 21 Nov 2007 02:43:46 +1100
Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

> On Wednesday 21 November 2007 01:47, Arjan van de Ven wrote:
> > On Tue, 20 Nov 2007 18:37:39 +1100
> > Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> > > > actually.... no. IRQ balancing is not a "fast" decision; every
> > > > time you
> > >
> > > I didn't say anything of the sort. But IRQ load could still
> > > fluctuate a lot more rapidly than we'd like to wake up the
> > > irqbalancer.
> >
> > irq load fluctuates by definition. but acting on it faster isn't the
> > right thing.
>
> Of course it is, if you want to effectively use your resources.
> Imagine if the task balancer only polled once every 10s.

but unlike the task balancer, moving an irq is really expensive.
(at least for networking and a few other similar systems)
And no, it's not just the cache bouncing; it's the entire reassembly of
multiple packets etc etc that gets really messy.
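
For context, moving an irq from user space comes down to writing a CPU
bitmask to /proc/irq/<n>/smp_affinity. A minimal sketch in Python,
assuming that interface; the helper names are made up for illustration
and this is not irqbalance's code:

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string for an iterable of CPU numbers.

    The mask is emitted as comma-separated 32-bit hex groups, most
    significant group first, which is the form smp_affinity takes.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu

    # Split the mask into 32-bit groups, least significant first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()

    # Leading group unpadded, remaining groups zero-padded to 8 digits.
    head = format(groups[0], "x")
    tail = [format(g, "08x") for g in groups[1:]]
    return ",".join([head] + tail)


def set_irq_affinity(irq, cpus):
    """Hypothetical helper: point one irq at the given CPUs (needs root)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_affinity_mask(cpus))
```

The write itself is cheap; the cost described above comes afterwards,
when in-flight work for that irq ends up split between the old and the
new cpu.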

> > I assume you've read what/how irqbalance does; good luck convincing
> > people that that kind of policy belongs in the kernel.
>
> Lots of code to get topology and device information.

yes this would go away in the kernel

> Some constants that make assumptions about the machine it is running
> on and may or may not agree with what the task scheduler is trying to
> do. Some classification stuff which makes guesses about how a
> particular bit of

you misunderstood this; the classification stuff is there to spread
different irqs of similar class (say networking) over multiple
cores/packages. Doing this is a system resource balancing proposition
not just a cpu time one.
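
To make the idea concrete, here is a rough sketch of spreading irqs of
the same class over packages. It is only an illustration under my own
assumptions (the class names, the function name and the plain
round-robin are all hypothetical) and not what irqbalance actually
implements:

```python
from collections import defaultdict


def spread_by_class(irqs, packages):
    """Assign irqs to packages so each class is spread round-robin.

    irqs:     list of (irq_number, class_name) tuples
    packages: list of package ids to spread over
    Returns a dict mapping irq_number -> package id.
    """
    if not packages:
        raise ValueError("need at least one package")

    placement = {}
    next_slot = defaultdict(int)  # per-class round-robin position
    for irq, irq_class in irqs:
        slot = next_slot[irq_class]
        placement[irq] = packages[slot % len(packages)]
        next_slot[irq_class] = slot + 1
    return placement


# Two network irqs and two storage irqs on a two-package machine:
# each class ends up with one irq per package.
example = spread_by_class(
    [(16, "net"), (17, "net"), (18, "storage"), (19, "storage")],
    [0, 1],
)
```

Note the placement depends only on class and topology, not on observed
load, so nothing has to be moved when the load fluctuates. A real
policy would also have to avoid every class starting on the same
package.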
You may think this spreading based on classification is a mistake, but
it's based on the following observations:

1) servers with multiple network cards serving internet traffic out
really need to balance their load, so that various per-cpu resources
(such as per-cpu memory pools) are used evenly. It also makes sure
that under network spikes on both interfaces, the response is sane

2) servers with multiple IO devices need this to be spread out, just
think of oracle etc.

for both you could argue "but we could balance this based on actual
observed load in some way", but you can only do that if you rebalance
at a relatively high frequency, which you really don't want to do for
networking and probably even storage.
We used to rebalance this frequently in the 2.4-early kernels based on
a patch from Ingo. Turned out to be a really really bad idea;
performance really tanked.

> hardware or device driver wants to be balanced. Hacks to poll
> hotplugging and topology changes.

"hacks" as in "rescan".. so falls under the topology code and would
indeed be changed to hook into hotplug inside the kernel; just
different complexity.

> I'm still convinced. Who isn't?

I know you can do SOME sort of balancing in the kernel. But please
describe the algorithm you would use; I started out with the same
thought but when it got down to the algorithm to me at least it became
clear "we really don't want this complexity in kernel mode".