Re: Server process stalled during massive thread creation : scheduler problem ?

From: Christopher Snook
Date: Mon Sep 15 2008 - 13:02:47 EST

Cornelius, Martin (DWBI) wrote:
Hello scheduler hackers,

I just noticed a behaviour of the scheduler that got me thinking...

This is my test scenario:
On an otherwise unloaded machine, I run a server process that accepts
TCP connections and, after a client has connected, just echoes all the
packets that the client sends. A single client (sitting on another
machine) connects to the server, then continuously sends packets (of
about 1000 bytes) and reads the echo. For each packet, the client
measures the round-trip time, i.e. the time it takes to send the packet
and receive the echo. If nothing else happens on the server, these times
are always very short, a few milliseconds or less.
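
For readers who want to reproduce the measurement, here is a minimal sketch of
the echo test in Python. It is not the original test program, which was not
posted. It assumes both ends run on one host over loopback (the original
client sat on another machine), and the names `serve_one_client`,
`measure_rtts`, `recv_exact` and `run_demo` are made up for this example.

```python
import socket
import threading
import time

PACKET_SIZE = 1000  # "about 1000 bytes" per the description above


def recv_exact(conn, n):
    """Read exactly n bytes, or fewer if the peer closes the connection."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            break
        buf.extend(chunk)
    return bytes(buf)


def serve_one_client(listener):
    """Accept a single connection and echo everything back until it closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_rtts(address, count):
    """Send `count` packets and return each round-trip time in seconds."""
    payload = b"x" * PACKET_SIZE
    rtts = []
    with socket.create_connection(address) as conn:
        for _ in range(count):
            start = time.monotonic()
            conn.sendall(payload)
            echoed = recv_exact(conn, PACKET_SIZE)
            rtts.append(time.monotonic() - start)
            if echoed != payload:
                raise RuntimeError("echo mismatch or connection closed")
    return rtts


def run_demo(count=20):
    """Run server and client in one process over loopback (an assumption)."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    server = threading.Thread(
        target=serve_one_client, args=(listener,), daemon=True
    )
    server.start()
    rtts = measure_rtts(listener.getsockname(), count)
    server.join(timeout=5)
    listener.close()
    return rtts


if __name__ == "__main__":
    samples = run_demo()
    print(f"max RTT: {max(samples) * 1000:.3f} ms")
```

Watching the maximum rather than the average matters here, since the stalls
described below show up as a few very long round trips among many short ones.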

While this echo test is running, I put the server machine under
massive CPU load by starting a load-generating process that starts a
number of threads. All threads run an endless loop without any I/O or
other blocking.

Behavior with 2.6.27rc6: If the number of threads started in the
load-generating process is sufficiently large (> 100), the server
process seems to be stalled during the startup of the load generator.
With 100 threads in the load generator, the client observes one or two
round-trip times of more than 1 second during load-generator startup.
When the load generator starts 1000 threads simultaneously, the client is
stalled several times, one of them lasting more than 30 seconds.
However, this stalling only appears during the startup of the
load generator. After some time, the round-trip times observed by the
client settle down, and from that point on are all reasonably short.
This is expected behavior.

I also conducted this test with older kernels:

2.4.36: With this kernel, behaviour was really weird: When the server
was loaded with >100 threads, the client was stalled again and again for
several seconds, then ran smoothly for some time, until another period
of stalling began. It looked like the scheduler screwed up periodically,
until the bubble burst and the stalling disappeared for a few seconds.
The client is only stalled during the startup of the load-generating
process, but for really long times. With 200 threads in
the load generator, I observed stalling for more than a minute.

Thus, the current kernel seems to pass this test best, but not
perfectly. Of course one might argue (like my colleagues do) that this
test presents a completely unrealistic scenario: hundreds of threads
started at the same time, none of them blocking. However, if I think of a
'BIG' java application server, I can imagine that a similar situation
could arise. From this perspective, one might say the behaviour of the
scheduler is not optimal and should be improved. If a server does not
respond for one minute, its clients might (reasonably but erroneously in
this case) conclude that the server is severely broken.

I would conclude that the application is severely broken, not the server itself. The scheduler is trying to be fair. Unless you're assigning priorities, it has no way of knowing that those 1000 CPU hog processes are less important than your netcat process. Once those processes have shown themselves to be much longer-running than netcat, the kernel realizes that giving netcat priority is the best approximation to ideal shortest-time-to-completion-first scheduling, so netcat gets to run whenever it's able.

Even big java application servers don't just spin on the CPU forever. They do some amount of setup, and then block until they receive requests. They certainly have a load spike at startup, but not this severe, unless they're badly misconfigured.

What do you think ?

I think the scheduler is working correctly. If you still see this behavior when you give your netcat process higher scheduler priority, then we can talk about bugs.
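
For illustration, a minimal Python sketch of one way to assign priorities:
adjusting the nice value, assuming a Linux or other Unix host.
`os.setpriority` and `os.getpriority` are standard library calls; the helper
names `set_niceness` and `current_niceness` are made up for this example.
Lowering a nice value (that is, raising priority) normally needs root or
CAP_SYS_NICE, so the helper reports failure instead of raising.

```python
import os


def set_niceness(pid, niceness):
    """Try to set the nice value of `pid` (0 means the calling process).

    Returns True on success, False if the kernel refuses, for example
    when an unprivileged caller tries to lower the nice value.
    """
    try:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)
    except PermissionError:
        return False
    return True


def current_niceness(pid=0):
    """Return the nice value of `pid` (0 means the calling process)."""
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    before = current_niceness()
    # Demoting ourselves needs no privilege; this is what one would do
    # inside the load generator. 19 is the weakest nice value on Linux.
    set_niceness(0, min(before + 5, 19))
    print(f"nice value: {before} -> {current_niceness()}")
```

Either direction works for the test: run the echo server with a lower nice
value (privileged), or run the load generator with a higher one
(unprivileged). The shell equivalents are nice(1) and renice(1).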

BTW, the server was equipped with a dual-core Pentium CPU 3.40GHz.

Now I'm even more surprised it held up so well.

-- Chris