Re: Interesting analysis of linux kernel threading by IBM

From: J.D. Bakker (bakker@thorgal.et.tudelft.nl)
Date: Fri Jan 21 2000 - 21:59:22 EST


At 19:21 +0100 21-01-2000, Davide Libenzi wrote:
>Ian Soboroff <ian@cs.umbc.edu> wrote :
>> in the vast majority of cases, i suspect it's easier and probably
>> better to redesign the app than redesign the scheduler. that said,
>> the improvements already done are quite good and needed.
>
>why You all speak about worse designed multithreaded apps.
>Every technology bad applied is worse !
>Suppose You have a rendering job to be done.
>It can be subdivided in a highly parallel system with distinct threads that
>can run together :
>
>1) Viewing transformation
>2) Triangulation
>3) Scan conversion
>4) Texturing
>5) Illumination
>6) Frame output
>
>just to keep it simple.
>Now can someone tell me why I would not split my job into threads ?

Oh, you can split your job into threads. But if you do it like that, you
will take a performance hit, especially on SMP.

As I see it, there are two ways to do multithreaded apps: one way (the one
you show) which is easy for the programmer, and one way that yields high
performance. Think about it: if you have multiple threads communicating in
a pipeline through shared memory, you will have to bounce *lots* of data
between processor caches. The scheduler has little or no impact on that.

When I started multithreading our real-time video compressor, I used the
model you sketched. Profiling on SMP machines soon showed that the system
didn't scale; after reading the kernel source and some (IA32) processor
manuals I saw The Light(tm). The key to getting high performance is
avoiding cache line ping-pong and multiple threads writing to the same
pages, two factors that would be ridiculous to try to handle in kernel
space.
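
To make the cache line ping-pong point concrete, here is a minimal sketch
(not the compressor's actual code) of per-thread counters that each get
their own cache line. The names worker_stats, count_block and total_blocks
are made up for illustration, and the 64-byte line size is an assumption;
check the processor manual for the real value on your hardware.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE  64  /* assumption: 64-byte cache lines */
#define MAX_WORKERS 4   /* assumption: one worker per CPU on a 4-CPU box */

/* One counter per worker. alignas() pads the struct to a full cache line,
 * so a write by one worker never invalidates a line another worker uses. */
struct worker_stats {
    alignas(CACHE_LINE) unsigned long blocks_done;
};

static_assert(sizeof(struct worker_stats) == CACHE_LINE,
              "each worker's counter must fill exactly one cache line");

static struct worker_stats stats[MAX_WORKERS];

/* Called by worker number `worker` only; touches only that worker's line. */
void count_block(int worker)
{
    stats[worker].blocks_done++;
}

/* Called by one thread after the workers have been joined. */
unsigned long total_blocks(void)
{
    unsigned long sum = 0;
    for (int i = 0; i < MAX_WORKERS; i++)
        sum += stats[i].blocks_done;
    return sum;
}
```

Without the alignment, all four counters would sit in one line and every
increment would bounce that line between CPUs, even though no two threads
ever touch the same variable.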

The model that proved optimal is SIMD-style data parallelism: splitting the
image into multiple slices which are handled by CPU-bound worker threads. I
now use as many worker threads as there are CPUs in the system, plus two
higher priority I/O threads (and I'm trying to get rid of one of those).
This buys me near-linear speedup on a 4-CPU box, where the old model would
get 60% on a good day.
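
A rough sketch of the slicing model, assuming POSIX threads and an 8-bit
grayscale image stored row by row. The names slice, process_slice and
process_image are hypothetical, the pixel inversion is only a stand-in for
real per-slice work, and the two I/O threads are left out. Passing
nworkers <= 0 asks sysconf(_SC_NPROCESSORS_ONLN) for the CPU count, which
glibc on Linux provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* A horizontal band of the image owned by exactly one worker thread. */
struct slice {
    uint8_t *pixels;
    size_t width;
    size_t row_begin, row_end;  /* half-open range [row_begin, row_end) */
};

/* Worker body: reads and writes only the rows of its own slice. */
static void *process_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t y = s->row_begin; y < s->row_end; y++)
        for (size_t x = 0; x < s->width; x++) {
            uint8_t *p = &s->pixels[y * s->width + x];
            *p = (uint8_t)(255 - *p);  /* placeholder work: invert pixel */
        }
    return NULL;
}

/* Splits the image into one slice per worker and waits for all of them.
 * Returns the number of workers used, or -1 on failure. */
int process_image(uint8_t *pixels, size_t width, size_t height, int nworkers)
{
    assert(pixels != NULL);
    if (nworkers <= 0) {
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        nworkers = n > 0 ? (int)n : 1;
    }
    if ((size_t)nworkers > height)  /* never more workers than rows */
        nworkers = height ? (int)height : 1;

    pthread_t *tids = malloc((size_t)nworkers * sizeof *tids);
    struct slice *slices = malloc((size_t)nworkers * sizeof *slices);
    if (!tids || !slices) {
        free(tids);
        free(slices);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < nworkers; i++) {
        slices[i] = (struct slice){
            pixels, width,
            height * (size_t)i / (size_t)nworkers,
            height * (size_t)(i + 1) / (size_t)nworkers,
        };
        if (pthread_create(&tids[i], NULL, process_slice, &slices[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    free(slices);
    return started == nworkers ? nworkers : -1;
}
```

Each worker writes only inside its own band, so shared writes can happen at
most at the boundary between two slices. With narrow rows a cache line (or
page) can still straddle that boundary, so real code would probably want
slice sizes rounded to the relevant alignment.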

<insert Larry McVoy-quote on processes/threads vs pasta/salt>

Sincerely,

Jan-Derk Bakker
[who firmly believes that a long-term (say 1-hour) load average > the
number of CPUs in your system tells you that (a) your userland needs
re-thinking or (b) you are using too light a machine]

--
Jan-Derk Bakker, bakker@mmc.et.tudelft.nl

The lazy man's proverb: 'There's no business like slow business !'
