Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3

From: Michael K. Edwards
Date: Sun Feb 25 2007 - 18:14:30 EST


On 2/25/07, Ingo Molnar <mingo@xxxxxxx> wrote:
Fundamentally a kernel thread is just its
EIP/ESP [on x86, similar on other architectures] - which can be
saved/restored in near zero time.

That's because the kernel address space is identical in every
process's MMU context, so the MMU doesn't have to be touched _at_all_.
Also, the kernel very rarely touches FPU state, and even when it
does, the FXSAVE/FXRSTOR pair is highly optimized for the "save state
just long enough to move some memory around with XMM instructions"
case. (I know you know this; this is for the benefit of less
experienced readers.) If your threadlet model shares the FPU state
and TLS arena among all threadlets running on the same CPU, and
threadlets are scheduled in bursts belonging to the same process (and
preferably the same threadlet entrypoint), then you will get similarly
efficient userspace threadlet-to-threadlet transitions. If not, not.

scheduling only relates to the minimal context that is in the CPU. And
most of that context we save upon /every system call entry/, and restore
it upon every system call return. If it's so expensive to manipulate,
why can the Linux kernel do a full system call in ~150 cycles? That's
cheaper than the access latency to a single DRAM page.

That would be the magic of shadow register files. When the software
does things that hardware expects it to do, everybody wins. When the
software tries to get clever based on micro-benchmarks, everybody
loses.

Cheers,
- Michael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/