On Tue, 6 Nov 2001 09:49:15 -0800 (PST), Linus Torvalds wrote:
> /* Return "current" in %eax, trash %edx */
> do_get_current:
> movl $0x0003c000,%eax // 4 bits at bit 14
> movl $-16384,%edx // remove low 14 bits
> andl $esp,%eax
> andl $esp,%edx
> shrl $7,%eax // color it by 128 bytes
> addl %edx,%eax
> ret
>...
>I would not be surprised if "mov %cr2,%reg" will break a netburst trace
>cache entity, or even cause microcode to be executed. While I _guarantee_
>that all future Intel CPU's will continue to be fast at mixtures of simple
>arithmetic operations like "add" and "and".
On my Pentium 4:
- 6.30 cycles to copy %cr2 to %eax
- 1.05 cycles to compute a non-coloured current by masking %esp
- 2.31 cycles to compute a coloured current by your code above
I did some tests on using %cr2 for get_processor_id() a while ago,
but it was clearly slower (58% on P6, 20% on K6-III, 3% on P5MMX)
than *((%esp & mask)+offset), even though the latter also does a load.
/Mikael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Wed Nov 07 2001 - 21:00:32 EST