Re: [PATCH] local_irq_disable removal

From: cutaway
Date: Sat Jun 11 2005 - 22:59:07 EST


FWIW - STI takes 3,5 and 7 cycles on 386,486, and 386SX respectively (these
are all still popular embedded processors)

Do not forget the impact of instruction size on fetch cost - the "book"
cycle counts are only for instructions ALREADY fetched and in the pipe/queue
and IGNORE the cost of fetching said instruction. If you happen to get an
L1 hit, you win. If that LEA or OR stradles a dword or cache line boundary
the CLI/STI may well wind up running faster under combat conditions if line
load or second memory fetch is needed to get the second half of the
instruction.

On an x86 CPU that which appears to be fast per the book clocks is often
slower, and that which appears to clock faster is often slower in reality
due to increased fetch cost of plump instructions. I have many benchmarks
that prove this.

----- Original Message -----
From: "Thomas Gleixner" <tglx@xxxxxxxxxxxxx>
To: "Daniel Walker" <dwalker@xxxxxxxxxx>
Cc: "Ingo Molnar" <mingo@xxxxxxx>; "Esben Nielsen" <simlo@xxxxxxxxxx>;
<linux-kernel@xxxxxxxxxxxxxxx>; <sdietrich@xxxxxxxxxx>
Sent: Saturday, June 11, 2005 19:44
Subject: Re: [PATCH] local_irq_disable removal


> On Sat, 2005-06-11 at 13:51 -0700, Daniel Walker wrote:
> > On Sat, 11 Jun 2005, Ingo Molnar wrote:
>
> > Interesting .. So "cli" takes 7 cycles , "sti" takes 7 cycles. The
current
> > method does "lea" which takes 1 cycle, and "or" which takes 1 cycle.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/