Re: [BUG] long freezes on thinkpad t60

From: Linus Torvalds
Date: Sat Jun 23 2007 - 12:41:50 EST

On Sat, 23 Jun 2007, Miklos Szeredi wrote:
>
> What I notice is that the interrupt distribution between the CPUs is
> very asymmetric like this:
>
>            CPU0       CPU1
>   0:     220496         42    IO-APIC-edge  timer
>   1:       3841          0    IO-APIC-edge  i8042
...
> LOC:     220499     220463
> ERR:          0
>
> and the freezes don't really change that. And the NMI traces show,
> that it's always CPU1 which is spinning in wait_task_inactive().

Well, the LOC thing is for the local apic timer, so while regular
interrupts are indeed very skewed, both CPUs are nicely getting the local
apic timer thing..
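
To put numbers on that split, here is a minimal sketch that computes each
CPU's share of every interrupt line. It assumes the quoted table is in the
usual /proc/interrupts layout; the function name per_cpu_share is made up
for illustration, and the sample holds only the rows quoted above (the
elided rows are left out, not guessed).

```python
# Rows copied from the quoted interrupt table; elided rows are omitted.
SAMPLE = """\
           CPU0       CPU1
  0:     220496         42    IO-APIC-edge  timer
  1:       3841          0    IO-APIC-edge  i8042
LOC:     220499     220463
"""


def per_cpu_share(text):
    """Return {irq_label: [fraction of that interrupt handled by each CPU]}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shares = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        tokens = rest.split()[:ncpus]
        # Skip rows that do not carry one numeric count per CPU.
        if len(tokens) != ncpus or not all(t.isdigit() for t in tokens):
            continue
        counts = [int(t) for t in tokens]
        total = sum(counts)
        if total > 0:
            shares[label.strip()] = [c / total for c in counts]
    return shares


if __name__ == "__main__":
    for label, fracs in per_cpu_share(SAMPLE).items():
        print(label, " ".join(f"{f:.4f}" for f in fracs))
```

On the quoted rows, lines 0 and 1 land almost entirely on CPU0, while LOC
is split close to evenly between the two CPUs.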

That said, the timer interrupt generally happens just a few hundred times
a second, and if it is simply more likely to arrive while the spinlock is
taken, then half-a-second pauses could easily come from that alone: even
when the interrupt does happen, it is skewed toward landing while the
lock is held.

And that definitely is the case: the most expensive instruction _by_far_
in that loop is the actual locked instruction that acquires the lock
(especially with the cache-line bouncing around), so an interrupt would be
much more likely to happen right after that one rather than after the
store that releases the lock, which can be buffered.
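
The skew can be illustrated with a toy model. This is only a sketch under
stated assumptions: the loop layout and the cycle counts below are
invented, not measured, and the model assumes an interrupt that arrives at
a uniformly random cycle is taken once the instruction in flight has
finished. Under that assumption each instruction is weighted by how long
it takes, so the slow locked acquire dominates.

```python
# Hypothetical cost in cycles of each step in one pass of a
# lock / check / unlock retry loop. All numbers are made up.
# Third field: is the lock held right after this step completes?
LOOP = [
    ("lock_acquire", 100, True),   # locked instruction, cache line bouncing
    ("check",          4, True),   # short critical section
    ("unlock_store",   1, False),  # buffered store that releases the lock
    ("relax",          6, False),  # brief pause before retrying
]


def held_fraction_cycle_weighted(loop):
    """Chance the lock is held when the interrupt is taken, if the
    interrupt arrives at a uniformly random cycle and is taken at the
    end of whichever step is in flight."""
    total = sum(cost for _, cost, _ in loop)
    held = sum(cost for _, cost, held_after in loop if held_after)
    return held / total


def held_fraction_equal_cost(loop):
    """Same question if every step took the same time (the naive view)."""
    return sum(1 for _, _, held_after in loop if held_after) / len(loop)


if __name__ == "__main__":
    print(f"equal-cost view: {held_fraction_equal_cost(LOOP):.3f}")
    print(f"cycle-weighted:  {held_fraction_cycle_weighted(LOOP):.3f}")
```

With these invented numbers the lock looks held at half of the step
boundaries, but about 94% of randomly timed interrupts would be taken with
the lock held, because nearly all of the time is spent inside the acquire.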

It can be quite interesting to look at instruction-level cycle profiling
with oprofile, just to see where the costs are..

Linus

