Re: AMD64 X2 lost ticks on PM timer

From: Martin Schlemmer
Date: Sat Mar 04 2006 - 07:05:30 EST


On Fri, 2006-03-03 at 18:43 -0500, Bill Rugolsky Jr. wrote:
> On Fri, Mar 03, 2006 at 05:09:57PM -0500, Jeff Garzik wrote:
> > Or sata_nv/libata is to blame.
>
> In case you are coming late to the thread:
>
> The lost ticks are closely correlated with sata_nv disk activity on
> multiple disks, and the problem is easily reproducible with "find /usr |
> cpio -o >/dev/null" on an MD RAID1 -- but not on a single disk.
>
> Andi suggested:
>
> Yes, I bet something forgets to turn on interrupts again and it's
> picked up by (and blamed on) the next guy who does an unconditional
> sti, which happens to be __do_softirq or idle.
>
> That sounds right to me.
>
> I built 2.6.16-rc5-git6 yesterday, and it still suffers from the same
> issue.
>

Not sure this will help in any way, but here it is anyhow.

I have had this system for about 6-8 months (maybe 10) now. It was
originally an Asus A8N-SLI Deluxe with a 3200+ Athlon64. In November I
changed to an Asus A8N-SLI Premium, and added another 1GB of memory (for
2GB in total). In all that time I have not had any issues with lost
ticks, but I was hesitant to get an X2 processor because of the issues
some people had.

At the beginning of February I got an Athlon64 X2 3800+ processor, and
since then I have still had no issues with lost ticks. I usually run the
latest git kernel, give or take a week, with no extra patches, except
that I started using Alan's libata PATA stuff a week or so back as well.

The only differences I can see that might be of consequence are:

1) Board type? Not sure if anybody with an A8N-SLI has had any lost tick
issues?

2) I do not use MD RAID1, but have 2 ST380013AS striped with
device-mapper's stripe module.

3) If I remember correctly, I have the 2 HDDs on sata_nv ports 3 and 4,
with ports 1 and 2 disabled, as otherwise they show up as sdc and sdd.
Not sure if it's an A8N-SLI-only peculiarity, and I am not 100% sure of
the port order - I can check if it might be an issue.

4) I do not use the NV LAN adapter, as I had issues with it back when I
got the system; I think it was detected fine, but it lost its connection
after a minute or two. I use the extra Marvell controller (skge module)
instead.

5) I run the processor at 240 FSB (or HT or whatever) with the memory on
the 333 (166 on some boards) multiplier, so the memory ends up running at
200MHz anyhow. Not sure if this makes any difference, but I am listing
it just in case.

6) Haven't checked if this makes any difference, but the board has an
option for ACPI 2.0 support, which I have enabled.
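
For point 5, here is a rough sketch of the arithmetic. It assumes the
memory clock simply scales with the reference clock by the ratio of the
BIOS memory setting to the stock 200MHz reference; the real divider
logic is likely more involved, and the function name is only for
illustration.

```python
# Rough arithmetic for point 5. ASSUMPTION: a simple ratio model, where the
# memory clock scales linearly with the reference clock. This is not the
# exact divider logic the memory controller uses.

def effective_memory_mhz(reference_mhz, memory_setting_mhz,
                         stock_reference_mhz=200.0):
    """Estimate the real memory clock for a raised reference clock.

    reference_mhz       -- the reference ("FSB"/HT) clock actually set
    memory_setting_mhz  -- the BIOS memory setting, as a clock in MHz
    stock_reference_mhz -- the reference clock the setting is defined against
    """
    return reference_mhz * memory_setting_mhz / stock_reference_mhz


# The "333" setting is DDR333, i.e. a 166.67MHz memory clock.
memory_mhz = effective_memory_mhz(240.0, 500.0 / 3.0)
print(round(memory_mhz))
```

So under that assumption the memory runs at about the same clock as it
would with stock settings and the 400 (200) multiplier.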

If any of this might be relevant, or you want me to try something, just
say so. The same goes for any extra info that might be needed.


Regards,

--
Martin Schlemmer

Attachment: signature.asc
Description: This is a digitally signed message part