Re: sata_nv + SMP = broken?

From: Vladimir Lazarenko
Date: Tue Oct 25 2005 - 20:06:32 EST


I too am running an Athlon X2 using sata_nv. I have an ASUS motherboard. But what I noticed was that the problem went away if I used 2 gigs of ram instead of 4 gigs. When you use the whole 4 gigs there is some memory mapping going on and I thought perhaps the problem was related to the sata_nv not liking the memory mapped over the 4gig barrier.
That's possible. Unfortunately I cannot verify this, since there are 2GB of
RAM in my box.

I remember someone having a problem with sata_nv DMAing over 2GB of RAM,
so there may be something wrong with it.
On second thought, why would that only occur in SMP mode? Right now the box has 3G of RAM, no SMP, and it works like a charm. If I enable SMP, all hell breaks loose.
It looks like an obscure issue. Apparently, to trigger it you need _both_ more than 2GB of RAM and SMP.
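
One way to check whether any RAM actually sits above the 4GB boundary on a given box is to look at the "System RAM" ranges in /proc/iomem. Below is a minimal Python sketch of that check. The function name and the sample map are made up for illustration, and the result only shows where the RAM is mapped, not whether sata_nv mishandles it.

```python
FOUR_GIB = 1 << 32

def ram_above_4g(iomem_text):
    """Return (start, end) for every System RAM range that ends at or above 4 GiB."""
    hits = []
    for line in iomem_text.splitlines():
        # Entries look like "00100000-bfffffff : System RAM"; nested ones are indented.
        span, sep, name = line.strip().partition(" : ")
        if not sep or name != "System RAM":
            continue
        start_hex, _, end_hex = span.partition("-")
        start, end = int(start_hex, 16), int(end_hex, 16)
        if end >= FOUR_GIB:
            hits.append((start, end))
    return hits

# Hypothetical memory map, not taken from any of the boxes in this thread.
SAMPLE = """\
00000000-0009fbff : System RAM
00100000-bfffffff : System RAM
100000000-13fffffff : System RAM
"""

for start, end in ram_above_4g(SAMPLE):
    print(f"RAM above 4G: {start:#x}-{end:#x}")
```

On a live system the input would be the contents of /proc/iomem instead of SAMPLE.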


Looks like it. I played around today as well, and it seems not to occur with 2GB.

OTOH, today I played with Tyan Thunder K8WE, based on the Nvidia
CK8-04 chipset (not that much different to the regular Nforce4) in
a 2-processor configuration and 8GB of RAM, and it had no issues at all
(I installed SuSE 9.3 with the distro kernel on it). Anyway its SATA
controller is handled by the sata_nv driver, so the problem you describe
does not seem to be software-related.

Hehe, as if reading your mind, I ordered a Tyan K8E today, and in case that doesn't work, I'll just put a PCI riser in there and stick in an external SATA controller from Silicon Image or Promise :)

I've heard from another person (though I have not seen it myself) that there was much the same problem (hanging SATA) on an Opteron-based machine. I was too late to ask for the motherboard model, though, so this info is of limited use, but I mention it here for statistical purposes.

[Jeff, could you please say if this is a known problem?]

Ok, the last status update.
The Tyan K8E arrived today, and after a while (a much longer while than on the MSI motherboard, though) it started spitting out errors again (this is in SMP mode):

end_request: I/O error, dev sda, sector 35447511
raid1: sda3: rescheduling sector 27623856
raid1: sda3: redirecting sector 27623856 to another mirror
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ata1: command 0x25 timeout, stat 0xd0 host_stat 0x1
ata1: status=0xd0 { Busy }
SCSI error : <0 0 0 0> return code = 0x8000002
sda: Current: sense key: Aborted Command
Additional sense: Scsi parity error

And so on, and so forth, with various blocks.
At that moment, a resync of the 160G RAID-1 array was running in the background.
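
For reading the "status 0xd0" values in the log above, here is a small Python sketch that splits an ATA status byte into its named bits. The bit names follow the classic ATA status register layout; the helper name is made up for illustration. Note that while BSY is set the remaining bits are formally not valid, so only the Busy flag is really telling us anything here. Command 0x25 in the timeout line is READ DMA EXT.

```python
# Classic ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete (obsolete in later ATA revisions)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]

def decode_ata_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]

print(decode_ata_status(0xD0))  # ['BSY', 'DRDY', 'DSC']
```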

The MSI mobo from the original mail had the nForce4 SLI chipset, whereas this Tyan K8E has the nForce4 Ultra.

Yet again, if I enable APIC, the boot process hangs here:
sata_nv version 0.6
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xDC08 irq 11

These are the last messages that I get. The behaviour is the same on both motherboards: whenever I enable APIC, it hangs at that point.

When I disable APIC, the boot sequence looks like this:

sata_nv version 0.6
PCI: Setting latency timer of device 0000:00:07.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xDC08 irq 11
ata1: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01 87:4003 88:407f
ata1: dev 0 ATA, max UDMA/133, 320173056 sectors: lba48
nv_sata: Primary device added
nv_sata: Primary device removed
nv_sata: Secondary device added
nv_sata: Secondary device removed
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv

... Thus, perfectly normal. It then goes on to ata2, and so on.
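
To compare the port setup lines from the two boots side by side, something like the following Python sketch can pull the fields out of the "ataN: SATA max ..." lines. The regex is written against the line format shown above and is an assumption on my part, not a stable format; the function name is made up as well.

```python
import re

# Matches lines such as:
#   ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
PORT_RE = re.compile(
    r"(?P<port>ata\d+): SATA max (?P<mode>\S+) "
    r"cmd 0x(?P<cmd>[0-9A-Fa-f]+) ctl 0x(?P<ctl>[0-9A-Fa-f]+) "
    r"bmdma 0x(?P<bmdma>[0-9A-Fa-f]+) irq (?P<irq>\d+)"
)

def parse_ports(log_text):
    """Map each port name to its I/O addresses and IRQ as integers."""
    ports = {}
    for m in PORT_RE.finditer(log_text):
        ports[m.group("port")] = {
            "cmd": int(m.group("cmd"), 16),
            "ctl": int(m.group("ctl"), 16),
            "bmdma": int(m.group("bmdma"), 16),
            "irq": int(m.group("irq")),
        }
    return ports

BOOT_LOG = """\
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xDC00 irq 11
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xDC08 irq 11
"""

for name, info in parse_ports(BOOT_LOG).items():
    print(name, hex(info["cmd"]), hex(info["bmdma"]), "irq", info["irq"])
```

Running it over both logs shows the same addresses and the same shared IRQ 11 in either case.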

Here is the lspci output from the new motherboard:
root@anarxi:~# lspci
0000:00:00.0 Memory controller: nVidia Corporation: Unknown device 005e (rev a3)
0000:00:01.0 ISA bridge: nVidia Corporation: Unknown device 0050 (rev a3)
0000:00:01.1 SMBus: nVidia Corporation: Unknown device 0052 (rev a2)
0000:00:06.0 IDE interface: nVidia Corporation: Unknown device 0053 (rev f2)
0000:00:07.0 IDE interface: nVidia Corporation: Unknown device 0054 (rev f3)
0000:00:09.0 PCI bridge: nVidia Corporation: Unknown device 005c (rev a2)
0000:00:0a.0 Bridge: nVidia Corporation: Unknown device 0057 (rev a3)
0000:00:0b.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0c.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0d.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:0e.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev a3)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:01:06.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
0000:01:07.0 VGA compatible controller: S3 Inc. 86c764/765 [Trio32/64/64V+]


And lspci -v from the IDE parts:

0000:00:06.0 IDE interface: nVidia Corporation: Unknown device 0053 (rev f2) (prog-if 8a [Master SecP PriP])
Subsystem: Tyan Computer: Unknown device 2865
Flags: bus master, 66MHz, fast devsel, latency 0
I/O ports at f000 [size=16]
Capabilities: [44] Power Management version 2

0000:00:07.0 IDE interface: nVidia Corporation: Unknown device 0054 (rev f3) (prog-if 85 [Master SecO PriO])
Subsystem: Tyan Computer: Unknown device 2865
Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 11
I/O ports at 09f0 [size=8]
I/O ports at 0bf0 [size=4]
I/O ports at 0970 [size=8]
I/O ports at 0b70 [size=4]
I/O ports at dc00 [size=16]
Memory at febfd000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2


I'd very much appreciate some feedback on this. The box crashes every 40-50 minutes in SMP mode, and without the second core it's overloaded and running on the edge of a crash as well.

Let me know if I need to test anything or if you want more info.

Thanks!

Regards,
Vladimir
