Re: Linux 2.6.23-rc3

From: Tore Anderson
Date: Wed Aug 15 2007 - 09:03:55 EST


* Tore Anderson

> I can confirm that this patch solves my problem without any side
> effects (as far as I can tell).

I'm sorry, but I have to retract this statement. When I made some
changes on my storage array, it generated RSCNs, so I/O briefly
failed, causing timeouts and path failovers.

I've attached the log of the event from the host running the patched
kernel, and you'll see the -EEXIST message and call trace near the end.
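
For anyone who wants to locate that message in a longer capture, here is a minimal Python sketch. It is not part of the original report: the function name and the regular expression are assumptions, based only on the message format seen in the attached log.

```python
import re

# Assumed pattern, modelled on the single kobject_add failure line in the
# attached log; other kernel versions may word the message differently.
KOBJ_FAIL = re.compile(r"kobject_add failed for (\S+) with (-E[A-Z]+)")
# printk timestamp such as "[64197.540779]"
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

def find_kobject_failures(lines):
    """Return (timestamp, device, errno_name) for each kobject_add failure."""
    hits = []
    for line in lines:
        match = KOBJ_FAIL.search(line)
        if match is None:
            continue
        ts = TIMESTAMP.search(line)
        hits.append((ts.group(1) if ts else None, match.group(1), match.group(2)))
    return hits

# Two lines copied from the attached log.
sample = [
    "kernel: [64197.535273] sd 3:0:1:0: Attached scsi generic sg4 type 0",
    "kernel: [64197.540779] kobject_add failed for 3:0:1:0 with -EEXIST, "
    "don't try to register things with the same name in the same directory.",
]

for ts, device, err in find_kobject_failures(sample):
    print(ts, device, err)
```

Run against the full log, this should report the single failure for 3:0:1:0, the same SCSI address that was attached as sdk a few milliseconds earlier.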

I'm sorry that I can't reproduce this more often, because the RSCNs
affect all hosts connected to the SAN, including a number of production
systems which I can't touch more than necessary - unless you know of
any way to make the HBA drivers (qla2xxx/lpfc) ignore all RSCN
updates so that I/O is unaffected by them (I'd be forever grateful,
too).

Regards
--
Tore Anderson
kernel: [64163.486095] rport-3:0-2: blocked FC remote port time out: removing target and saving binding
kernel: [64163.486461] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
kernel: [64163.486679] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: sdc: remove path (uevent)
multipathd: www: load table [0 52428800 multipath 0 1 rdac 2 2 round-robin 0 1 1 8:128 1000 round-robin 0 2 1 8:64 1000 8:96 1000]
kernel: [64163.487048] sd 3:0:0:0: [sdc] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
kernel: [64163.487195] sd 3:0:0:1: [sdd] Synchronizing SCSI cache
kernel: [64163.487283] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: sdd: remove path (uevent)
multipathd: mysql: load table [0 41943040 multipath 0 1 rdac 2 2 round-robin 0 1 1 8:144 1000 round-robin 0 2 1 8:80 1000 8:112 1000]
kernel: [64163.487539] sd 3:0:0:1: [sdd] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
kernel: [64165.492301] rport-3:0-3: blocked FC remote port time out: removing target and saving binding
kernel: [64165.492486] device-mapper: multipath: Failing path 8:64.
kernel: [64165.492686] sd 3:0:1:0: [sde] Synchronizing SCSI cache
kernel: [64165.493028] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: 8:64: mark as failed
multipathd: www: remaining active paths: 2
multipathd: sde: remove path (uevent)
kernel: [64165.493251] sd 3:0:1:0: [sde] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
kernel: [64165.493388] sd 3:0:1:1: [sdf] Synchronizing SCSI cache
kernel: [64165.493420] sd 3:0:1:1: [sdf] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
kernel: [64165.587006] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:128
kernel: [64182.453366] scsi 3:0:0:0: Direct-Access SUN CSM200_R 0619 PQ: 0 ANSI: 5
kernel: [64182.460960] sd 3:0:0:0: [sdc] 52428800 512-byte hardware sectors (26844 MB)
kernel: [64182.461109] sd 3:0:0:0: [sdc] Write Protect is off
kernel: [64182.461112] sd 3:0:0:0: [sdc] Mode Sense: 77 00 10 08
kernel: [64182.461467] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64182.462332] sd 3:0:0:0: [sdc] 52428800 512-byte hardware sectors (26844 MB)
kernel: [64182.462635] sd 3:0:0:0: [sdc] Write Protect is off
kernel: [64182.462640] sd 3:0:0:0: [sdc] Mode Sense: 77 00 10 08
kernel: [64182.462944] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64182.462950] sdc:end_request: I/O error, dev sdc, sector 0
kernel: [64182.967344] printk: 14 messages suppressed.
kernel: [64182.967348] Buffer I/O error on device sdc, logical block 0
kernel: [64183.466221] end_request: I/O error, dev sdc, sector 0
kernel: [64183.466232] Buffer I/O error on device sdc, logical block 0
kernel: [64183.982083] end_request: I/O error, dev sdc, sector 0
kernel: [64183.982096] Buffer I/O error on device sdc, logical block 0
kernel: [64184.497237] end_request: I/O error, dev sdc, sector 0
kernel: [64184.497248] Buffer I/O error on device sdc, logical block 0
kernel: [64185.004086] end_request: I/O error, dev sdc, sector 0
kernel: [64185.004096] Buffer I/O error on device sdc, logical block 0
kernel: [64185.004139] ldm_validate_partition_table(): Disk read failed.
kernel: [64185.504158] end_request: I/O error, dev sdc, sector 0
kernel: [64185.504170] Buffer I/O error on device sdc, logical block 0
kernel: [64186.003138] end_request: I/O error, dev sdc, sector 0
kernel: [64186.003147] Buffer I/O error on device sdc, logical block 0
kernel: [64186.501978] end_request: I/O error, dev sdc, sector 0
kernel: [64186.501987] Buffer I/O error on device sdc, logical block 0
kernel: [64187.001022] end_request: I/O error, dev sdc, sector 0
kernel: [64187.001035] Buffer I/O error on device sdc, logical block 0
kernel: [64187.001104] Dev sdc: unable to read RDB block 0
kernel: [64187.500628] end_request: I/O error, dev sdc, sector 0
kernel: [64187.500640] Buffer I/O error on device sdc, logical block 0
kernel: [64188.003688] end_request: I/O error, dev sdc, sector 0
kernel: [64188.003698] Buffer I/O error on device sdc, logical block 0
kernel: [64188.502658] end_request: I/O error, dev sdc, sector 24
kernel: [64189.002093] end_request: I/O error, dev sdc, sector 24
kernel: [64189.500933] end_request: I/O error, dev sdc, sector 0
kernel: [64190.012551] end_request: I/O error, dev sdc, sector 0
kernel: [64190.012619] unable to read partition table
kernel: [64190.012740] sd 3:0:0:0: [sdc] Attached SCSI disk
kernel: [64190.012856] sd 3:0:0:0: Attached scsi generic sg2 type 0
kernel: [64190.021118] scsi 3:0:0:1: Direct-Access SUN CSM200_R 0619 PQ: 0 ANSI: 5
kernel: [64190.022180] sd 3:0:0:1: [sdd] 41943040 512-byte hardware sectors (21475 MB)
kernel: [64190.022336] sd 3:0:0:1: [sdd] Write Protect is off
kernel: [64190.022345] sd 3:0:0:1: [sdd] Mode Sense: 77 00 10 08
kernel: [64190.022761] sd 3:0:0:1: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64190.024031] sd 3:0:0:1: [sdd] 41943040 512-byte hardware sectors (21475 MB)
kernel: [64190.024190] sd 3:0:0:1: [sdd] Write Protect is off
kernel: [64190.024195] sd 3:0:0:1: [sdd] Mode Sense: 77 00 10 08
kernel: [64190.024789] sd 3:0:0:1: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64190.024795] sdd:end_request: I/O error, dev sdd, sector 0
kernel: [64191.027699] end_request: I/O error, dev sdd, sector 0
kernel: [64191.528449] end_request: I/O error, dev sdd, sector 0
kernel: [64192.029147] end_request: I/O error, dev sdd, sector 0
kernel: [64192.528175] end_request: I/O error, dev sdd, sector 0
kernel: [64192.528216] ldm_validate_partition_table(): Disk read failed.
kernel: [64193.027787] end_request: I/O error, dev sdd, sector 0
kernel: [64193.027800] printk: 9 messages suppressed.
kernel: [64193.027802] Buffer I/O error on device sdd, logical block 0
kernel: [64193.526635] end_request: I/O error, dev sdd, sector 0
kernel: [64194.025634] end_request: I/O error, dev sdd, sector 0
kernel: [64194.525426] end_request: I/O error, dev sdd, sector 0
kernel: [64194.525475] Dev sdd: unable to read RDB block 0
kernel: [64195.024238] end_request: I/O error, dev sdd, sector 0
kernel: [64195.523407] end_request: I/O error, dev sdd, sector 0
kernel: [64196.029555] end_request: I/O error, dev sdd, sector 24
kernel: [64196.528494] end_request: I/O error, dev sdd, sector 24
kernel: [64197.027332] end_request: I/O error, dev sdd, sector 0
kernel: [64197.529308] end_request: I/O error, dev sdd, sector 0
kernel: [64197.529345] unable to read partition table
kernel: [64197.529447] sd 3:0:0:1: [sdd] Attached SCSI disk
kernel: [64197.529508] sd 3:0:0:1: Attached scsi generic sg3 type 0
kernel: [64197.531697] scsi 3:0:1:0: Direct-Access SUN CSM200_R 0619 PQ: 0 ANSI: 5
kernel: [64197.533043] sd 3:0:1:0: [sdk] 52428800 512-byte hardware sectors (26844 MB)
kernel: [64197.533211] sd 3:0:1:0: [sdk] Write Protect is off
kernel: [64197.533213] sd 3:0:1:0: [sdk] Mode Sense: 77 00 10 08
kernel: [64197.533481] sd 3:0:1:0: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64197.534349] sd 3:0:1:0: [sdk] 52428800 512-byte hardware sectors (26844 MB)
kernel: [64197.534493] sd 3:0:1:0: [sdk] Write Protect is off
kernel: [64197.534496] sd 3:0:1:0: [sdk] Mode Sense: 77 00 10 08
kernel: [64197.534740] sd 3:0:1:0: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
kernel: [64197.534743] sdk: unknown partition table
kernel: [64197.535223] sd 3:0:1:0: [sdk] Attached SCSI disk
kernel: [64197.535273] sd 3:0:1:0: Attached scsi generic sg4 type 0
kernel: [64197.540762] scsi 3:0:1:0: Direct-Access SUN CSM200_R 0619 PQ: 0 ANSI: 5
kernel: [64197.540779] kobject_add failed for 3:0:1:0 with -EEXIST, don't try to register things with the same name in the same directory.
kernel: [64197.540833]
kernel: [64197.540833] Call Trace:
kernel: [64197.540852] [kobject_shadow_add+402/512] kobject_shadow_add+0x192/0x200
kernel: [64197.540857] [device_add+204/1568] device_add+0xcc/0x620
kernel: [64197.540878] [_end+127152405/2129771920] :scsi_mod:scsi_sysfs_add_sdev+0x55/0x290
kernel: [64197.540888] [_end+127142735/2129771920] :scsi_mod:scsi_probe_and_add_lun+0x35f/0xc80
kernel: [64197.540901] [_end+127135573/2129771920] :scsi_mod:scsi_execute_req+0xb5/0xe0
kernel: [64197.540912] [_end+127146693/2129771920] :scsi_mod:__scsi_scan_target+0x655/0x6d0
kernel: [64197.540924] [_end+127148286/2129771920] :scsi_mod:scsi_scan_target+0xfe/0x100
kernel: [64197.540936] [_end+127780019/2129771920] :scsi_transport_fc:fc_scsi_scan_rport+0x73/0xa0
kernel: [64197.540941] [_end+127779904/2129771920] :scsi_transport_fc:fc_scsi_scan_rport+0x0/0xa0
kernel: [64197.540945] [run_workqueue+107/288] run_workqueue+0x6b/0x120
kernel: [64197.540947] [worker_thread+0/304] worker_thread+0x0/0x130
kernel: [64197.540949] [worker_thread+197/304] worker_thread+0xc5/0x130
kernel: [64197.540953] [autoremove_wake_function+0/48] autoremove_wake_function+0x0/0x30
kernel: [64197.540955] [worker_thread+0/304] worker_thread+0x0/0x130
kernel: [64197.540957] [worker_thread+0/304] worker_thread+0x0/0x130
kernel: [64197.540959] [kthread+134/144] kthread+0x86/0x90
kernel: [64197.540962] [child_rip+10/18] child_rip+0xa/0x12
kernel: [64197.540966] [kthread+0/144] kthread+0x0/0x90
kernel: [64197.540968] [child_rip+0/18] child_rip+0x0/0x12
kernel: [64197.540969]
kernel: [64197.540970] error 1
kernel: [64197.540979] scsi 3:0:1:0: Unexpected response from lun 0 while scanning, scan aborted
kernel: [64229.282429] device-mapper: multipath: Failing path 8:128.
multipathd: www: load table [0 52428800 multipath 0 1 rdac 2 2 round-robin 0 1 1 8:128 1000 round-robin 0 1 1 8:96 1000]
multipathd: sdf: remove path (uevent)
multipathd: mysql: load table [0 41943040 multipath 0 1 rdac 2 2 round-robin 0 1 1 8:144 1000 round-robin 0 1 1 8:112 1000]
multipathd: sdc: add path (uevent)
kernel: [64229.291017] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: www: load table [0 52428800 multipath 0 1 rdac 2 2 round-robin 0 2 1 8:128 1000 8:32 1000 round-robin 0 1 1 8:96 1000]
kernel: [64229.609667] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: sdd: add path (uevent)
kernel: [64231.423039] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: mysql: load table [0 41943040 multipath 0 1 rdac 2 2 round-robin 0 2 1 8:144 1000 8:48 1000 round-robin 0 1 1 8:112 1000]
multipathd: sdk: add path (uevent)
kernel: [64231.711731] device-mapper: multipath rdac: using RDAC command with timeout 6000
multipathd: www: load table [0 52428800 multipath 0 1 rdac 2 2 round-robin 0 2 1 8:128 1000 8:32 1000 round-robin 0 2 1 8:96 1000 8:160 100