Re: [RESEND][2.6.15] New ATA error messages on upgrade to 2.6.15

From: Eric D. Mudama
Date: Fri Mar 31 2006 - 17:18:17 EST


Can you please post your dmesg output?

On 3/31/06, Kyle Moffett <mrmacman_g4@xxxxxxx> wrote:
> I haven't received any response to this email over the last several
> days, so I'm resending it in hopes that someone can help me track
> down the problem. Thanks!
>
> On Mar 28, 2006, at 12:57:29, Kyle Moffett wrote:
> > I'm getting the following errors in my syslog on a fairly regular
> > basis (1 or 2 per hour). They seem to have started when I upgraded
> > from Debian 2.6.12-1-powerpc (with internal IDE patch) to Debian
> > 2.6.15-1-powerpc. My best guess is that, for some reason, the kernel
> > started issuing the MULTWRITE_EXT command, which it didn't before,
> > and one of my drives doesn't like it. Two of the drives are
> > attached to a Promise PDC20268 (rebranded a couple of times by
> > different manufacturers and with Mac-bootable firmware); the third
> > is attached to the internal ATA66 bus in the 400MHz Power Mac G4.
> > My apologies if this problem is known and fixed in 2.6.16; if
> > necessary I'll wait until Debian testing gets a 2.6.16 kernel and
> > test that too.
>
> A few extra notes: The system is a Samba fileserver for a collection
> of Windows XP clients, but the pattern of errors does not seem to be
> triggered by load, including the daily backups. The hourly SMART
> checks also appear to have nothing to do with the error messages;
> sometimes I'll get 10 in the middle of the night, other times almost
> a full day of reasonable load will go by without a single message.
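
One way to check that impression against the logs is to tally the error events per hour and compare the result with the backup and smartd schedules. A minimal sketch in Python, assuming a standard syslog timestamp layout and that each kernel message sits on a single line in the real log (the function name and the regular expression are illustrative, derived only from the excerpts quoted in this thread):

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Mar 28 03:15:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT
# One "no DRQ" line is treated as one error event (an assumption).
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"\S+\s+kernel:\s+(?P<dev>hd[a-z]+):\s+no DRQ after issuing"
)

def errors_per_hour(lines):
    """Count error events per (month, day, hour, device)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            key = (m.group("month"), int(m.group("day")),
                   int(m.group("hour")), m.group("dev"))
            counts[key] += 1
    return counts

# Sample input taken from the messages quoted in this thread.
sample = [
    "Mar 28 03:15:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }",
    "Mar 28 03:15:13 penelope kernel: PDC202XX: Secondary channel reset.",
    "Mar 28 03:15:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT",
    "Mar 28 03:15:13 penelope kernel: ide4: reset: success",
    "Mar 28 03:30:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT",
]

for key, count in sorted(errors_per_hour(sample).items()):
    print(key, count)
```

To run it over a real log, pass an open file object instead of the sample list; iterating the file yields one line at a time.
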
>
> > Thanks for the help!
> >
> > Cheers,
> > Kyle Moffett
> >
> > Begin forwarded message:
> >> Security Events
> >> =-=-=-=-=-=-=-=
> >> Mar 28 03:15:13 penelope kernel: ide: failed opcode was: unknown
> >> Mar 28 03:30:13 penelope kernel: ide: failed opcode was: unknown
> >>
> >> System Events
> >> =-=-=-=-=-=-=
> >> Mar 28 03:15:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }
> >> Mar 28 03:15:13 penelope kernel: PDC202XX: Secondary channel reset.
> >> Mar 28 03:15:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT
> >> Mar 28 03:15:13 penelope kernel: ide4: reset: success
> >> Mar 28 03:30:13 penelope kernel: hdi: status timeout: status=0xd0 { Busy }
> >> Mar 28 03:30:13 penelope kernel: PDC202XX: Secondary channel reset.
> >> Mar 28 03:30:13 penelope kernel: hdi: no DRQ after issuing MULTWRITE_EXT
> >> Mar 28 03:30:13 penelope kernel: ide4: reset: success
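
For reference, status=0xd0 has more bits set than BSY alone. Below is a minimal Python sketch that decodes the byte using the ATA status register bit layout. The function name is only for illustration, and a few of the bits have carried different names across ATA revisions. As far as I know, the remaining bits are not meaningful while BSY is set, which is presumably why the driver prints only "{ Busy }".

```python
# ATA status register bits, most significant first.
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault (write fault in older revisions)
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete in later revisions)
    (0x02, "IDX"),   # index (obsolete in later revisions)
    (0x01, "ERR"),   # error
]

def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]

print(decode_status(0xd0))  # ['BSY', 'DRDY', 'DSC']
```

Note that DRQ (0x08) is clear in 0xd0, which matches the "no DRQ after issuing MULTWRITE_EXT" message that follows each timeout.
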
> >
> > smartctl -a:
> >> smartctl version 5.34 [powerpc-unknown-linux-gnu] Copyright (C) 2002-5 Bruce Allen
> >> Home page is http://smartmontools.sourceforge.net/
> >>
> >> === START OF INFORMATION SECTION ===
> >> Device Model: SAMSUNG SP0822N
> >> Serial Number: S06QJ10Y946116
> >> Firmware Version: WA100-32
> >> User Capacity: 80,060,424,192 bytes
> >> Device is: In smartctl database [for details use: -P show]
> >> ATA Version is: 6
> >> ATA Standard is: ATA/ATAPI-6 T13 1410D revision 1
> >> Local Time is: Tue Mar 28 12:54:35 2006 EST
> >> SMART support is: Available - device has SMART capability.
> >> SMART support is: Enabled
> >>
> >> === START OF READ SMART DATA SECTION ===
> >> SMART overall-health self-assessment test result: PASSED
> >>
> >> General SMART Values:
> >> Offline data collection status:  (0x84) Offline data collection activity
> >>                                         was suspended by an interrupting command from host.
> >>                                         Auto Offline Data Collection: Enabled.
> >> Self-test execution status:      (   0) The previous self-test routine completed
> >>                                         without error or no self-test has ever
> >>                                         been run.
> >> Total time to complete Offline
> >> data collection:                 (1980) seconds.
> >> Offline data collection
> >> capabilities:                    (0x5b) SMART execute Offline immediate.
> >>                                         Auto Offline data collection on/off support.
> >>                                         Suspend Offline collection upon new
> >>                                         command.
> >>                                         Offline surface scan supported.
> >>                                         Self-test supported.
> >>                                         No Conveyance Self-test supported.
> >>                                         Selective Self-test supported.
> >> SMART capabilities:            (0x0003) Saves SMART data before entering
> >>                                         power-saving mode.
> >>                                         Supports SMART auto save timer.
> >> Error logging capability:        (0x01) Error logging supported.
> >>                                         No General Purpose Logging support.
> >> Short self-test routine
> >> recommended polling time:        (   2) minutes.
> >> Extended self-test routine
> >> recommended polling time:        (  33) minutes.
> >>
> >> SMART Attributes Data Structure revision number: 17
> >> Vendor Specific SMART Attributes with Thresholds:
> >> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> >>   1 Raw_Read_Error_Rate     0x000f   100   099   051    Pre-fail  Always   -           0
> >>   3 Spin_Up_Time            0x0007   252   252   011    Pre-fail  Always   -           0
> >>   4 Start_Stop_Count        0x0032   252   252   000    Old_age   Always   -           0
> >>   5 Reallocated_Sector_Ct   0x0033   252   252   011    Pre-fail  Always   -           0
> >>   7 Seek_Error_Rate         0x000f   252   252   051    Pre-fail  Always   -           0
> >>   8 Seek_Time_Performance   0x0025   092   092   015    Pre-fail  Offline  -           3665
> >>   9 Power_On_Half_Minutes   0x0032   099   099   000    Old_age   Always   -           32h+43m
> >>  10 Spin_Retry_Count        0x0033   252   252   051    Pre-fail  Always   -           0
> >>  11 Calibration_Retry_Count 0x0012   252   252   000    Old_age   Always   -           0
> >>  12 Power_Cycle_Count       0x0032   252   252   000    Old_age   Always   -           0
> >> 190 Unknown_Attribute       0x0022   154   133   000    Old_age   Always   -           33
> >> 194 Temperature_Celsius     0x0022   151   133   000    Old_age   Always   -           34
> >> 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always   -           0
> >> 196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always   -           0
> >> 197 Current_Pending_Sector  0x0012   252   252   000    Old_age   Always   -           0
> >> 198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline  -           0
> >> 199 UDMA_CRC_Error_Count    0x003e   199   199   000    Old_age   Always   -           173
> >> 200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always   -           0
> >> 201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always   -           0
> >>
> >> Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
> >> SMART Error Log Version: 1
> >> No Errors Logged
> >>
> >> SMART Self-test log structure revision number 0
> >> Warning: ATA Specification requires self-test log structure revision number = 1
> >> Num  Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
> >> # 1  Short offline       Completed without error   00%        3918             -
> >> # 2  Short offline       Completed without error   00%        3894             -
> >> # 3  Extended offline    Interrupted (host reset)  30%        3870             -
> >> # 4  Short offline       Completed without error   00%        3846             -
> >> # 5  Short offline       Completed without error   00%        3822             -
> >> # 6  Short offline       Completed without error   00%        3798             -
> >> # 7  Short offline       Completed without error   00%        3774             -
> >> # 8  Short offline       Completed without error   00%        3750             -
> >> # 9  Short offline       Completed without error   00%        3726             -
> >> #10  Extended offline    Completed without error   00%        3703             -
> >> #11  Short offline       Completed without error   00%        3678             -
> >> #12  Short offline       Completed without error   00%        3654             -
> >> #13  Short offline       Completed without error   00%        3630             -
> >> #14  Short offline       Completed without error   00%        3606             -
> >> #15  Short offline       Completed without error   00%        3582             -
> >> #16  Short offline       Completed without error   00%        3558             -
> >> #17  Extended offline    Completed without error   00%        3535             -
> >> #18  Short offline       Completed without error   00%        3511             -
> >> #19  Short offline       Completed without error   00%        3487             -
> >> #20  Short offline       Completed without error   00%        3463             -
> >> #21  Short offline       Completed without error   00%        3439             -
> >>
> >> SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
> >> SMART Selective self-test log data structure revision number 0
> >> Warning: ATA Specification requires selective self-test log data structure revision number = 1
> >>  SPAN  MIN_LBA          MAX_LBA          CURRENT_TEST_STATUS
> >>     1  0                0                Not_testing
> >>     2  0                0                Not_testing
> >>     3  281479271677952  0                Not_testing
> >>     4  0                281479271767952  Not_testing
> >>     5  604800           4                Not_testing
> >> Selective self-test flags (0x0):
> >> After scanning selected spans, do NOT read-scan remainder of disk.
> >> If Selective self-test is pending on power-up, resume after 0 minute delay.