Re: ide drive dying?

From: jbradford@dial.pipex.com
Date: Sat Sep 07 2002 - 01:09:50 EST


> > Now that the Smart Suite S.M.A.R.T. applications are unmaintained, would
>
> what happened?

I'm not sure, but the last update to the S.M.A.R.T. Suite website, on 3 July this year, says that the page and the applications are no longer maintained.

Seems the Beta of version 2.0 never got finished either :-(.

> > there be any chance of implementing S.M.A.R.T. into the kernel IDE code?
>
> what would be the benefit? as I understand it, smart is really
> a means of reporting long-term disk status, which is optimally done
> by user-space. even something exotic like failing over to a spare disk
> would clearly be best done in user-space.

You are right, the idea is to monitor the SMART info, ideally from when the drive is new, but at least over a period of time, so that a change in its behavior shows up.
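
As a rough user-space illustration of "a change shows up", the sketch below compares two snapshots of raw attribute values taken at different times. The function name, the dictionary layout and the tolerance parameter are all assumptions made for this example, not part of any existing tool.

```python
def changed_attributes(baseline, current, tolerance=0):
    """Return {attribute id: (old, new)} for raw values that moved by more than tolerance.

    baseline and current are hypothetical {attribute id: raw value} snapshots,
    e.g. one saved when the drive was new and one taken today.
    """
    changes = {}
    for attr_id, new in current.items():
        old = baseline.get(attr_id)
        # Attributes missing from the baseline are skipped: there is nothing to compare.
        if old is not None and abs(new - old) > tolerance:
            changes[attr_id] = (old, new)
    return changes
```

What counts as a worrying change is drive-specific, so a real monitor would probably want a per-attribute tolerance rather than a single one.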

> > I know the IDE code is already a nightmare, but it would be a nice feature.
>
> what did you have in mind?

Well, nothing very exotic, just some sanity checks on the SMART data when the IDE and SCSI interfaces are probed for devices. Something like:

* Device supports/does not support following SMART features:
  * General attributes
  * Vendor attributes
  * Error log
  * Selftest log
  * Drive info

* SMART is currently enabled/disabled

* Total power-on time is currently foo hours

* Warning if any of the following is excessive:

  * Last spin up time
  * Calibration retry count
  * UDMA CRC Error count
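
To make the list above a bit more concrete, here is a minimal user-space sketch of that kind of check, not kernel code. It assumes the commonly used layout of the 512-byte SMART data sector (a 2-byte revision followed by 30 entries of 12 bytes, with the raw value in bytes 5 to 10 of each entry) and the conventional attribute IDs 3, 9, 11 and 199. Both are vendor-defined in practice, the units of the raw values vary between drives, and the limits below are invented purely for illustration.

```python
# Conventional attribute IDs (an assumption; vendors are free to differ).
SPIN_UP_TIME = 3
POWER_ON_HOURS = 9
CALIBRATION_RETRY_COUNT = 11
UDMA_CRC_ERROR_COUNT = 199

# Hypothetical raw-value limits, chosen only for this example.
LIMITS = {
    SPIN_UP_TIME: ("Last spin up time", 10000),
    CALIBRATION_RETRY_COUNT: ("Calibration retry count", 10),
    UDMA_CRC_ERROR_COUNT: ("UDMA CRC error count", 100),
}


def parse_attributes(sector):
    """Return {attribute id: raw value} from a 512-byte SMART data sector."""
    if len(sector) != 512:
        raise ValueError("expected a 512-byte sector")
    attrs = {}
    for i in range(30):
        entry = sector[2 + 12 * i : 2 + 12 * (i + 1)]
        attr_id = entry[0]
        if attr_id == 0:  # unused slot
            continue
        # Bytes 5..10 hold the 48-bit raw value, little-endian.
        attrs[attr_id] = int.from_bytes(entry[5:11], "little")
    return attrs


def boot_report(device, attrs):
    """Build the report lines for one device from its parsed attributes."""
    lines = []
    if POWER_ON_HOURS in attrs:
        hours = attrs[POWER_ON_HOURS]
        lines.append(f"{device}: total power-on time is currently {hours} hours")
    for attr_id, (name, limit) in LIMITS.items():
        raw = attrs.get(attr_id)
        if raw is not None and raw > limit:
            lines.append(f"{device}: WARNING: {name} is excessive ({raw} > {limit})")
    return lines
```

Reading the sector from the drive is left out on purpose; the sketch only covers the sanity checks once the data is in hand.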

> > S.M.A.R.T. is terribly underused at the moment - most people don't even
> > know what it is. In fact, I could be wrong, but isn't a subset of
> > S.M.A.R.T. implemented on modern SCSI disks, too?
>
> I know that most people don't run it, but other than that, how is it
> underused?

Well, I can't see any reason for *not* using it where available. Who wouldn't appreciate a warning on boot up: 'oh, by the way, /dev/hda is about to die in a couple of days'? :-)

> > Monitoring of any kind is always a nice feature to have...
>
> certainly, though that doesn't mean it should move from userspace to
> kernel...

Agreed, there isn't any point in doing monitoring in kernelspace, but capabilities reporting and sanity checks on boot might be useful.

John.