Re: sym53c8xx syslog messages from 2.2.14

From: Gerard Roudier (groudier@club-internet.fr)
Date: Tue Feb 22 2000 - 15:02:08 EST

Next message: Julian Anastasov: "Re: accept() improvements for rt signals"
Previous message: Ingo Molnar: "Re: lowlatency-2.2.14-B1 + 2.2.14aa7 fixes crash, but..."
In reply to: Ricky Beam: "Re: sym53c8xx syslog messages from 2.2.14"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, 21 Feb 2000, Ricky Beam wrote:

> On 21 Feb 2000, Patrick J. LoPresti wrote:
> >One of our systems (running stock 2.2.14) is generating syslog
> >messages like these very frequently, especially when the machine is
> >busy. I tried lowering the TCQ depth to 8 but it did not make any
> >difference.
>
> How did you set it to 8? You may need to wack the kernel with a 2x4
> to get it to behave (read: delete sections of the scsi driver.) You
> are free to use that same 2x4 on Quantum and the scsi driver author.

For the [sym/ncr]53c8xx driver, you must refer to:

/usr/src/linux/drivers/scsi/README.ncr53c8xx

> >Could someone tell me what they mean and how I can make them stop?
>
> First off, the driver attempts to ajust the queue depth based on
> a number of factors (only one of which is the user.) On a QUEUE FULL
> condition from the drive, the driver drops the queue depth down to
> how ever many commands are currently queued. The queue depth is increased
> after 1000 commands are successful.
>
> First off, the SCSI specs say any device reporting tagged command queuing
> capabilties MUST support a minimum of 8 queued commands. Thus, setting it
> lower than 8 should never be necessary. HOWEVER, some manufacturers'
> (read: Quantum) drives "don't work right." They signal command completion
> before they are ready for the next command -- who knows what kind of crazy
> shit is going on in the microcode.

I have 1 Atlas 1, 2 Atlas 2 and 1 Atlas 4. The Atlas 4 is a lot better,
but the driver copes with all these drives.

> The Quantum Altas I was _very_ _very_ bad at handling queuing. They fixed

You get with Quantum Atlas < 4 drives in mono-initiator what you get with
any other drive in multi-initiator environment. A SCSI subsystem that can
work in multi-initiator environment must work with Quantum drives.

> some of it in later microcode, but it still lies about command completion.
> (I should be able to remove power from the drive the instant it signals
> command completion.)

I does lie only if you enable write behind caching and FUA bit is not set
for writes, which is correct.

> You will need to set the queue depth to _8_ and make sure the damned driver
> never tries to change it.

/usr/src/linux/damned_drivers/scsi/README.ncr53c8xx

> On QUEUE FULL, the driver should wait a second
> and _then_ re-queue the command. I had similar problems with a qlogicpti
> like board in a sparc20 -- in that case, it was the card's command queue
> overflowing instead of a drive, but same difference.

The suggested method is to wait for a command completion from the
offending device prior to requeuing any subsequent command to this device.
The sym53c8xx driver does shrink its internal device queue depth up to the
number of _actually_ disconnect commands which has the _same_ effect.

Then, the device queue depth will be incremented later based on some fixed
amount of _successful_ commands. This scales quite well with a very low
overhead. This has been implemented and tuned based on the behaviour of
Atlas drives.

Gérard.

> --Ricky
>
> PS: I've got some IBM drives eating 64 queued commands :-) This introduces
> some interesting behavior from the kernel thinking commands are timing
> out :-) (4M buffer on the drive, BTW)

I have such a 10000 rpm drive. I use it with 32 tags under Linux and 64
tags under FreeBSD. No problem.
Btw, the Viking II from Quantum also supports such an amount of tagged
commands (may-be only 63, but it is equivalent). This depends mostly on
the firmware.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

Next message: Julian Anastasov: "Re: accept() improvements for rt signals"
Previous message: Ingo Molnar: "Re: lowlatency-2.2.14-B1 + 2.2.14aa7 fixes crash, but..."
In reply to: Ricky Beam: "Re: sym53c8xx syslog messages from 2.2.14"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Wed Feb 23 2000 - 21:00:31 EST