Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> > AS basically does its own TCQ strangulation, which IIRC involves things
> > like completing all reads before issuing new writes, and completing all
> > reads from one process before reads from another. As well as the
> > fundamental way that waiting for a 'dependant read' throttles TCQ.
>
> My (mpt-fusion-based) workstation is still really slow when there's a lot
> of writeout happening. Just from a quick test:
>
> 2.6.12-rc2, as, tcq depth=2: 7.241 seconds
> 2.6.12-rc2, as, tcq depth=64: 12.172 seconds
> 2.6.12-rc2+patch, as, tcq depth=64: 7.199 seconds
>
> 2.6.12-rc2, cfq2, tcq depth=64: much more than 5 minutes
> 2.6.12-rc2, cfq3, tcq depth=64: much more than 5 minutes
>
> 2.6.11-rc4-mm1, as, mpt-f: 39.349 seconds
>
> That was really really slow but had a sudden burst of read I/O at the end
> which made the thing look better than it really is. I wouldn't have a clue
> what tag depth it's using, and it's the only mpt-fusion based machine I
> have handy...
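The numbers quoted above fit a simple back-of-the-envelope view of why a deep tag queue hurts AS. Here is a toy model of the effect (illustrative only, not kernel code; the per-request service time and all names are made up for this sketch): the scheduler can reorder requests it still holds, but requests already tagged to the device are out of its hands, so a read arriving during streaming writeout waits behind every write already on the device.

```python
# Toy model (not kernel code): a device services its tag queue FIFO at a
# fixed per-request cost.  A read that arrives while writes are streaming
# must wait for every write already tagged to the device -- up to the tag
# depth -- before it completes.

SERVICE_MS = 5  # assumed per-request service time, arbitrary


def read_latency_ms(tag_depth, pending_writes):
    """Completion time of a read issued while writes are streaming."""
    writes_on_device = min(tag_depth, pending_writes)
    return (writes_on_device + 1) * SERVICE_MS


# With lots of writeout queued, a shallow tag queue lets the scheduler's
# read-first policy work; a deep one defeats it.
print(read_latency_ms(2, 1000))   # -> 15
print(read_latency_ms(64, 1000))  # -> 325
```

The model is crude, but it shows the shape of the problem: the read's latency scales with the device-side queue depth, no matter how cleverly the elevator orders what it still holds.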
Well, with my current lineup on the mpt-fusion driver and no
as-limit-queue-depth.patch, that test takes 17 seconds. With
as-limit-queue-depth.patch it's down to 10 seconds, which is pretty darn
good, btw. I assume from this:
scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=222, IRQ=25
scsi1 : ioc1: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=222, IRQ=26
that it's using a tag depth of 222.
int req_depth; /* Number of request frames */
I wonder if that's true...
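One way to see what depth the device is actually running at, rather than inferring it from the probe message, is to read it back from sysfs. A sketch (this assumes a kernel that exposes the SCSI `queue_depth` attribute under `/sys/block/sd*/device/`; it prints nothing if no such disks are present):

```python
# Sketch: report the current queue depth of each SCSI disk via sysfs.
# Assumes /sys/block/sd*/device/queue_depth exists on this kernel.
from pathlib import Path

for dev in sorted(Path("/sys/block").glob("sd*")):
    attr = dev / "device" / "queue_depth"
    if attr.exists():
        print(dev.name, attr.read_text().strip())
```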
One thing which changed is that this kernel now has the fixed-up mpt-fusion
chipset tuning. That doubles the IO bandwidth, which would pretty well
account for that difference. I'll wait and see how irritating things get
under writeout load.
Yes, we'll need to decide if we want to retain as-limit-queue-depth.patch
and toss out some of the older AS logic which was designed to address the
same problem.

Steve, could you help to identify a not-too-hard-to-set-up workload at
which AS was particularly poor? Thanks.
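For reference, the general shape of the workload being discussed here, reads starved by heavy streaming writeout, can be sketched along these lines (the paths, sizes, and timings below are arbitrary placeholder choices, not anything taken from the thread):

```python
# Sketch of a read-during-writeout workload: one thread streams writes
# while the main thread times a batch of reads.  On a misbehaving
# scheduler / tag-depth combination the read pass blows out from seconds
# to minutes.
import os
import tempfile
import threading
import time

CHUNK = 1 << 20          # 1 MiB per write, arbitrary
WRITE_SECONDS = 2        # keep writeout running at least this long


def stream_writes(path, stop):
    buf = b"\0" * CHUNK
    while not stop.is_set():
        with open(path, "wb") as f:
            for _ in range(64):          # ~64 MiB per pass
                if stop.is_set():
                    break
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())


def timed_reads(paths):
    start = time.time()
    for p in paths:
        try:
            with open(p, "rb") as f:
                f.read()
        except OSError:
            pass                          # directories etc. don't matter here
    return time.time() - start


if __name__ == "__main__":
    stop = threading.Event()
    scratch = os.path.join(tempfile.gettempdir(), "writeout.dat")
    writer = threading.Thread(target=stream_writes, args=(scratch, stop))
    writer.start()
    time.sleep(WRITE_SECONDS)             # let the write queue fill up

    # read a pile of small files; /etc stands in for "some cold file set"
    targets = [os.path.join("/etc", n) for n in os.listdir("/etc")][:200]
    elapsed = timed_reads(targets)
    print(f"read pass took {elapsed:.3f} seconds under writeout")

    stop.set()
    writer.join()
    os.unlink(scratch)
```

In practice you'd want the scratch file on the disk under test and a read set large enough to defeat the page cache, but even this small version shows the pattern: measure the same read pass with and without the background writer, and with different tag depths.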