Re: Linux 2.6.29

From: Jeff Garzik
Date: Mon Mar 30 2009 - 15:42:53 EST


Rik van Riel wrote:
> Linus Torvalds wrote:
>> And my point is, IT MAKES SENSE to just do the elevator barrier, _without_ the drive command.
>
> No argument there. I have seen NCQ starvation on SATA disks,
> with some requests sitting in the drive for seconds, while
> the drive was busy handling hundreds of requests/second
> elsewhere...

If certain requests are hanging out in the drive's write-back cache (wbcache) longer than others, that increases the probability that the ordering the filesystem requires, and the elevator provides, becomes skewed once requests are passed to drive firmware.

The sad, sucky fact is that NCQ starvation implies FLUSH CACHE is more important than ever, if filesystems want to get ordering correct.
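
To make the ordering requirement concrete, here is a minimal userspace sketch of the "data first, then commit record" pattern a journaling filesystem depends on. It is an analogy, not block-layer code: ordered_commit() and write_all() are hypothetical names, and whether fsync() actually results in a FLUSH CACHE reaching the drive depends on the filesystem and its barrier settings.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf at offset off; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: write the data, force it out, and only then
 * write the commit record that declares the data valid.  Without the
 * first fsync() the commit record may become durable before the data
 * it describes.
 */
int ordered_commit(const char *path,
                   const void *data, size_t data_len,
                   const void *commit, size_t commit_len)
{
    int rc = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    if (write_all(fd, data, data_len, 0) == 0 &&
        fsync(fd) == 0 &&                 /* ordering point: data before commit */
        write_all(fd, commit, commit_len, (off_t)data_len) == 0 &&
        fsync(fd) == 0)                   /* make the commit record durable */
        rc = 0;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

If the drive is free to reorder what sits in its write cache, the first fsync() is only as strong as the flush it triggers, which is the point being made above.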

IDEALLY, according to the SATA protocol spec, we could issue up to 32 NCQ commands to a SATA drive, each marked with the "FUA" bit to force the command to hit permanent media before returning.

In theory, this NCQ+FUA mode gives the drive maximum ability to optimize parallel in-progress commands, decoupling command completion and command issue -- while also giving the OS complete control of ordering by virtue of emptying the SATA tagged command queue.

In practice, NCQ+FUA flat out did not work on early drives, and performance was way under what you would expect for parallel write-thru command execution. I haven't benchmarked NCQ+FUA in a few years; it might be worth revisiting.
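
For reference, a rough sketch of what an NCQ write with FUA looks like at the taskfile level. This is a standalone illustration, not libata code: struct ncq_taskfile and build_ncq_fua_write() are made-up names, and the register layout (tag in bits 7:3 of the count register, sector count in the feature registers, FUA in bit 7 of the device register) is my reading of WRITE FPDMA QUEUED, so check it against the ATA spec before relying on it.

```c
#include <stdint.h>

#define CMD_WRITE_FPDMA_QUEUED 0x61      /* NCQ write opcode */
#define DEV_LBA                (1u << 6) /* LBA addressing */
#define DEV_FUA                (1u << 7) /* force unit access */
#define NCQ_MAX_TAGS           32

/* Hypothetical, simplified taskfile; field names loosely follow libata. */
struct ncq_taskfile {
    uint8_t command;
    uint8_t feature, hob_feature;         /* sector count, low/high byte */
    uint8_t nsect;                        /* NCQ tag in bits 7:3 */
    uint8_t lbal, lbam, lbah;             /* LBA bits 0-23 */
    uint8_t hob_lbal, hob_lbam, hob_lbah; /* LBA bits 24-47 */
    uint8_t device;
};

/* Fill tf for a queued FUA write; returns 0 on success, -1 on bad input. */
int build_ncq_fua_write(struct ncq_taskfile *tf, unsigned int tag,
                        uint64_t lba, uint32_t n_block)
{
    if (!tf || tag >= NCQ_MAX_TAGS)
        return -1;
    if (lba >> 48)                        /* only 48 LBA bits are available */
        return -1;
    if (n_block == 0 || n_block > 65536)  /* 65536 is encoded as 0 */
        return -1;

    tf->command     = CMD_WRITE_FPDMA_QUEUED;
    tf->nsect       = (uint8_t)(tag << 3);
    tf->feature     = n_block & 0xff;
    tf->hob_feature = (n_block >> 8) & 0xff;

    tf->lbal     = lba & 0xff;
    tf->lbam     = (lba >> 8) & 0xff;
    tf->lbah     = (lba >> 16) & 0xff;
    tf->hob_lbal = (lba >> 24) & 0xff;
    tf->hob_lbam = (lba >> 32) & 0xff;
    tf->hob_lbah = (lba >> 40) & 0xff;

    /* FUA asks that this write reach media before completion is reported. */
    tf->device = DEV_LBA | DEV_FUA;
    return 0;
}
```

With up to 32 of these outstanding, each tag would complete only once its own data is on media, so the OS could order dependent writes by waiting on specific completions rather than issuing a full FLUSH CACHE.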

Jeff


