Re: [00/17] Large Blocksize Support V3

From: Jeremy Higdon
Date: Thu Apr 26 2007 - 20:19:30 EST


On Thu, Apr 26, 2007 at 11:50:33PM +1000, David Chinner wrote:
> On Thu, Apr 26, 2007 at 04:10:32AM -0600, Eric W. Biederman wrote:
> > > And then there's the problem that most hardware is limited to 128
> > > s/g entries and that means 128 non-contiguous pages in memory is the
> > > maximum I/O size we can issue to these devices. We have RAID arrays
> > > that go twice as fast if we can send them 1MB I/Os instead of 512k
> > > I/Os and that means we need contiguous pages to be handled to the
> > > devices....
> >
> > Ok. Now why are high end hardware manufacturers building crippled
> > hardware? Or is there only an 8bit field in SCSI for describing
> > scatter gather entries? Although I would think this would be
> > move of a controller ranter than a drive issue.
>
> scsi.h:
>
> /*
> * The maximum sg list length SCSI can cope with
> * (currently must be a power of 2 between 32 and 256)
> */
> #define SCSI_MAX_PHYS_SEGMENTS MAX_PHYS_SEGMENTS
>
> And from blkdev.h:
>
> #define MAX_PHYS_SEGMENTS 128
> #define MAX_HW_SEGMENTS 128
>
> So currentlt on SCSI we are limited to 128 s/g entries, and the
> maximum is 256. So I'd say we've got good grounds for needing
> contiguous pages to go beyond 1MB I/O size on x86_64.

Right, and there are also RAID devices that really want a 2 MiB I/O
size. Even if we could use 512 s/g entries (which would take two
pages), the other big problem is that many I/O chips/cards are limited
in the amount of space they have for s/g lists. So, you'd face the
possibility that you could do a 2MiB I/O request with 512 s/g entries,
but then you couldn't start a second request on that host until the
first one finished.

jeremy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/