Re: pipe(2), read/write, maximums and behavior.

From: Eric Dumazet
Date: Mon Jul 06 2009 - 05:03:55 EST


Linda Walsh wrote:
> I've seen a few shells claim to limit pipe sizes to 8 512-byte buffers.
> Don't know where they get this value or how they think it applies, but
> it certainly doesn't seem to apply in Linux. However, I'm not
> sure what limits do apply compared to available memory.
> I suppose, starting off, one might look at a maximum of
> (Physical+Swap-resident-non-swappable mem)/2 as a top limit.
>
> A test machine I have has 8GB physical memory with a bit over 4GB
> of swap space making for about 12GB of memory.
>
> If total memory was to go toward my proglet that splits into a master
> writer and slave pipe reader, they'd have to split memory to have
> matching buffer read/write sizes. I'd "expect", (I think) at least
> a 2GB write/read to work, and possibly a 4GB write/read to work
> with a lot of swap activity -- that's assuming there are no other
> constraints in dividing 12GB of address space.
>
> As it turns out -- the program dies at 2GB (the 1GB write/read works)
> but when the program tries a 2GB write & read it refuses the full write
> and the child gets less than 2GB.
>
> The master gets back that it wrote 2097148KB, though it tried to
> write 2097152KB (and the child receives the 2GB-4K buffer upon read).
>
> This is on a x86_64 machine, and unsigned long values are 8-bytes
> wide and being used with the read and write calls for lengths.
>
> Shouldn't a 2GB read/write work? At most, together the master
> and slave would have only used 4GB for each to have a 2GB buffer.
>
> How would one determine the maximum size for 1 huge read or write
> through the pipe (from the pipe system call)?
>
> On 2GHz multi-core machines, I get about 512MB/s throughput.
>
> I attached the source file so anyone can see my methodology.
>
> You have to include "-lrt" on the gcc command line, as it uses
> clock_gettime to estimate the time for the write call (the read
> call always comes back with values too small to be reasonable, so
> I don't bother printing them).
>
>
>

The read()/write() system calls go through the generic vfs_read()/vfs_write()
functions, which in turn call rw_verify_area(), which limits the byte 'count'
to MAX_RW_COUNT:

#define MAX_RW_COUNT (INT_MAX & PAGE_CACHE_MASK)

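So yes, a single read() or write() is currently limited to 2GB - PAGE_SIZE
(PAGE_SIZE = 4KB on i386), even on x86_64 kernels. With 4KB pages that is
2097148KB, which matches the short write you observed.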
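
A minimal userspace sketch of the arithmetic, plus a loop that copes with the
short write. This is not kernel code: max_rw_count() and write_all() are
hypothetical helper names made up for illustration, and the page size is
passed in as a parameter rather than taken from the kernel's PAGE_CACHE_MASK.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: mirrors (INT_MAX & PAGE_CACHE_MASK) for a given
 * page size, which must be a power of two. */
static long max_rw_count(long page_size)
{
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    return (long)INT_MAX & ~(page_size - 1);
}

/* Hypothetical helper: keep calling write() until all 'len' bytes have
 * been accepted. Returns 0 on success, -1 on error (errno is set). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;
        }
        p += n;                 /* short write: advance and go again */
        len -= (size_t)n;
    }
    return 0;
}
```

With a 4096-byte page, max_rw_count() gives 2147479552 bytes, i.e. 2097148KB.
A writer that loops as in write_all() (and a reader that loops the same way
on read()) can move 2GB or more through the pipe, just not in one call.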
