Re: Linux networking and disk IO issues

From: Alan Cox (alan@lxorguk.ukuu.org.uk)
Date: Mon Jun 04 2001 - 15:02:51 EST


> * The Linux networking stack requires all skbuff buffers to be
> contiguous. As far as I can tell, this makes it impossible to
> write high-bandwidth UDP applications on Linux. For instance, the
> kernel will drop a fragmented 8KB message if it cannot allocate 8KB
> of contiguous memory to reassemble it into. I have found that it
> is relatively easy to enter regimes where this can cause massive
> packet loss.

If you are fragmenting messages then you want to optimise the protocol a bit
more. IP fragmentation increases processing overheads and reduces performance
badly in the presence of link congestion and error.
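
One way to keep IP out of the fragmentation business is to split the message
in the application so that each datagram fits in a single packet. A minimal
sketch follows; the helper name send_chunked() and the 1472-byte limit
(1500-byte Ethernet MTU less 20 bytes of IPv4 header and 8 bytes of UDP
header) are assumptions for illustration, not anything taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed payload limit: 1500-byte Ethernet MTU minus 20 bytes of IPv4
 * header and 8 bytes of UDP header. A real path may have a smaller MTU. */
#define MAX_PAYLOAD 1472

/* Hypothetical helper: send len bytes on a connected datagram socket as a
 * series of datagrams of at most MAX_PAYLOAD bytes each.
 * Returns the number of datagrams sent, or -1 on error. */
int send_chunked(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    int count = 0;

    assert(fd >= 0);
    while (len > 0) {
        size_t n = len < MAX_PAYLOAD ? len : MAX_PAYLOAD;

        if (send(fd, p, n, 0) != (ssize_t)n)
            return -1;
        p += n;
        len -= n;
        count++;
    }
    return count;
}
```

A real protocol would also put a sequence number in each datagram so the
receiver can reorder them and notice losses; that part is left out here.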

Most modern file sharing protocols are TCP based for good reason.

> * readv()/writev(). Linux serializes scatter/gather IO operations
> into an operation for each iovec entry. This is the relevant code
> from a 2.4-series kernel:

Not on a socket. On a file it makes very little difference. Socket readv/writev
behaviour varies by protocol family.
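
As an illustration of the socket case, the sketch below gathers two buffers
into one writev() call on an AF_UNIX datagram socket and reads them back as a
single datagram. The names send_gathered() and gather_demo() are made up for
this example, and it only shows the behaviour of this one protocol family;
others may differ, as noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write a header and a body with one writev() call. */
ssize_t send_gathered(int fd, const void *hdr, size_t hlen,
                      const void *body, size_t blen)
{
    struct iovec iov[2];

    assert(hdr != NULL && body != NULL);
    iov[0].iov_base = (void *)hdr;   /* writev() does not modify the data */
    iov[0].iov_len  = hlen;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = blen;
    return writev(fd, iov, 2);
}

/* Send "HDR" + "payload" over a socketpair and check that a single recv()
 * returns both pieces together. Returns 0 on success, -1 otherwise. */
int gather_demo(void)
{
    int sv[2];
    char in[64];
    ssize_t sent, got = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    sent = send_gathered(sv[0], "HDR", 3, "payload", 7);
    if (sent == 10)
        got = recv(sv[1], in, sizeof in, 0);
    close(sv[0]);
    close(sv[1]);
    if (got != 10 || memcmp(in, "HDRpayload", 10) != 0)
        return -1;
    return 0;
}
```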

> * For writes, it forces read-modify-write when the individual
> iovecs are not block-aligned.

From cache, of data live in the L1 cache of the CPU

> * There is no preadv(), pwritev(). (The pread/pwrite() system calls
> combine a llseek with a read/write system call.) This means that

True. The Single UNIX Specification does not include preadv(). Really you want
to take it up with the Open Group. That said, Linux does sometimes add syscalls
that are not in SUS.
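
In the meantime the effect can be approximated in user space with one pread()
per iovec entry. The sketch below is only that, an approximation: the names
preadv_emul() and preadv_demo() are invented for this example, and unlike a
real system call the read is not atomic across the entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical emulation: fill each iovec entry in turn with pread(),
 * starting at offset, without moving the file position.
 * Returns the total bytes read, or -1 if the first read fails. */
ssize_t preadv_emul(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
    ssize_t total = 0;

    assert(iovcnt >= 0);
    for (int i = 0; i < iovcnt; i++) {
        size_t done = 0;

        while (done < iov[i].iov_len) {
            ssize_t n = pread(fd, (char *)iov[i].iov_base + done,
                              iov[i].iov_len - done, offset);
            if (n < 0)
                return total > 0 ? total : -1;
            if (n == 0)
                return total;        /* end of file */
            done += (size_t)n;
            offset += n;
            total += n;
        }
    }
    return total;
}

/* Read "234" and "5678" from a scratch file holding "0123456789".
 * Returns 0 on success, -1 otherwise. */
int preadv_demo(void)
{
    char path[] = "/tmp/preadv_demo_XXXXXX";
    char a[3], b[4];
    struct iovec iov[2] = { { a, sizeof a }, { b, sizeof b } };
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    unlink(path);
    ok = write(fd, "0123456789", 10) == 10 &&
         preadv_emul(fd, iov, 2, 2) == 7 &&
         memcmp(a, "234", 3) == 0 && memcmp(b, "5678", 4) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```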

> * The requirement that everything about operations to raw character
> device files (length, offset in file, *and* address in memory) has
> to be 512-byte aligned is a real hassle.

Welcome to PC hardware. A large amount of PC hardware genuinely has limitations
of this nature. Most disk controllers can only write whole sectors on a sector
alignment. Many network controllers can only handle burst or 32-bit alignment
policies.
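
In user space the usual way to live with this is to allocate the buffer on a
sector boundary and round the length up to a whole number of sectors. A small
sketch, assuming a 512-byte sector; the names SECTOR_SIZE, sector_aligned()
and alloc_sector_buffer() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define SECTOR_SIZE 512   /* assumed sector size */

/* Nonzero if buffer address, file offset and length are all multiples of
 * SECTOR_SIZE, as raw device I/O requires. */
int sector_aligned(const void *buf, off_t offset, size_t len)
{
    return (uintptr_t)buf % SECTOR_SIZE == 0 &&
           offset % SECTOR_SIZE == 0 &&
           len % SECTOR_SIZE == 0;
}

/* Allocate a sector-aligned buffer of at least len bytes. The size actually
 * allocated (len rounded up to whole sectors) is stored in *rounded.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_sector_buffer(size_t len, size_t *rounded)
{
    void *p = NULL;
    size_t n = (len + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;

    assert(len > 0);
    if (posix_memalign(&p, SECTOR_SIZE, n) != 0)
        return NULL;
    if (rounded != NULL)
        *rounded = n;
    return p;
}
```

Asking for 1000 bytes this way yields a 1024-byte buffer whose address is a
multiple of 512; the file offset still has to be kept aligned by the caller.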
