Using kernel 2.2.17, I experience lots of dropped UDP packets. The setup
is as follows:
UDP packets containing measurement data are sent over 100 Mb Ethernet from
an embedded device to a Pentium III, 256 MB, IDE based PC with a 3Com
3C905B network adapter.
The UDP packets always contain 1300 bytes of data, sent at ~5Mbps. The
process on the PC which receives the data copies them directly into one
of two buffers, adding a timestamp at the end. These buffers are shared
with another process whose only purpose is to write full buffers to disk.
The network task is reniced to the highest priority, the disk task to
the lowest priority. Each of the two receive buffers is 10 MB.
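For readers who want to reproduce the setup, the receive path described above could look roughly like the sketch below. It is not the actual program: the struct capture_buf descriptor, the receive_record name and the record layout (1300 bytes of payload followed directly by a struct timeval) are assumptions made for illustration, and the shared-memory setup and the hand-over to the disk process are left out.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

#define PAYLOAD_SIZE 1300   /* from the post: every datagram carries 1300 bytes */
#define RECORD_SIZE  (PAYLOAD_SIZE + sizeof(struct timeval))

/* Hypothetical descriptor for one of the two 10 MB buffers. */
struct capture_buf {
    unsigned char *data;    /* start of the (shared) buffer */
    size_t capacity;        /* total size in bytes */
    size_t used;            /* bytes filled so far */
};

/* Receive one datagram straight into the buffer and append a timestamp.
 * Returns 1 if a record was stored, 0 if the buffer has no room left
 * (the caller would switch to the other buffer), and -1 on a receive
 * error or a datagram of unexpected size. */
int receive_record(int sock, struct capture_buf *buf)
{
    struct timeval now;
    ssize_t n;

    if (buf->capacity - buf->used < RECORD_SIZE)
        return 0;

    /* No intermediate copy: the kernel writes into the capture buffer. */
    n = recvfrom(sock, buf->data + buf->used, PAYLOAD_SIZE, 0, NULL, NULL);
    if (n != PAYLOAD_SIZE)
        return -1;

    /* Assumed layout: the timestamp follows the payload of each packet. */
    gettimeofday(&now, NULL);
    memcpy(buf->data + buf->used + PAYLOAD_SIZE, &now, sizeof now);
    buf->used += RECORD_SIZE;
    return 1;
}
```

The receiver would call receive_record in a loop and swap buffers whenever it returns 0.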
At the moment, the data I receive from the embedded device only contains
a packet counter saying: This is packet 1, this is packet 2 etc. If I
loop the data back into another of our cute embedded devices, I am able
to verify that all packets are actually sent, and that they are
transmitted at a constant interval.
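Counting drops from such a packet counter can be sketched as follows. The post does not give the counter's width or byte order, so a 32-bit unsigned counter already converted to host order is assumed here, and the function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Packets missing between two consecutively received counter values.
 * Unsigned arithmetic wraps modulo 2^32, so a counter rolling over
 * from 0xFFFFFFFF to 0 is not reported as a loss. */
uint32_t packets_lost(uint32_t prev, uint32_t cur)
{
    return cur - prev - 1u;
}

/* Total number of missing packets in a trace of received counters. */
uint64_t count_lost(const uint32_t *seq, size_t n)
{
    uint64_t lost = 0;
    size_t i;

    for (i = 1; i < n; i++)
        lost += packets_lost(seq[i - 1], seq[i]);
    return lost;
}
```

This assumes packets are never reordered or duplicated, which seems reasonable on a single Ethernet segment but is not something the counter alone can prove.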
Looking at the timestamps, it seems that the packets are dropped mainly
when the disk task calls 'write' in order to flush the buffer to disk.
This happens regardless of whether any data actually is written to the
disk. (The first few times the disk program writes data to disk, Linux
only buffers the data which are to be written, but UDP packets are lost
anyway.)
In order to solve the problem, I have tried to increase
/proc/sys/vm/freepages (to 512 1024 1536), which did not seem to yield
any better performance.
Unmasking interrupts and enabling DMA on the IDE drives seemed to roughly
halve the number of dropped packets (using /sbin/hdparm -c 1 -d 1 -k 1 -u 1
/dev/hdx), but the problem is still far worse than acceptable. The
timestamps stored in the file suggest that when the problem occurs,
'recvfrom' might take up to 1.5 seconds to return any data to the
application (reported by the difference in two calls to gettimeofday).
Even though some of the data sent to the UDP socket is buffered by the
network stack, this results in quite a large number of dropped packets.
Upgrading from 2.2.12 to 2.2.17 did not improve the situation either.
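To put the 1.5 second stall in perspective: at 5 Mbps, about 937,500 bytes (roughly 720 packets of 1300 bytes) arrive while 'recvfrom' is not being serviced, and whatever does not fit in the socket's receive buffer is dropped. The sketch below does that arithmetic and reads the receive buffer size with getsockopt(SO_RCVBUF) for comparison. The function names are made up for the example, and it assumes the ~5 Mbps figure refers to payload only.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Payload bytes that arrive while the receiver is stalled. */
double bytes_during_stall(double bitrate_bps, double stall_seconds)
{
    return bitrate_bps / 8.0 * stall_seconds;
}

/* Number of datagrams that arrive during the same stall. */
double packets_during_stall(double bitrate_bps, size_t payload_bytes,
                            double stall_seconds)
{
    return bytes_during_stall(bitrate_bps, stall_seconds)
           / (double)payload_bytes;
}

/* Receive buffer size the kernel reports for a socket, or -1 on error. */
int rcvbuf_size(int sock)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, &len) != 0)
        return -1;
    return size;
}
```

If rcvbuf_size(sock) is much smaller than bytes_during_stall(5e6, 1.5), the socket buffer cannot absorb a stall of that length.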
So how do I prevent this from happening?
Some of the stuff I have considered:
* I'm eager to test this on a SCSI based system, which is probably what
we will end up using anyway. Would this solve the problems? (as there
does not have to be any disk activity (as indicated by the HDD led) in
order for the problems to occur...)
* Upgrading to a 2.4 kernel, would this eliminate my problems?
* Replacing the network adapter with a better suited model. Any
* Hacking the kernel so that the network driver or the UDP level in the
network stack writes directly to the buffer, leaving only buffer saving
to a userland program.
* Does using raw sockets improve the situation in any way?
* The last and probably most extreme option I have considered is to use
RT Linux, and write/modify the network driver such that a RT task may
write data directly into my buffers in shared memory, leaving the disk
task for a Linux app.
Any suggestions whatsoever would be greatly appreciated. FWIW NT 4.0
running on the same hardware performs this task flawlessly, and I will
have a difficult time convincing my boss that we should use Linux as
long as it is outperformed by NT.