Re: posible latency issues in seq_read

From: Eric Dumazet
Date: Fri Jul 20 2007 - 23:47:27 EST


Chris Friesen a écrit :
Lee Revell wrote:
On 7/20/07, Chris Friesen <cfriesen@xxxxxxxxxx> wrote:

We've run into an issue (on 2.6.10) where calling "lsof" triggers lost
packets on our server. Preempt is disabled, and NAPI is enabled.

Can you reproduce with a recent kernel? Lots of latency issues have
been fixed since then.

Unfortunately I have to fix it on this version (the bug was found on shipped product), so if there was a difference I'd have to isolate the changes and backport them. Also, I can't run the software that triggers the problem on a newer kernel as it has dependencies on various patches that are not in mainline.

Basically what I'd like to know is whether calling schedule() in seq_read() is safe or whether it would break assumptions made by seq_file users.


It wont help much. seq_read() is fine in itself.

The problem is in established_get_next() and established_get_first() not allowing softirq processing, while scanning a possibly huge hash table, even if few sockets are hashed in.

As cond_resched_softirq() was added in linux-2.6.11, you probably *need* to check the diffs between linux-2.6.10 & linux-2.6.11

files :

include/linux/sched.h
net/core/sock.c (__release_sock() latency)
net/ipv4/tcp_ipv4.c (/proc/net/tcp latency)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/