Re: [PATCH] tools/testing/selftests/bpf/test_tc_tunnel.sh: Prevent client connect before server bind

From: Martin KaFai Lau
Date: Wed Mar 13 2024 - 19:44:52 EST


On 3/10/24 1:45 AM, Alessandro Carminati wrote:
Hi Martin,
Thanks for the review.

Il giorno ven 8 mar 2024 alle ore 02:03 Martin KaFai Lau
<martin.lau@xxxxxxxxx> ha scritto:

On 2/29/24 6:00 AM, Alessandro Carminati (Red Hat) wrote:
In some systems, the netcat server can incur in delay to start listening.
When this happens, the test can randomly fail in various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.

This explained what is the issue. Please also explain how the patch solves it.

The issue, as stated, depends on a race condition between the netcat client
and server. The test author addressed this problem using a sleep, which I
removed in this patch. To easily solve the issue, one could simply increase
the sleep duration. However, this patch opts to tackle the problem by
querying the /proc directory and verifying TCP binds at the specified port
before letting the client connect.

Please include this in the commit message.


Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@xxxxxxxxx>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 910044f08908..01c0f4b1a8c2 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,7 +72,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}

client_connect() {
@@ -93,6 +92,22 @@ verify_data() {
fi
}

+wait_for_port() {
+ local digits=8
+ local port2check=$(printf ":%04X" $1)
+ local prot=$([ "$2" == "-6" ] && echo 6 && digits=32)
+
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" cat /proc/net/tcp${prot} | \
+ sed -r 's/^[ \t]+[0-9]+: ([0-9A-F]{'${digits}'}:[0-9A-F]{4}) .*$/\1/' | \
+ grep -q "${port2check}"; then

The idea is to check if there is socket listening on port 8888?

May be something simpler like "ss -OHtl src :$1" instead?
Indeed, the aim is to ensure that the server is bound before the
client attempts to
connect by checking if socket is listening, and yes using 'ss' would be shorter.
However, I chose not to use 'ss' or 'netstat' to avoid adding new dependencies,
considering they are already many.

ss should be in the same iproute package that "(ip) netns..." lives in also. The above changes added sed and grep.

Regardless, this external dependency will be all gone once moved to the selftests/test_progs. The check-and-wait will be gone by creating a listen fd instead of using "nc -l...". A simpler change such that people doing the future test_progs migration will have an easier time.

The check-and-wait fix in this patch is fine to get your test environment going.

Eventually, it will be good to see the test_tc_tunnel.sh test moved to
test_progs. The test_tc_tunnel.sh is not run by bpf CI and issue like this got
unnoticed. Some other "*.sh" tests have already been moved to test_progs.


Regards
Alessandro