Re: [PATCH v2 bpf] bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES

From: Alexander Lobakin
Date: Tue Feb 14 2023 - 11:07:45 EST


From: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
Date: Tue, 14 Feb 2023 16:39:25 +0100

> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Date: Tue, 14 Feb 2023 16:24:10 +0100
>
>> On 2/13/23 3:27 PM, Alexander Lobakin wrote:

[...]

>>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in
>>> BPF_PROG_RUN")
>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
>>
>> Could you double check BPF CI? Looks like a number of XDP related tests
>> are failing on your patch which I'm not seeing on other patches where runs
>> are green, for example test_progs on several archs report the below:
>>
>> https://github.com/kernel-patches/bpf/actions/runs/4164593416/jobs/7207290499
>>
>>   [...]
>>   test_xdp_do_redirect:PASS:prog_run 0 nsec
>>   test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
>>   test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
>>   test_xdp_do_redirect:PASS:pkt_count_tc 0 nsec
>>   test_max_pkt_size:PASS:prog_run_max_size 0 nsec
>>   test_max_pkt_size:FAIL:prog_run_too_big unexpected prog_run_too_big:
>> actual -28 != expected -22
>>   close_netns:PASS:setns 0 nsec
>>   #275     xdp_do_redirect:FAIL
>>   Summary: 273/1581 PASSED, 21 SKIPPED, 2 FAILED
> Ah I see. xdp_do_redirect.c test defines:
>
> /* The maximum permissible size is: PAGE_SIZE -
> * sizeof(struct xdp_page_head) - sizeof(struct skb_shared_info) -
> * XDP_PACKET_HEADROOM = 3368 bytes
> */
> #define MAX_PKT_SIZE 3368
>
> This needs to be updated as it now became bigger. The test checks that
> this size passes and size + 1 fails, but now it doesn't.
> Will send v3 in a couple minutes.

Problem :s

This 3368/3408 assumes %L1_CACHE_BYTES is 64 and we're running on a
64-bit arch. For 32 bits the value will be bigger, also for cachelines
bigger than 64 it will be smaller (skb_shared_info has to be aligned).
Given that selftests are generic / arch-independent, how to approach
this? I added a static_assert() to test_run.c to make sure this value
is in sync to not run into the same problem in future, but then realized
it will fail on a number of architectures.

My first thought was to hardcode the worst-case value (64 bit, cacheline
is 128) in test_run.c for every architecture, but there might be more
elegant ways.

>
> Thanks,
> Olek
Thanks,
Olek