Re: [PATCH v2 net-next 21/26] ice: add XDP and XSK generic per-channel statistics

From: Daniel Borkmann
Date: Fri Nov 26 2021 - 18:03:38 EST


On 11/26/21 11:27 PM, Daniel Borkmann wrote:
On 11/26/21 7:06 PM, Jakub Kicinski wrote:
[...]
The information required by the admin is higher level. As you say the
primary concern there is "how many packets did XDP eat".

Agree. That said, for XDP_DROP I would see one use case where you compare
different drivers or bond vs no bond as we did in the past in [0] when
testing against a packet generator (although I don't see the bond driver
covered in this series yet, where it would aggregate the XDP stats from all
bond slave devs).

At a higher level wrt "how many packets did XDP eat", it would make sense
to have the stats for successful XDP_{TX,REDIRECT} given these are out
of reach from a BPF prog PoV - we can only count there how many times we
returned with XDP_TX but not whether the pkt /successfully made it/.

In terms of error cases, could we just standardize all drivers on the behavior
of e.g. mlx5e_xdp_handle()? Meaning, a failure from XDP_{TX,REDIRECT} will
hit the trace_xdp_exception() and then fall through to bump a drop counter
(same as we bump in XDP_DROP then). So the drop counter will account for
program drops but also driver-related drops.

At some later point the trace_xdp_exception() could be extended with an error
code that the driver would propagate (given some of them look quite similar
across drivers, fwiw), and then whoever wants to do further processing with
them can do so via bpftrace or other tooling.

Just thinking out loud, one straightforward example we could start out with
that is also related to Paolo's series [1] ...

enum xdp_error {
	XDP_UNKNOWN,
	XDP_ACTION_INVALID,
	XDP_ACTION_UNSUPPORTED,
};

... and then bpf_warn_invalid_xdp_action() returns one of the latter two
which we pass to trace_xdp_exception(). Later there could be XDP_DRIVER_*
cases e.g. propagated from XDP_TX error exceptions.

[...]
	default:
		err = bpf_warn_invalid_xdp_action(act);
		fallthrough;
	case XDP_ABORTED:
xdp_abort:
		trace_xdp_exception(rq->netdev, prog, act, err);
		fallthrough;
	case XDP_DROP:
		lrstats->xdp_drop++;
		break;
	}
[...]
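
To make the control flow above a bit more concrete, here is a rough userspace
mock of it, not driver code. Everything prefixed sketch_/SK_ is made up for
illustration: the action values mirror enum xdp_action from the UAPI header,
the error enum mirrors the proposal above, and the assumption that an action
value above XDP_REDIRECT maps to "invalid" and anything else unhandled maps to
"unsupported" is mine. The tracepoint is stood in for by a plain counter.

```c
#include <stdint.h>

/* Mirrors enum xdp_action from the UAPI header (values 0..4). */
enum sketch_xdp_action {
	SK_XDP_ABORTED = 0,
	SK_XDP_DROP,
	SK_XDP_PASS,
	SK_XDP_TX,
	SK_XDP_REDIRECT,
};

/* Hypothetical error codes, following the enum proposed above. */
enum sketch_xdp_error {
	SKETCH_XDP_UNKNOWN,
	SKETCH_XDP_ACTION_INVALID,
	SKETCH_XDP_ACTION_UNSUPPORTED,
};

struct sketch_stats {
	uint64_t xdp_tx;       /* XDP_TX that the driver managed to queue */
	uint64_t xdp_redirect; /* XDP_REDIRECT that succeeded */
	uint64_t xdp_drop;     /* program drops plus driver-side failures */
	uint64_t exceptions;   /* stand-in for trace_xdp_exception() hits */
};

/* Assumed mapping: out-of-range action is invalid, else unsupported. */
static enum sketch_xdp_error sketch_warn_invalid_action(uint32_t act)
{
	return act > SK_XDP_REDIRECT ? SKETCH_XDP_ACTION_INVALID
				     : SKETCH_XDP_ACTION_UNSUPPORTED;
}

/*
 * Returns 1 if XDP consumed the frame, 0 if it goes up the stack.
 * xmit_ok fakes the result of the driver's TX/redirect attempt.
 */
static int sketch_handle(struct sketch_stats *s, uint32_t act, int xmit_ok,
			 enum sketch_xdp_error *last_err)
{
	enum sketch_xdp_error err = SKETCH_XDP_UNKNOWN;

	switch (act) {
	case SK_XDP_PASS:
		return 0;
	case SK_XDP_TX:
		if (!xmit_ok)
			goto xdp_abort;
		s->xdp_tx++;
		return 1;
	case SK_XDP_REDIRECT:
		if (!xmit_ok)
			goto xdp_abort;
		s->xdp_redirect++;
		return 1;
	default:
		err = sketch_warn_invalid_action(act);
		/* fall through */
	case SK_XDP_ABORTED:
xdp_abort:
		s->exceptions++;	/* where the tracepoint would fire */
		*last_err = err;	/* error code handed to the tracepoint */
		/* fall through */
	case SK_XDP_DROP:
		s->xdp_drop++;
		return 1;
	}
}
```

The point of the shape is that every failure path funnels through the one
exception site and then into the same drop counter as a plain XDP_DROP, so
only a plain XDP_DROP bumps the drop counter without a matching exception.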

[1] https://lore.kernel.org/netdev/cover.1637924200.git.pabeni@xxxxxxxxxx/

So overall wrt this series: from the lrstats we'd be /dropping/ the pass,
tx_errors, redirect_errors, invalid, aborted counters. And we'd be /keeping/
bytes & packets counters that XDP sees, (driver-)successful tx & redirect
counters as well as drop counter. Also, XDP bytes & packets counters should
not be counted twice wrt ethtool stats.
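
As a rough illustration of what the reduced set could look like, below is a
sketch of a per-channel struct holding only the counters kept above, plus a
helper summing them across channels. The struct, field and function names are
hypothetical and not taken from the series. The derived "pass" helper rests on
my assumption that every frame XDP sees ends up in exactly one of pass,
successful tx, successful redirect or drop; if that does not hold for a driver,
the derivation does not either.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-channel layout with only the counters being kept. */
struct sketch_xdp_chan_stats {
	uint64_t packets;  /* frames the XDP program saw */
	uint64_t bytes;    /* bytes the XDP program saw */
	uint64_t tx;       /* XDP_TX frames the driver successfully queued */
	uint64_t redirect; /* XDP_REDIRECT frames that succeeded */
	uint64_t drop;     /* program drops plus driver-side failures */
};

/* Sum the per-channel counters into one device-level view. */
static struct sketch_xdp_chan_stats
sketch_xdp_sum(const struct sketch_xdp_chan_stats *ch, size_t n)
{
	struct sketch_xdp_chan_stats sum = {0};

	for (size_t i = 0; i < n; i++) {
		sum.packets  += ch[i].packets;
		sum.bytes    += ch[i].bytes;
		sum.tx       += ch[i].tx;
		sum.redirect += ch[i].redirect;
		sum.drop     += ch[i].drop;
	}
	return sum;
}

/* Assumption: packets == pass + tx + redirect + drop for every channel. */
static uint64_t sketch_xdp_pass_derived(const struct sketch_xdp_chan_stats *s)
{
	return s->packets - s->tx - s->redirect - s->drop;
}
```

If that assumption holds, a separate pass counter carries no extra information
since it can be recomputed from the others.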

  [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9e2ee5c7e7c35d195e2aa0692a7241d47a433d1e