Re: [RFC PATCH net-next V2 0/6] XDP rx handler

From: Jesper Dangaard Brouer
Date: Tue Aug 14 2018 - 09:27:47 EST


On Tue, 14 Aug 2018 15:59:01 +0800
Jason Wang <jasowang@xxxxxxxxxx> wrote:

> On 2018-08-14 08:32, Alexei Starovoitov wrote:
> > On Mon, Aug 13, 2018 at 11:17:24AM +0800, Jason Wang wrote:
> >> Hi:
> >>
> >> This series tries to implement XDP support for rx handlers. This would
> >> be useful for doing native XDP on stacked devices like macvlan, bridge
> >> or even bond.
> >>
> >> The idea is simple: let a stacked device register an XDP rx handler.
> >> When the driver returns XDP_PASS, it calls a new helper, xdp_do_pass(),
> >> which tries to pass the XDP buff to the XDP rx handler directly. The XDP
> >> rx handler may then decide how to proceed: it could consume the buff,
> >> ask the driver to drop the packet, or ask the driver to fall back to
> >> the normal skb path.
> >>
> >> A sample XDP rx handler was implemented for macvlan. And virtio-net
> >> (mergeable buffer case) was converted to call xdp_do_pass() as an
> >> example. For ease of comparison, generic XDP support for rx handlers
> >> was also implemented.
> >>
> >> Compared to skb mode XDP on macvlan, native XDP on macvlan (XDP_DROP)
> >> shows about 83% improvement.
> > I'm missing the motivation for this.
> > It seems the performance of such a solution is ~1M packets per second.
>
> Note that it was measured on virtio-net, which is kind of slow.
>
> > What would be a real life use case for such a feature?
>
> I had another run on top of 10G mlx4 and macvlan:
>
> XDP_DROP on mlx4: 14.0Mpps
> XDP_DROP on macvlan: 10.05Mpps
>
> Perf shows macvlan_hash_lookup() and the indirect call to
> macvlan_handle_xdp() are the reasons for the drop in numbers. I think the
> numbers are acceptable. And we could try more optimizations on top.
>
> So the real life use case here is to have a fast XDP path for rx
> handler based devices:
>
> - For containers, we can run XDP for macvlan (~70% of wire speed). This
> allows a container-specific policy.
> - For VMs, we can implement a macvtap XDP rx handler on top. This allows
> us to forward packets to the VM without building an skb in a macvtap setup.
> - The idea could be used by other rx handler based devices like bridge;
> we may have an XDP fast forwarding path for bridge.
>
> >
> > Another concern is that XDP users expect to get line rate performance
> > and native XDP delivers it. 'generic XDP' is a fallback only
> > mechanism to operate on NICs that don't have native XDP yet.
>
> So I can replace generic XDP TX routine with a native one for macvlan.

If you simply implement ndo_xdp_xmit() for macvlan, and instead use
XDP_REDIRECT, then we are basically done.


> > Toshiaki's veth XDP work fits XDP philosophy and allows
> > high speed networking to be done inside containers after veth.
> > It's trying to get to line rate inside container.
>
> This is one of the goals of this series as well. I agree the veth XDP work
> looks pretty fine, but I believe it only works for a specific setup, since
> it depends on XDP_REDIRECT, which is supported by few drivers (and
> there's no VF driver support).

The XDP_REDIRECT (RX-side) is trivial to add to drivers. It is a bad
argument that only a few drivers implement this. Especially since all
drivers also need to be extended with your proposed xdp_do_pass() call.

(rant) The thing that is delaying XDP_REDIRECT adoption in drivers is
that it is harder to implement the TX-side, as the ndo_xdp_xmit() call
has to allocate HW TX-queue resources. If we disconnect the RX and TX
sides of redirect, then we can implement the RX-side in an afternoon.
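
To give a feel for how small the RX-side is, here is a rough userspace
model in C of the verdict handling in a driver RX loop. It is only a
sketch: rx_handle_verdict(), model_do_redirect(), struct rx_stats and
the RX_* results are hypothetical names for illustration, and XDP_TX is
deliberately not modelled. In a real driver the redirect step would be
the core xdp_do_redirect() call, with the redirect maps flushed once at
the end of the poll.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric values as enum xdp_action in <linux/bpf.h>. */
enum model_action {
	MODEL_ABORTED = 0,
	MODEL_DROP,
	MODEL_PASS,
	MODEL_TX,
	MODEL_REDIRECT,
};

/* What the RX loop does with the buffer afterwards (hypothetical). */
enum rx_result { RX_BUILD_SKB, RX_DROPPED, RX_CONSUMED };

struct rx_stats {
	unsigned int redirected;
	unsigned int dropped;
	bool need_flush;	/* flush redirect maps once per poll */
};

/* Stand-in for the core redirect call: 0 on success, negative on error. */
static int model_do_redirect(bool target_ok)
{
	return target_ok ? 0 : -1;
}

/* Map one XDP verdict to an RX-side action and update the counters. */
static enum rx_result rx_handle_verdict(int act, bool target_ok,
					struct rx_stats *st)
{
	assert(st);

	if (act == MODEL_PASS)
		return RX_BUILD_SKB;	/* continue on the normal skb path */

	if (act == MODEL_REDIRECT && model_do_redirect(target_ok) == 0) {
		st->redirected++;
		st->need_flush = true;
		return RX_CONSUMED;	/* buffer now owned by the target */
	}

	/* Drop, aborted, failed redirect or unknown verdict; TX not modelled. */
	st->dropped++;
	return RX_DROPPED;
}
```

Note that nothing here touches TX-queue resources; that part lives
entirely behind the target device's ndo_xdp_xmit().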


> And in order to make it work for an end
> user, the XDP program still needs logic like a hash(map) lookup to
> determine the destination veth.

That _is_ the general idea behind XDP and eBPF: that we need to add logic
that determines the destination. The kernel provides the basic
mechanisms for moving/redirecting packets fast, and someone else
builds an orchestration tool like Cilium that adds the needed logic.

Did you notice that we (Ahern) added bpf_fib_lookup, a FIB route lookup
accessible from XDP?

For macvlan, I imagine that we could add a BPF helper that allows you
to lookup/call macvlan_hash_lookup().
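
To make the lookup idea concrete, here is a rough userspace model in C
of what a program could do with such a helper: hash the destination
MAC, find the matching port, and redirect to it, otherwise fall through
to the normal path. Everything named here (struct mac_table,
mac_table_add(), mac_table_lookup(), decide(), the toy hash and the
VERDICT_* values) is made up for illustration; it is neither the
kernel's macvlan_hash_lookup() nor an existing BPF helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN      6
#define ETH_HDR_LEN  14
#define HASH_BUCKETS 256	/* assumption: small fixed-size table */

/* Hypothetical verdicts, loosely mirroring XDP_PASS / XDP_REDIRECT. */
enum verdict { VERDICT_PASS, VERDICT_REDIRECT };

struct mac_entry {
	uint8_t mac[MAC_LEN];
	int ifindex;		/* destination device */
	int used;
};

struct mac_table {
	struct mac_entry slot[HASH_BUCKETS];
};

/* Toy hash (FNV-1a over the 6 MAC bytes), not the kernel's hash. */
static unsigned int mac_hash(const uint8_t *mac)
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < MAC_LEN; i++) {
		h ^= mac[i];
		h *= 16777619u;
	}
	return h % HASH_BUCKETS;
}

/* Insert or update with linear probing; 0 on success, -1 if full. */
static int mac_table_add(struct mac_table *t, const uint8_t *mac, int ifindex)
{
	assert(t && mac);
	unsigned int start = mac_hash(mac);

	for (unsigned int i = 0; i < HASH_BUCKETS; i++) {
		struct mac_entry *e = &t->slot[(start + i) % HASH_BUCKETS];

		if (!e->used || memcmp(e->mac, mac, MAC_LEN) == 0) {
			memcpy(e->mac, mac, MAC_LEN);
			e->ifindex = ifindex;
			e->used = 1;
			return 0;
		}
	}
	return -1;
}

/* Returns the ifindex registered for this MAC, or -1 if unknown. */
static int mac_table_lookup(const struct mac_table *t, const uint8_t *mac)
{
	unsigned int start = mac_hash(mac);

	for (unsigned int i = 0; i < HASH_BUCKETS; i++) {
		const struct mac_entry *e = &t->slot[(start + i) % HASH_BUCKETS];

		if (!e->used)
			return -1;	/* empty slot ends the probe chain */
		if (memcmp(e->mac, mac, MAC_LEN) == 0)
			return e->ifindex;
	}
	return -1;
}

/* frame points at the Ethernet header; dest MAC is its first 6 bytes. */
static enum verdict decide(const struct mac_table *t, const uint8_t *frame,
			   size_t len, int *out_ifindex)
{
	if (len < ETH_HDR_LEN)
		return VERDICT_PASS;	/* too short to parse */

	int ifindex = mac_table_lookup(t, frame);

	if (ifindex < 0)
		return VERDICT_PASS;	/* unknown MAC: normal path */

	*out_ifindex = ifindex;
	return VERDICT_REDIRECT;
}
```

The policy (which MACs map to which device) stays with whoever fills
the table; the kernel side only needs to offer the fast lookup and the
redirect.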


> > This XDP rx handler stuff is destined to stay at 1Mpps speeds forever
> > and the users will get confused with forever slow modes of XDP.
> >
> > Please explain the problem you're trying to solve.
> > "look, here I can do XDP on top of macvlan" is not an explanation of the problem.
> >


--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer