Re: [PATCH v2 14/14] net: ethernet: mtk_eth_soc: support creating mac address based offload entries

From: Felix Fietkau
Date: Tue Apr 12 2022 - 09:50:33 EST



On 12.04.22 15:07, Andrew Lunn wrote:
> > > > > I'm trying to understand the architecture here.
> > > > >
> > > > > We have an Ethernet interface and a Wireless interface. The slow path
> > > > > is that frames ingress from one of these interfaces, Linux decides
> > > > > what to do with them, either L2 or L3, and they then egress probably
> > > > > out the other interface.
> > > > >
> > > > > The hardware will look at the frames and try to spot flows? It will
> > > > > then report any it finds. You can then add an offload, telling it for
> > > > > a flow it needs to perform L2 or L3 processing, and egress out a
> > > > > specific port? Linux then no longer sees the frame, the hardware
> > > > > handles it, until the flow times out?
> > > >
> > > > Yes, the hw handles it until either the flow times out, or the corresponding
> > > > offload entry is removed.
> > > >
> > > > For OpenWrt I also wrote a daemon that uses tc classifier BPF to accelerate
> > > > the software bridge and create hardware offload entries as well via hardware
> > > > TC flower rules: https://github.com/nbd168/bridger
> > > > It works in combination with these changes.
> > >
> > > What about the bridge? In Linux, it is the software bridge which
> > > controls all this at L2, and it should be offloading the flows, via
> > > switchdev. The egress port you derive here is from the software bridge
> > > FDB?
> >
> > My code uses netlink to fetch and monitor the bridge configuration,
> > including fdb, port state, vlans, etc. and it uses that for the offload path
> > - no extra configuration needed.
>
> So this is where we get into architecture issues. Do we really want
> Linux to have two ways for setting up L2 networking? It was decided
> that users should not need to know about how to use an accelerator,
> they should not use additional tools, it should just look like
> linux. The user should just add the WiFi netdev to the bridge and
> switchdev will do the rest to offload L2 switching to the hardware.
>
> You appear to be saying you need a daemon in userspace. That is not
> how every other accelerator works in Linux networking.
>
> We the Linux network community need to decide if we want this?

The problem here is that it can't be fully transparent. Enabling hardware offload for LAN -> WiFi comes at a cost of bypassing airtime fairness and mac80211's bufferbloat mitigation.
Some people want this anyway (often but not always for benchmark/marketing purposes), but it's not something that I would want to have enabled by default simply by adding a wifi netdev to a bridge.

Initially, I wanted to put more of the state tracking code in the kernel. I made the first implementation of my acceleration code as a patch to the network bridge - speeding up bridge unicast forwarding significantly for any device regardless of hardware support. I wanted to build on that to avoid putting a lot of FDB/VLAN related tracking directly into the driver.

That approach was immediately rejected and I was told to use BPF instead.

That said, I really don't think it's a good idea to put all the code for tracking the bridge state and all possible forwarding destinations directly into the driver.

I believe the combination of doing the bridge state tracking in user space + using the standard TC API for programming offloading rules into the hardware is a reasonable compromise.
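
To make that split concrete, here is a minimal sketch of the idea. It is not bridger's actual code: it assumes a snapshot of the bridge FDB in the JSON form printed by `bridge -j fdb show` (standing in for live netlink monitoring) and derives `tc` flower commands from it. The sample data, interface names, and the helper names `learned_unicast_entries` and `flower_rules` are all hypothetical.

```python
import json

# Hypothetical FDB snapshot, shaped like `bridge -j fdb show` output.
# MAC addresses and interface names are made up for illustration.
SAMPLE_FDB_JSON = """
[
  {"mac": "02:00:00:00:00:01", "ifname": "lan1", "master": "br-lan", "flags": [], "state": ""},
  {"mac": "02:00:00:00:00:02", "ifname": "wlan0", "master": "br-lan", "flags": [], "state": ""},
  {"mac": "02:00:00:00:00:aa", "ifname": "lan1", "master": "br-lan", "flags": [], "state": "permanent"},
  {"mac": "33:33:00:00:00:01", "ifname": "lan1", "flags": ["self"], "state": "permanent"}
]
"""


def learned_unicast_entries(fdb_json, bridge):
    """Return {mac: port} for learned unicast FDB entries on `bridge`."""
    table = {}
    for entry in json.loads(fdb_json):
        if entry.get("master") != bridge:
            continue  # not an FDB entry of this bridge
        if entry.get("state") == "permanent":
            continue  # local/static address, not a forwarding destination
        mac = entry["mac"].lower()
        if int(mac.split(":")[0], 16) & 1:
            continue  # group (multicast/broadcast) bit is set
        table[mac] = entry["ifname"]
    return table


def flower_rules(table, ports):
    """Build tc argv lists that redirect frames for a known destination MAC
    from every other bridge port to the port where that MAC was learned.
    Assumes a clsact qdisc already exists on each ingress port."""
    rules = []
    for in_port in ports:
        for mac, out_port in sorted(table.items()):
            if out_port == in_port:
                continue  # never redirect a frame back to its ingress port
            rules.append([
                "tc", "filter", "add", "dev", in_port, "ingress",
                "protocol", "all", "flower", "skip_sw", "dst_mac", mac,
                "action", "mirred", "egress", "redirect", "dev", out_port,
            ])
    return rules


if __name__ == "__main__":
    fdb = learned_unicast_entries(SAMPLE_FDB_JSON, "br-lan")
    for argv in flower_rules(fdb, ["lan1", "wlan0"]):
        print(" ".join(argv))
```

The sketch only prints the commands instead of running them, and it ignores VLANs, port state, and FDB aging, all of which a real daemon would have to track and use to remove stale rules. Whether a given driver accepts such a rule with `skip_sw` depends on the hardware.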

- Felix