Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support

From: Vladimir Oltean
Date: Sun Apr 11 2021 - 19:54:06 EST


On Sun, Apr 11, 2021 at 09:50:17PM +0300, Vladimir Oltean wrote:
> On Sun, Apr 11, 2021 at 08:01:35PM +0200, Marek Behun wrote:
> > On Sat, 10 Apr 2021 15:34:46 +0200
> > Ansuel Smith <ansuelsmth@xxxxxxxxx> wrote:
> >
> > > Hi,
> > > this is a respin of Marek's series, in the hope that this time we can
> > > finally make some progress on multi-CPU port support in DSA.
> > >
> > > This implementation is similar to Marek's series, with some tweaks.
> > > It adds support for multiple CPU ports, but leaves it to the driver
> > > to decide which logic to use when assigning a CPU port to the various
> > > user ports. The driver can also express no preference, in which case
> > > the CPU port is assigned round-robin.
> >
> > In the last couple of months I have been giving some thought to this
> > problem, and came up with one important thing: if there are multiple
> > upstream ports, it would make a lot of sense to dynamically reallocate
> > them to each user port, based on which user port is actually used, and
> > at what speed.
> >
> > For example on Turris Omnia we have 2 CPU ports and 5 user ports. All
> > ports support at most 1 Gbps. Round-robin would assign:
> > CPU port 0 - Port 0
> > CPU port 1 - Port 1
> > CPU port 0 - Port 2
> > CPU port 1 - Port 3
> > CPU port 0 - Port 4
> >
> > Now suppose that the user plugs ethernet cables only into ports 0 and 2,
> > with 1, 3 and 4 free:
> > CPU port 0 - Port 0 (plugged)
> > CPU port 1 - Port 1 (free)
> > CPU port 0 - Port 2 (plugged)
> > CPU port 1 - Port 3 (free)
> > CPU port 0 - Port 4 (free)
> >
> > We end up in a situation where ports 0 and 2 share 1 Gbps bandwidth to
> > CPU, and the second CPU port is not used at all.
> >
> > A mechanism for automatic reassignment of CPU ports would be ideal here.
> >
> > What do you guys think?
>
> The reason why I don't think this is such a great idea is that the
> CPU port assignment is a major reconfiguration step which should at the
> very least be done while the network is down, to avoid races with the
> data path (something which this series does not appear to handle).
> And if you allow the static user-port-to-CPU-port assignment to change
> every time a link goes up/down, I don't think you really want to force
> the network down across the entire switch each time.
>
> So I'd be tempted to say 'tough luck' if not all of your ports are up,
> and the ones that are up happen to be statically assigned to the same
> CPU port. It's a compromise between flexibility and simplicity, and I
> would go for simplicity here. That's the most you can achieve with
> static assignment; just put the CPU ports in a LAG if you want better
> dynamic load balancing (for details read on below).

Just one more small comment: I got so carried away with describing
what I already had in mind that I forgot to fully address your idea.

I think that DSA should provide the mechanism to do what you want, but
not the policy. Meaning that you can always write a user space program
that opens a NETLINK_ROUTE (rtnetlink) socket, waits for link state
change events on it with poll(), then does whatever it wants (like
moving the static user-to-CPU port mapping around in whatever way suits
your network's requirements). The link up/down events are already
emitted, and the patch set here gives user space the rope to hang itself.

If you need inspiration, one user of the rtnetlink socket that I know of
is ptp4l:
https://github.com/richardcochran/linuxptp/blob/master/rtnl.c