Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support

From: Tobias Waldekranz
Date: Tue Apr 13 2021 - 14:16:37 EST


On Tue, Apr 13, 2021 at 17:14, Marek Behun <marek.behun@xxxxxx> wrote:
> On Tue, 13 Apr 2021 16:46:32 +0200
> Tobias Waldekranz <tobias@xxxxxxxxxxxxxx> wrote:
>
>> On Tue, Apr 13, 2021 at 02:27, Marek Behun <marek.behun@xxxxxx> wrote:
>> > On Tue, 13 Apr 2021 01:54:50 +0200
>> > Marek Behun <marek.behun@xxxxxx> wrote:
>> >
>> >> I will look into this, maybe ask some follow-up questions.
>> >
>> > Tobias,
>> >
>> > it seems that currently the LAGs in mv88e6xxx driver do not use the
>> > HashTrunk feature (which can be enabled via bit 11 of the
>> > MV88E6XXX_G2_TRUNK_MAPPING register).
>>
>> This should be set at the bottom of mv88e6xxx_lag_sync_masks.
>>
>> > If we used this feature and if we knew what hash function it uses, we
>> > could write a userspace tool that could recompute new MAC
>> > addresses for the CPU ports in order to avoid the problem I explained
>> > previously...
>> >
>> > Or the tool can simply inject frames into the switch and try different
>> > MAC addresses for the CPU ports until desired load-balancing is reached.
>> >
>> > What do you think?
>>
>> As you concluded in your followup, not being able to have a fixed MAC
>> for the CPU seems weird.
>>
>> Maybe you could do the inverse? Allow userspace to set the masks for an
>> individual bond/team port in a hash-based LAG, then you can offload that
>> to DSA.
>
> What masks?

The table defined in Global2/Register7.

When a frame is mapped to a LAG (e.g. by an ATU lookup), all member
ports will be added to the frame's destination vector. The mask table is
the block that then filters the vector to only include a single
member.
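
To make the filtering step concrete, here is a rough standalone model of
it. This is not driver code: the function name and the plain array
standing in for the mask table are made up for illustration, and it
assumes 8 hash buckets with one port-vector entry per bucket.

```c
#include <stdint.h>

#define NUM_BUCKETS 8	/* assumed: one mask entry per hash bucket */

/*
 * Model of the mask step: AND the frame's destination vector with the
 * mask entry selected by the bucket the frame hashed to. For a LAG,
 * each entry is expected to have exactly one member bit set, so only
 * that member survives.
 */
uint16_t lag_filter_dest(const uint16_t mask[NUM_BUCKETS],
			 uint16_t dest_vec, unsigned int bucket)
{
	return dest_vec & mask[bucket % NUM_BUCKETS];
}
```

With ports 1 and 2 in a LAG and the mask entries alternating between
them, a destination vector of 0x6 comes out as 0x2 for even buckets and
0x4 for odd ones.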

By modifying that table, you can choose which buckets are assigned to
which member ports. This includes assigning 7 buckets to one member and
1 to the other for example.
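
As a sketch of what such an assignment looks like in table form, the
helper below builds the 8 mask entries from a per-bucket owner list. The
owner[] representation and the function name are hypothetical, not
something the driver exposes today.

```c
#include <stdint.h>

#define NUM_BUCKETS 8	/* assumed: one mask entry per hash bucket */

/*
 * Build the mask entries for one LAG. owner[i] is the port number that
 * should carry bucket i (hypothetical representation). Non-member ports
 * stay set in every entry, so the mask only ever filters LAG members.
 */
void lag_build_masks(uint16_t all_ports, uint16_t lag_members,
		     const uint8_t owner[NUM_BUCKETS],
		     uint16_t mask[NUM_BUCKETS])
{
	unsigned int i;

	for (i = 0; i < NUM_BUCKETS; i++)
		mask[i] = (uint16_t)((all_ports & ~lag_members) |
				     (1u << owner[i]));
}
```

Passing owner = {1, 1, 1, 1, 1, 1, 1, 2} gives the 7:1 split between
ports 1 and 2.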

At the moment, mv88e6xxx will statically determine this mapping (in
mv88e6xxx_lag_set_port_mask), by trying to spread the buckets as evenly
as possible. It will also rebalance the assignments whenever a link goes
down, or is "detached" in LACP terms.
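
For illustration only, an even spread can be modelled as a round-robin
over the currently active members. This is a simplified stand-in and not
how mv88e6xxx_lag_set_port_mask is actually written; the names are made
up.

```c
#include <stdint.h>

#define NUM_BUCKETS 8	/* assumed: one mask entry per hash bucket */

/*
 * Hand out buckets round-robin to the active member ports. Calling
 * this again with the reduced member list after a link goes down
 * models the rebalancing.
 * Returns 0 on success, -1 if there is no active member.
 */
int lag_spread_buckets(const uint8_t *active, unsigned int n_active,
		       uint8_t owner[NUM_BUCKETS])
{
	unsigned int i;

	if (!active || n_active == 0)
		return -1;

	for (i = 0; i < NUM_BUCKETS; i++)
		owner[i] = active[i % n_active];

	return 0;
}
```

With three active members the split is 3/3/2; if one of them drops out,
rerunning with the remaining two gives 4/4.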

You could imagine a different mode in which the DSA driver would receive
the bucket allocation from the bond/team driver (which in turn could
come all the way from userspace). Userspace could then implement
whatever strategy it wants to maximize utilization, though still bound
by the limitations of the hardware in terms of fields considered during
hashing of course.
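
A minimal sketch of the receiving side of such a mode, assuming the
allocation arrives as a bucket count per member port. No such interface
exists in DSA or bonding today; the function and its parameters are
purely hypothetical.

```c
#include <stdint.h>

#define NUM_BUCKETS 8	/* assumed: one mask entry per hash bucket */

/*
 * Turn per-member bucket counts (as userspace might supply them) into a
 * per-bucket owner list. ports[i] is a member port number and counts[i]
 * the number of buckets it should carry.
 * Returns 0 on success, -1 if the counts do not add up to NUM_BUCKETS.
 */
int lag_apply_allocation(const uint8_t *ports, const uint8_t *counts,
			 unsigned int n_members,
			 uint8_t owner[NUM_BUCKETS])
{
	unsigned int i, j, total = 0, next = 0;

	for (i = 0; i < n_members; i++)
		total += counts[i];
	if (total != NUM_BUCKETS)
		return -1;

	for (i = 0; i < n_members; i++)
		for (j = 0; j < counts[i]; j++)
			owner[next++] = ports[i];

	return 0;
}
```

Rejecting allocations that do not cover all buckets exactly keeps every
hash result mapped to one member; what the hardware hashes on is still
outside userspace's control.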