RE: [PATCH v3] bonding: force enable lacp port after link state recovery for 802.3ad

From: zhangsha (A)
Date: Wed Sep 18 2019 - 09:35:47 EST




> -----Original Message-----
> From: zhangsha (A)
> Sent: September 18, 2019 21:06
> To: jay.vosburgh@xxxxxxxxxxxxx; vfalico@xxxxxxxxx; andy@xxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> yuehaibing <yuehaibing@xxxxxxxxxx>; hunongda <hunongda@xxxxxxxxxx>;
> Chenzhendong (alex) <alex.chen@xxxxxxxxxx>; zhangsha (A)
> <zhangsha.zhang@xxxxxxxxxx>
> Subject: [PATCH v3] bonding: force enable lacp port after link state recovery for
> 802.3ad
>
> From: Sha Zhang <zhangsha.zhang@xxxxxxxxxx>
>
> After commit 334031219a84 ("bonding/802.3ad: fix slave link initialization
> transition states") was merged, the slave's link status is changed to
> BOND_LINK_FAIL from BOND_LINK_DOWN in the following scenario:
> - Driver reports loss of carrier and
> bonding driver receives NETDEV_DOWN notifier;
> - slave's duplex and speed are zeroed and
> its port->is_enabled is cleared to 'false';
> - Driver reports link recovery and
> bonding driver receives NETDEV_UP notifier;
> - If getting speed/duplex fails here, the link status
> will be changed to BOND_LINK_FAIL;
> - The MII monitor later recovers the slave's speed/duplex
> and sets link status to BOND_LINK_UP, but
> 'port->is_enabled' remains 'false'.
>
> In this scenario, the LACP port will not be enabled even though its speed and
> duplex are valid. The bond will not send LACPDUs, and its state stays
> 'AD_STATE_DEFAULTED' forever. The simplest fix, I think, is to call
> bond_3ad_handle_link_change() in bond_miimon_commit(); this function can
> enable the LACP port after checking the slave's speed.
> Once enabled, the LACP port can run its state machine normally after link
> recovery.
>
> Signed-off-by: Sha Zhang <zhangsha.zhang@xxxxxxxxxx>
> ---
> drivers/net/bonding/bond_main.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 931d9d9..76324a5 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2206,7 +2206,8 @@ static void bond_miimon_commit(struct bonding *bond)
> */
> if (BOND_MODE(bond) == BOND_MODE_8023AD &&
> slave->link == BOND_LINK_UP)
> - bond_3ad_adapter_speed_duplex_changed(slave);
> + bond_3ad_handle_link_change(slave,
> + BOND_LINK_UP);
> continue;
>
> case BOND_LINK_UP:

Hi, David,
I replied to your email a while ago, but I guess you may have missed my reply, so I am resending it.
The link below points to the previous email. Please review the new patch again, thank you.
https://patchwork.ozlabs.org/patch/1151915/

Last time, you suspected this is a driver-specific problem.
I prefer to believe it is not, because I found commit 4d2c0cda,
whose log says "Some NIC drivers don't have correct speed/duplex
settings at the time they send NETDEV_UP notification ...".

Anyway, I think the LACP status should be corrected here,
since link monitoring (miimon) sets the speed/duplex correctly at this point.
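
To make the sequence from the commit message easier to follow, here is a small standalone toy model of it. This is only a sketch, not kernel code: the struct, the enum, and every toy_* function are hypothetical names made up for illustration, and they only mimic the behaviour described above (the old call refreshes speed/duplex but leaves the port disabled, the new call enables the port once the speed is valid).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types mimic, but are not, the kernel structures. */
enum toy_link_state { TOY_LINK_UP, TOY_LINK_FAIL, TOY_LINK_DOWN };

struct toy_slave {
	enum toy_link_state link;
	int speed;          /* 0 means "unknown" in this model */
	bool port_enabled;  /* stands in for port->is_enabled */
};

/* Carrier lost: speed is zeroed and the LACP port is disabled. */
static void toy_carrier_down(struct toy_slave *s)
{
	s->link = TOY_LINK_DOWN;
	s->speed = 0;
	s->port_enabled = false;
}

/* Carrier is back, but the driver cannot report speed/duplex yet. */
static void toy_carrier_up_speed_unknown(struct toy_slave *s)
{
	s->link = TOY_LINK_FAIL;
}

/* Models the old commit path: refresh speed, never touch port_enabled. */
static void toy_speed_duplex_changed(struct toy_slave *s, int speed)
{
	s->speed = speed;
}

/* Models the patched path: refresh speed, enable the port if it is valid. */
static void toy_handle_link_change(struct toy_slave *s, int speed)
{
	s->speed = speed;
	if (s->speed > 0)
		s->port_enabled = true;
}

/* Runs the whole scenario and returns whether the port ends up enabled. */
static bool toy_simulate(bool patched)
{
	struct toy_slave s = { TOY_LINK_UP, 1000, true };

	toy_carrier_down(&s);
	toy_carrier_up_speed_unknown(&s);

	/* miimon later reads a valid speed and commits the link as up. */
	s.link = TOY_LINK_UP;
	if (patched)
		toy_handle_link_change(&s, 1000);
	else
		toy_speed_duplex_changed(&s, 1000);

	assert(s.link == TOY_LINK_UP && s.speed == 1000);
	return s.port_enabled;
}
```

In this model, toy_simulate(false) returns false (link up, speed valid, port still disabled), while toy_simulate(true) returns true, which is the state the patch is trying to reach.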

> --
> 1.8.3.1