Re: [Question] neighbor entry doesn't switch to the STALE state after the reachable timer expires

From: Julian Anastasov
Date: Mon Jan 30 2023 - 06:03:11 EST



Hello,

On Mon, 30 Jan 2023, Zhang Changzhong wrote:

> On 2023/1/30 3:43, Julian Anastasov wrote:
> >
> >> Does anyone have a good idea to solve this problem? Or are there other scenarios that might cause
> >> such a neighbor entry?
> >
> > Is the neigh entry modified somehow, for example,
> > with 'arp -s' or 'ip neigh change' ? Or is bond0 reconfigured
> > after initial setup? I mean, 4 days ago?>
>
> So far, we haven't found any user-space program that modifies the neigh
> entry or bond0.

Ouch, we do not need tools to hit the problem,
thanks to gc_thresh1.

> In fact, the neigh entry has been rarely used since initialization.
> 4 days ago, our machine just needed to download files from 172.16.1.18.
> However, the laddr has changed, and the neigh entry wrongly switched to
> NUD_REACHABLE state, causing the laddr to fail to update.

Received ARP packets should be able to change
the address. But we do not refresh the entry because
its timer is scheduled days ahead.

> > Looking at __neigh_update, there are few cases that
> > can assign NUD_STALE without touching neigh->confirmed:
> > lladdr = neigh->ha should be called, NEIGH_UPDATE_F_ADMIN
> > should be provided. Later, as you explain, it can wrongly
> > switch to NUD_REACHABLE state for long time.
> >
> > May be there should be some measures to keep
> > neigh->confirmed valid during admin modifications.
> >
>
> This problem can also occur if the neigh entry stays in NUD_STALE state
> for more than 99 days, even if it is not modified by the administrator.

I see.

> > What is the kernel version?
> >
>
> We encountered this problem in 4.4 LTS, and the mainline doesn't seem
> to fix it yet.

Yep, kernel version is irrelevant.

Here is a change that you can comment/test but
I'm not sure how many days (100?) are needed :) Not tested.

: From: Julian Anastasov <ja@xxxxxx>
Subject: [PATCH] neigh: make sure used and confirmed times are valid

Entries can linger without timer for days, thanks to
the gc_thresh1 limit. Later, on traffic, NUD_STALE entries
can switch to NUD_DELAY and start the timer which can see
confirmed time in the future causing switch to NUD_REACHABLE.
But then timer is started again based on the wrong
confirmed time, so days in the future. This is more
visible on 32-bit platforms.

Problem and the wrong state change reported by Zhang Changzhong:

172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE

350520 seconds have elapsed since this entry was last updated, but it is
still in the REACHABLE state (base_reachable_time_ms is 30000),
preventing lladdr from being updated through probe.

Fix it by ensuring timer is started with valid used/confirmed
times. There are also places that need used/updated times to be
validated.

Reported-by: Zhang Changzhong <zhangchangzhong@xxxxxxxxxx>
Signed-off-by: Julian Anastasov <ja@xxxxxx>
---
net/core/neighbour.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index a77a85e357e0..f063e8b8fb7d 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -269,7 +269,7 @@ static int neigh_forced_gc(struct neigh_table *tbl)
(n->nud_state == NUD_NOARP) ||
(tbl->is_multicast &&
tbl->is_multicast(n->primary_key)) ||
- time_after(tref, n->updated))
+ !time_in_range(n->updated, tref, jiffies))
remove = true;
write_unlock(&n->lock);

@@ -289,7 +289,13 @@ static int neigh_forced_gc(struct neigh_table *tbl)

static void neigh_add_timer(struct neighbour *n, unsigned long when)
{
+ unsigned long mint = jiffies - MAX_JIFFY_OFFSET + 86400 * HZ;
+
neigh_hold(n);
+ if (!time_in_range(n->confirmed, mint, jiffies))
+ n->confirmed = mint;
+ if (time_before(n->used, n->confirmed))
+ n->used = n->confirmed;
if (unlikely(mod_timer(&n->timer, when))) {
printk("NEIGH: BUG, double timer add, state is %x\n",
n->nud_state);
@@ -982,12 +988,14 @@ static void neigh_periodic_work(struct work_struct *work)
goto next_elt;
}

- if (time_before(n->used, n->confirmed))
+ if (time_before(n->used, n->confirmed) &&
+ time_is_before_eq_jiffies(n->confirmed))
n->used = n->confirmed;

if (refcount_read(&n->refcnt) == 1 &&
(state == NUD_FAILED ||
- time_after(jiffies, n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
+ !time_in_range_open(jiffies, n->used,
+ n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
*np = n->next;
neigh_mark_dead(n);
write_unlock(&n->lock);
--
2.39.1