Re: [Question] neighbor entry doesn't switch to the STALE state after the reachable timer expires

From: Zhang Changzhong
Date: Tue Jan 31 2023 - 06:52:52 EST


On 2023/1/30 19:01, Julian Anastasov wrote:> On Mon, 30 Jan 2023, Zhang Changzhong wrote:
>
>> On 2023/1/30 3:43, Julian Anastasov wrote:
>>>
>>>> Does anyone have a good idea to solve this problem? Or are there other scenarios that might cause
>>>> such a neighbor entry?
>>>
>>> Is the neigh entry modified somehow, for example,
>>> with 'arp -s' or 'ip neigh change' ? Or is bond0 reconfigured
>>> after initial setup? I mean, 4 days ago?>
>>
>> So far, we haven't found any user-space program that modifies the neigh
>> entry or bond0.
>
> Ouch, we do not need tools to hit the problem,
> thanks to gc_thresh1.
>
>> In fact, the neigh entry has been rarely used since initialization.
>> 4 days ago, our machine just needed to download files from 172.16.1.18.
>> However, the laddr has changed, and the neigh entry wrongly switched to
>> NUD_REACHABLE state, causing the laddr to fail to update.
>
> Received ARP packets should be able to change
> the address. But we do not refresh the entry because
> its timer is scheduled days ahead.
>

Yep.

>>> Looking at __neigh_update, there are few cases that
>>> can assign NUD_STALE without touching neigh->confirmed:
>>> lladdr = neigh->ha should be called, NEIGH_UPDATE_F_ADMIN
>>> should be provided. Later, as you explain, it can wrongly
>>> switch to NUD_REACHABLE state for long time.
>>>
>>> May be there should be some measures to keep
>>> neigh->confirmed valid during admin modifications.
>>>
>>
>> This problem can also occur if the neigh entry stays in NUD_STALE state
>> for more than 99 days, even if it is not modified by the administrator.
>
> I see.
>
>>> What is the kernel version?
>>>
>>
>> We encountered this problem in 4.4 LTS, and the mainline doesn't seem
>> to fix it yet.
>
> Yep, kernel version is irrelevant.
>
> Here is a change that you can comment/test but
> I'm not sure how many days (100?) are needed :) Not tested.
>
> : From: Julian Anastasov <ja@xxxxxx>
> Subject: [PATCH] neigh: make sure used and confirmed times are valid
>
> Entries can linger without timer for days, thanks to
> the gc_thresh1 limit. Later, on traffic, NUD_STALE entries
> can switch to NUD_DELAY and start the timer which can see
> confirmed time in the future causing switch to NUD_REACHABLE.
> But then timer is started again based on the wrong
> confirmed time, so days in the future. This is more
> visible on 32-bit platforms.
>
> Problem and the wrong state change reported by Zhang Changzhong:
>
> 172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE
>
> 350520 seconds have elapsed since this entry was last updated, but it is
> still in the REACHABLE state (base_reachable_time_ms is 30000),
> preventing lladdr from being updated through probe.
>
> Fix it by ensuring timer is started with valid used/confirmed
> times. There are also places that need used/updated times to be
> validated.
>
> Reported-by: Zhang Changzhong <zhangchangzhong@xxxxxxxxxx>
> Signed-off-by: Julian Anastasov <ja@xxxxxx>
> ---
> net/core/neighbour.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index a77a85e357e0..f063e8b8fb7d 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -269,7 +269,7 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> (n->nud_state == NUD_NOARP) ||
> (tbl->is_multicast &&
> tbl->is_multicast(n->primary_key)) ||
> - time_after(tref, n->updated))
> + !time_in_range(n->updated, tref, jiffies))
> remove = true;
> write_unlock(&n->lock);
>
> @@ -289,7 +289,13 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>
> static void neigh_add_timer(struct neighbour *n, unsigned long when)
> {
> + unsigned long mint = jiffies - MAX_JIFFY_OFFSET + 86400 * HZ;
> +
> neigh_hold(n);
> + if (!time_in_range(n->confirmed, mint, jiffies))
> + n->confirmed = mint;
> + if (time_before(n->used, n->confirmed))
> + n->used = n->confirmed;
> if (unlikely(mod_timer(&n->timer, when))) {
> printk("NEIGH: BUG, double timer add, state is %x\n",
> n->nud_state);
> @@ -982,12 +988,14 @@ static void neigh_periodic_work(struct work_struct *work)
> goto next_elt;
> }
>
> - if (time_before(n->used, n->confirmed))
> + if (time_before(n->used, n->confirmed) &&
> + time_is_before_eq_jiffies(n->confirmed))
> n->used = n->confirmed;
>
> if (refcount_read(&n->refcnt) == 1 &&
> (state == NUD_FAILED ||
> - time_after(jiffies, n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
> + !time_in_range_open(jiffies, n->used,
> + n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
> *np = n->next;
> neigh_mark_dead(n);
> write_unlock(&n->lock);
>

Thanks for your fix!

I reproduced this problem by directly modifying the confirmed time of
the neigh entry in STALE state. Your change can fix the problem in this
scenario.

Just curious, why did you choose 'jiffies - MAX_JIFFY_OFFSET + 86400 * HZ'
as the value of 'mint'?

Regards,
Changzhong