Re: [PATCH] net/xfrm/xfrm_ipcomp: Use {get,put}_cpu_light

From: Sebastian Andrzej Siewior
Date: Tue Aug 13 2019 - 11:59:05 EST


On 2019-07-17 09:20:19 [+0200], Juri Lelli wrote:
> The following BUG has been reported while running ipsec tests.
…
> Hi,
>
> This has been found on a 4.19.x-rt kernel, but 5.x-rt(s) are affected as
> well.
>
> Best,
>
> Juri
> ---
> net/xfrm/xfrm_ipcomp.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
> index a00ec715aa46..39d9e663384f 100644
> --- a/net/xfrm/xfrm_ipcomp.c
> +++ b/net/xfrm/xfrm_ipcomp.c
> @@ -45,7 +45,7 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
> const int plen = skb->len;
> int dlen = IPCOMP_SCRATCH_SIZE;
> const u8 *start = skb->data;
> - const int cpu = get_cpu();
> + const int cpu = get_cpu_light();

get_cpu_light() only disables migration, not preemption, so it does not
prevent another task from invoking ipcomp_decompress() on the same CPU.
That means that

> u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);

the scratch buffer here could be used by two tasks on the same CPU at the
same time. You are aware of that, right?
According to your backtrace you get here from NAPI, which means BH
context, so it is enough to use smp_processor_id() in such a case.

ipcomp_compress() is using the very same buffer while invoking
local_bh_disable() before using the buffer to ensure nothing else is
using the buffer on this CPU. This will work in v5.2-RT because the new
softirq code uses a local_lock() as part of local_bh_disable(). This
does not work on v4.19-RT and earlier.

For v4.19 and earlier I suggest using a local_lock().
For v5.2 and later I suggest replacing get_cpu() with
smp_processor_id(), ideally with a lockdep annotation to ensure that
BH is disabled (which we don't have).
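
To illustrate the locking idea only: the sketch below is a userspace
analogy, not kernel code and not the proposed patch. A pthread mutex
stands in for the local_lock(), a single static array stands in for one
CPU's scratch buffer, and every name in it (scratch_lock, use_scratch(),
run_workers(), the sizes) is made up for the example. Each worker fills
the buffer with its own pattern and checks that nobody overwrote it
while it held the lock.

```c
#include <pthread.h>
#include <string.h>

#define SCRATCH_SIZE 64     /* hypothetical, much smaller than the real buffer */
#define NR_WORKERS   4
#define ITERATIONS   10000

/* Stand-in for one CPU's scratch buffer and the lock that guards it. */
static char scratch[SCRATCH_SIZE];
static pthread_mutex_t scratch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Fill the buffer with one pattern, then verify it is still intact. */
static int use_scratch(char pattern)
{
	int ok = 1;

	pthread_mutex_lock(&scratch_lock);
	memset(scratch, pattern, SCRATCH_SIZE);
	for (int i = 0; i < SCRATCH_SIZE; i++)
		if (scratch[i] != pattern)
			ok = 0;
	pthread_mutex_unlock(&scratch_lock);
	return ok;
}

struct worker {
	char pattern;
	int corrupted;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	for (int i = 0; i < ITERATIONS; i++)
		if (!use_scratch(w->pattern))
			w->corrupted++;
	return NULL;
}

/*
 * Run all workers against the shared buffer. Returns the number of
 * times a worker saw foreign data in the buffer, or -1 if a thread
 * could not be created.
 */
static int run_workers(void)
{
	pthread_t tid[NR_WORKERS];
	struct worker w[NR_WORKERS];
	int created = 0, failed = 0, total = 0;

	for (int i = 0; i < NR_WORKERS; i++) {
		w[i].pattern = (char)('A' + i);
		w[i].corrupted = 0;
		if (pthread_create(&tid[i], NULL, worker_fn, &w[i]) != 0) {
			failed = 1;
			break;
		}
		created++;
	}
	/* Join only the threads that were actually started. */
	for (int i = 0; i < created; i++) {
		pthread_join(tid[i], NULL);
		total += w[i].corrupted;
	}
	return failed ? -1 : total;
}
```

Calling run_workers() from a main() (build with -pthread) should report
no corrupted buffers as long as the lock is taken around every use.
Dropping the lock/unlock pair in use_scratch() gives the situation
described above, where two tasks can end up in the same buffer.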

> struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
> int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);

Sebastian