Re: [netlink] WARNING: at mm/vmalloc.c:1487 __vunmap()

From: Eric Dumazet
Date: Thu Jun 27 2013 - 04:30:48 EST


On Thu, 2013-06-27 at 10:22 +0200, Pablo Neira Ayuso wrote:
> Hi Eric,
>
> Thanks for looking into this.
>
> On Wed, Jun 26, 2013 at 05:42:38AM -0700, Eric Dumazet wrote:
> [...]
> > Nope there are several issues :
> >
> > 1) bug in netlink_alloc_large_skb() because it doesn't account
> > for sizeof(struct skb_shared_info) overhead and initialization.
>
> Indeed, I can send a fix for this.
>
> > 2) Also, skb_clone() on such skb should be forbidden.
> >
> > Example, nl_fib_input() does a nskb = skb_clone(skb)
> >
> > If skb is freed before nskb, then nskb wont know skb->head must be freed
> > by vfree()
> >
> > I don't know...
> >
> > 3) Do we really need this vmalloc stuff, because it sounds like we are
> > going to add yet another test in fast path (in skb_free_head())
>
> We want to send atomic rule-set updates via netlink in one single
> batch message to kernel space. Without vmalloc, I can send up to
> ~20000 rule updates in one single batch.
>
> We considered splitting the updates in smaller batches to make netlink
> happy, but then a process has to own the rule-set base until it has
> finished the update to avoid any interference. However, a broken
> user-space program may (ab)use such ownership to prevents others from
> updating the rule-set.
>
> > 4) Or we must track all skb_clone() netlink calls to attach a destructor
> > to properly to the vfree()
>
> Perhaps we can add a new specific function for this, netlink_skb_clone?
>

You have also to track the kfree_skb() calls done before
you set skb->destructor.

Or set skb->destructor right after netlink_alloc_large_skb()

> I'll be fine to track skb_clone in existing netlink families and
> replace it by such call in case you don't find this solution too
> hackish.

Let see if you can do that, I'll test and review the patches.

I suggest you use build_skb() as in :

static struct sk_buff *netlink_alloc_large_skb(unsigned int size, bool broadcast)
{
struct sk_buff *skb;
void *data;

if (size <= NLMSG_GOODSIZE || broadcast)
return alloc_skb(size, GFP_KERNEL);

size = SKB_DATA_ALIGN(size) +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));

data = vmalloc(size);
if (!data)
return NULL;
skb = build_skb(data, size);
if (!skb)
vfree(data);
else
skb->head_frag = 0;
return skb;
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/