Re: Possible netlink autobind regression

From: Tejun Heo
Date: Thu Sep 17 2015 - 07:30:44 EST


Hello, Herbert.

On Thu, Sep 17, 2015 at 01:15:03PM +0800, Herbert Xu wrote:
> netlink: Fix autobind race condition that leads to zero port ID
>
> The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink:
> Reset portid after netlink_insert failure") introduced a race
> condition where if two threads tried to autobind the same socket
> one of them may end up with a zero port ID.
>
> This patch reverts that commit and instead fixes it by introducing
> a separate "bound" variable to indicate whether a user-space socket
> has been bound.
>
> Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
> Reported-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Reviewed-by: Cong Wang <cwang@xxxxxxxxxxxxxxxx>

Maybe add that this led to a deadlock and add a Link tag to this
thread?

> @@ -1083,10 +1083,12 @@ static int netlink_insert(struct sock *sk, u32 portid)
>  	if (err) {
>  		if (err == -EEXIST)
>  			err = -EADDRINUSE;
> -		nlk_sk(sk)->portid = 0;
>  		sock_put(sk);
> +		goto err;
>  	}
>
> +	nlk_sk(sk)->bound = !!portid;

The !! isn't necessary, and this introduces an ordering requirement
between the two stores: ->bound must become visible only after
->portid is visible, so this should be smp_store_release().

> @@ -2371,7 +2373,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
>  		dst_group = nlk->dst_group;
>  	}
>
> -	if (!nlk->portid) {
> +	if (!nlk->bound) {

And all unlocked reads of ->bound should use smp_load_acquire().

>  		err = netlink_autobind(sock);
>  		if (err)
>  			goto out;
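
To illustrate the release/acquire pairing suggested above, here is a
minimal userspace sketch using C11 atomics rather than the kernel
primitives. It is not the netlink code: struct demo_sock, demo_publish()
and demo_read() are hypothetical names made up for this example, and
memory_order_release / memory_order_acquire only stand in for
smp_store_release() / smp_load_acquire().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the socket; only the two fields discussed. */
struct demo_sock {
	uint32_t portid;	/* plain data, published via 'bound' */
	atomic_bool bound;	/* set only after portid has been written */
};

/* Writer side, roughly where the patch sets ->bound after the insert. */
static void demo_publish(struct demo_sock *s, uint32_t portid)
{
	s->portid = portid;
	/* Release: the portid store above is ordered before this store. */
	atomic_store_explicit(&s->bound, portid != 0, memory_order_release);
}

/* Reader side, roughly the unlocked check before autobinding. */
static bool demo_read(struct demo_sock *s, uint32_t *portid)
{
	/* Acquire: if bound is seen as true, the portid store is visible too. */
	if (!atomic_load_explicit(&s->bound, memory_order_acquire))
		return false;

	*portid = s->portid;
	return true;
}
```

A reader that sees bound as true through the acquire load is guaranteed
to also see the portid written before the release store. With plain
stores and loads that guarantee does not hold.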

Thanks.

--
tejun