Re: BUG at net/sunrpc/svc_xprt.c:921

From: Stanislav Kinsbursky
Date: Fri Jan 18 2013 - 00:37:05 EST


18.01.2013 03:41, Mark Lord wrote:
On 13-01-17 08:24 AM, Stanislav Kinsbursky wrote:
..
This looks like the old issue I was trying to fix with "SUNRPC: protect service sockets lists during
per-net shutdown".
So, here is the problem as I see it: there is a transport which is being processed by a service
thread, and its processing is racing with the per-net service shutdown:

CPU#0:                                          CPU#1:

svc_recv                                        svc_close_net
svc_get_next_xprt (list_del_init(xpt_ready))
                                                svc_close_list (set XPT_BUSY and XPT_CLOSE)
                                                svc_clear_pools (xprt was gained on CPU#0 already)
                                                svc_delete_xprt (set XPT_DEAD)
svc_handle_xprt (is XPT_CLOSE => svc_delete_xprt())
BUG()
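
The interleaving above can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: the flag names are taken from the diagram, but struct race_xprt and every model_* function are hypothetical stand-ins, not the kernel's code, and the real functions do considerably more. The model reports the second delete by returning false where the kernel would hit BUG().

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the diagram; the bit values are arbitrary. */
enum { XPT_BUSY = 1u << 0, XPT_CLOSE = 1u << 1, XPT_DEAD = 1u << 2 };

/* Hypothetical, heavily simplified stand-in for a transport. */
struct race_xprt {
    unsigned int flags;
    bool on_ready_list;
};

/* Returns false if the transport was already dead (the BUG() case). */
static bool model_delete_xprt(struct race_xprt *x)
{
    if (x->flags & XPT_DEAD)
        return false;
    x->flags |= XPT_DEAD;
    return true;
}

/* CPU#0, first step: the service thread takes the transport off the list. */
static void model_get_next_xprt(struct race_xprt *x)
{
    x->on_ready_list = false;
}

/* CPU#1: per-net shutdown marks the transport and deletes it. */
static bool model_close_net(struct race_xprt *x)
{
    x->flags |= XPT_BUSY | XPT_CLOSE;   /* svc_close_list */
    /* svc_clear_pools finds nothing: CPU#0 already dequeued the transport */
    return model_delete_xprt(x);        /* sets XPT_DEAD */
}

/* CPU#0, second step: sees XPT_CLOSE and tries to delete again. */
static bool model_handle_xprt(struct race_xprt *x)
{
    if (x->flags & XPT_CLOSE)
        return model_delete_xprt(x);
    return true;
}

/* Replays the diagram in order; returns false when the double delete occurs. */
static bool replay_race(void)
{
    struct race_xprt x = { .flags = 0, .on_ready_list = true };

    model_get_next_xprt(&x);            /* CPU#0 */
    bool first = model_close_net(&x);   /* CPU#1 */
    assert(first);                      /* the first delete is legitimate */
    return model_handle_xprt(&x);       /* CPU#0 */
}
```

In this model replay_race() returns false: the shutdown path has already set XPT_DEAD by the time the service thread acts on XPT_CLOSE.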

So, from my POV, we need some way to:
1) Skip such in-progress transports on the svc_close_net() call (there is no way to detect them, or at
least I don't see one)
2) Delete the transport somewhere after svc_xprt_received()

But there is a problem with svc_xprt_received(): there is a call to svc_xprt_put() in it
(svc_recv->svc_handle_xprt->svc_xprt_received->svc_xprt_put). And if we are the only user, then
the transport will be destroyed. But the transport is dereferenced later in svc_recv(), after the
svc_handle_xprt() call.
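
The hazard described here can be illustrated with a toy reference counter. This is a sketch under stated assumptions, not the kernel's code: struct ref_xprt and the model_* functions are hypothetical, model_xprt_received() is reduced to the single put mentioned above, and destruction is recorded in a flag instead of freeing memory so the example stays well-defined.

```c
#include <stdbool.h>

/* Hypothetical stand-in: a bare counter plus a "destroyed" marker. */
struct ref_xprt {
    int refcount;
    bool destroyed;
};

/* Drop one reference; the last put destroys the object. */
static void model_xprt_put(struct ref_xprt *x)
{
    if (--x->refcount == 0)
        x->destroyed = true;    /* the real object would be freed here */
}

/* Reduced to the one step that matters for this hazard: the put. */
static void model_xprt_received(struct ref_xprt *x)
{
    model_xprt_put(x);
}

/*
 * Returns true if the transport is gone once model_xprt_received()
 * returns, i.e. any later dereference by the caller would touch a
 * destroyed object.
 */
static bool destroyed_after_received(int initial_refs)
{
    struct ref_xprt x = { .refcount = initial_refs, .destroyed = false };

    model_xprt_received(&x);
    return x.destroyed;
}
```

With a single holder, destroyed_after_received(1) is true; with a second holder, destroyed_after_received(2) is false. That is the "only user" case from the paragraph above.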

Sounds like a reference count type of problem/solution (kref) (?)


No, that would be too simple.
Unfortunately, the problem is more complex. In short, the problem lies in the dynamic creation/attaching
and destruction/detaching of resources (transports) for a running (!) SUNRPC service.
You have more than one NFS mount in different network namespaces, don't you?

--
Best regards,
Stanislav Kinsbursky