Re: [PATCH 0/2] NFSD: fix races in service per-net resourcesallocation

From: J. Bruce Fields
Date: Mon Feb 11 2013 - 11:37:27 EST


On Mon, Feb 11, 2013 at 10:18:18AM +0400, Stanislav Kinsbursky wrote:
> This one looks a bit complicated and confusing to me. Probably because
> I'm not that familiar with service transports processing logic. So,
> as I can see, we now try to run over all per-net pool-assigned
> transports, remove them from "ready" queue and delete one by one.
> Then we try to enqueue all temporary sockets. But where in enqueueing
> of permanent sockets? I.e. how does they be destroyed with this patch?
> Then we once again try to run over all per-net pool-assigned
> transports, remove them from "ready" queue and delete one by one. Why
> twice? I.e. why not just lose them, then enqueue them and
> svc_clean_up_xprts()?

I think you missed the first svc_close_list?:

> > svc_close_list(serv, &serv->sv_permsocks, net);
> >+ svc_clean_up_xprts(serv, net);
> >+ svc_close_list(serv, &serv->sv_tempsocks, net);
> >+ svc_clean_up_xprts(serv, net);

The idea is that before we'd like to close all the listeners first, so
that they aren't busy creating more tempsocks while we're trying to
close them.

I overlooked a race, though: if another thread was already handling an
accept for one of the listeners then it might not get closed by that
first svc_clean_up_xprts.

I guess we could do something like:

delay = 0;

again:
numclosed = svc_close_list(serv, &serv->sv_permsocks, net);
numclosed += svc_close_list(serv, &serv->sv_tempsocks, net);
if (numclosed) {
svc_clean_up_xprts(serv, net);
msleep(delay++);
goto again;
}

Seems a little cheesy, but if we don't care much about shutdown
performance in a rare corner case, maybe it's the simplest way out?

--b
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/