Re: debug: nf_conntrack and KVM crash

From: Jon Masters
Date: Sat Jan 30 2010 - 05:03:32 EST


On Sat, 2010-01-30 at 09:33 +0100, Eric Dumazet wrote:
> On Saturday, 30 January 2010 at 02:36 -0500, Jon Masters wrote:
>
> > I'll play later. Right now, I'm looking over every iptables/ip call
> > libvirt makes - it explicitly plays with the netns for the loopback,
> > which looks interesting. Supposing it does cause the hashtables to get
> > unintentionally zeroed or the sizing to get wiped out, we should also
> > nonetheless catch the case that the hash function generates a whacko
> > number or that the hash size is set to zero when we want to use it.
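
To make the check suggested above a bit more concrete, here is a rough
userspace sketch of the kind of guard I have in mind. It is not kernel
code and not how nf_conntrack is written: struct demo_table and
demo_bucket_valid() are made-up names for illustration, and the real
table layout may well differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hash table header; NOT the kernel's struct. */
struct demo_table {
	unsigned int size;	/* number of buckets; 0 means wiped or never set */
};

/*
 * Return 1 if 'bucket' is safe to use as an index into a table with
 * t->size buckets, 0 otherwise. This rejects both a zero-sized table
 * and an index that falls outside the table.
 */
static int demo_bucket_valid(const struct demo_table *t, unsigned int bucket)
{
	if (t == NULL || t->size == 0)
		return 0;
	return bucket < t->size;
}
```

A caller would test demo_bucket_valid() before touching the bucket and
bail out (or warn loudly) on failure instead of walking off into
garbage memory.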

> I asked you if you had multiple namespaces, because I was not sure
> conntracking hash was global (shared by all namespaces), or local.

Well, I didn't think I had multiple namespaces, and in fact I don't see
more than one in gdb when I poke at the net struct. What I see libvirt
doing (very oddly indeed - looking at the source now) is calling ip and
asking for the lo device to be moved into the netns for pid "-1", which
isn't valid AFAIK (should be a valid pid, unless "-1" is supposed to be
"this process" or something, haven't played with multiple namespaces).

I'll do some more digging (network stuff isn't my area) now and come
back. It only reproduces if multiple VMs start at once (hence a race,
perhaps as you describe) whereas if I disable autostart and let them
come up one at a time, the box doesn't roll over.

Jon.

