Re: Any changes in Multicast code between 2.4.20 and 2.4.22/23 ? -> New Info

From: Martin Knoblauch
Date: Wed Jan 07 2004 - 09:09:43 EST


--- David Stevens <dlstevens@xxxxxxxxxx> wrote:
> Martin,
> Can you test with a recent 2.6 kernel? There have been
> a number of fixes applied there in recent months, and I
> haven't yet verified that all of those are in the 2.4 line yet.

Ahhh. That would be a major venture. It would actually be my first
encounter with building 2.6. I am afraid I cannot do that on the
machines in question. Not now :-(

> All multicast delivery is not broken, and the IGMP version is
> superficially irrelevant to multicast delivery on a local network,
> unless you have a "smart" switch that relies on IGMP reports
> to determine which ports to deliver multicasts
> on (and at the same time, one that doesn't understand IGMPv3).

No idea how smart the Cisco box is that the customer provided us with.
But it is the same switch that works with 2.4.21.

> But neither of your tcpdump traces show any IGMP traffic, meaning
> that you probably don't have "IP multicasting" turned on. That
> just means it won't do IGMP, which is fine.

Hmm. What do you mean by "do not have IP multicasting turned on"? It
is definitely compiled in.

CONFIG_IP_MULTICAST=y is set in the kernel config file. Would there be
anything needed at runtime?

> There were some unwanted side-effects in multicast delivery
> because of the source filtering but I'm pretty sure those fixes
> are in the 2.4 line.
>
> I have no idea what your application is supposed to be doing.
> Can you characterize it in some way and/or come up with a
> small test program that reproduces the failure? What are the
> receivers bound to (INADDR_ANY or the multicast address--
> "netstat -a" output would be helpful here)? Is the join done
> in the same process on the same socket or a different socket or
> different process?

The best I can do is point you to the source of Ganglia.

http://ganglia.sourceforge.net/

"netstat -a" from good/bad machines is appended. Not much to see.
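
A small stand-alone test along the lines David asks for could look
roughly like the sketch below. It is not taken from Ganglia. The group
239.2.11.71 and port 8649 are assumptions (believed to be the Ganglia
defaults), and the function names are made up for illustration. The
receiver can bind either to INADDR_ANY or to the group address, and it
joins on the same socket it receives on, which covers the variants
David asks about.

```python
import socket

GROUP = "239.2.11.71"  # assumption: Ganglia's default multicast channel
PORT = 8649            # assumption: Ganglia's default port


def make_mreq(group, iface="0.0.0.0"):
    """Build a struct ip_mreq: group address followed by the local
    interface address (0.0.0.0 lets the kernel pick the interface)."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def open_receiver(group=GROUP, port=PORT, bind_to_group=False):
    """Open a UDP socket, bind it, and join the group on that same socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to the multicast address itself, or to INADDR_ANY ("").
    s.bind((group if bind_to_group else "", port))
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return s


def send_one(group=GROUP, port=PORT, payload=b"mcast-test", ttl=1):
    """Send a single datagram to the group; TTL 1 keeps it on the local net."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    s.sendto(payload, (group, port))
    s.close()
```

On the receiving machine one would call open_receiver() and loop on
recvfrom(1500); on the sending machine, send_one(). Running the
receiver on a "good" and a "bad" kernel, once with bind_to_group=False
and once with True, should show whether the bind address matters.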

> You earlier gave /proc/net/igmp output that showed the join was
> apparently successful. Does "netstat -s" show any "drop"
> statistics going up?

"netstat -s" does not show anything that I would recognize as "drop
statistics". Basically everything looks OK. All counters that I would
associate with failures seem to be stable. In any case, output is
appended for the "bad" machine.

> Also, can you run tcpdump on both machines when it is failing
> and see if packets sent from one machine are all showing up on the
> receiver machine?

Hmm. I thought that was visible from the tcpdumps I sent?

> And, just to be thorough, are you possibly using netfilter?
>

It was compiled in. I have now removed it, but it makes no difference.

Cheers
Martin

=====
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de

Attachment: netstat-a.bad
Description: netstat-a.bad

Attachment: netstat-s.bad
Description: netstat-s.bad

Attachment: netstat-a.good
Description: netstat-a.good