[quagga-dev 3670] Re: Patch to fix multicast problem on FreeBSD 5.x

Andrew J. Schorr aschorr at telemetry-investments.com
Wed Sep 21 15:06:01 BST 2005


On Tue, Aug 16, 2005 at 08:31:49PM -0400, Greg Troxel wrote:
> > From: Nate Nielsen <nielsen-list at memberwebs.com>
> > > I wonder about keeping a flag in user space representing the state of
> > > join/leave for the group, and doing a drop first if it is recorded as
> > > joined.  The problem here is if an interface is taken down and another
> > > is brought up with the same address.  So really I think problem (1)
> > > needs to be fixed.
> > 
> > Yes, but in any case it probably won't be backported to already running
> > versions of FreeBSD, in which case the faulty behavior (pretty big deal
> > IMO) will still exist there.
> 
> Sure, but I meant internal to quagga.  Essentially similar to your
> workaround, but the drop would be done first, and only if the group
> was previously joined and had a failed drop.

In fact, I did patch the ospfd code in February to set a flag in user space to
try to track the state of join/leave.  The relevant code is in
ospfd/ospf_interface.c:ospf_if_set_multicast.  At the time, I was debating
whether a 3-way state was required: member, not member, or uncertain.  But I
ended up with just a single flag bit (member or not member).  The key issue
here is how to interpret the state if the drop fails.  At the moment, the code
assumes that we are not a member, even if the drop fails.  This seemed like
the conservative approach.  The join, by contrast, must succeed in order
for us to believe that we are a member.  If we had a 3rd state (uncertain),
then we could set the state to uncertain if the drop failed, and add logic
to leave before joining if the current state is uncertain...
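
To make the 3-state idea concrete, here is a rough sketch of the
transitions I have in mind.  None of these names exist in ospfd (the
current code keeps a single flag bit): enum mcast_state and the three
helpers are hypothetical, and the outcome of each kernel call is passed
in as a boolean rather than made here.

```c
#include <stdbool.h>

/* Hypothetical 3-way membership state (not in the current ospfd code). */
enum mcast_state {
    MCAST_NOT_MEMBER,
    MCAST_MEMBER,
    MCAST_UNCERTAIN   /* a drop failed, so we cannot tell either way */
};

/* New state after a join attempt.  Only a successful join makes us
 * believe we are a member; a failed join leaves the state unchanged. */
static enum mcast_state mcast_after_join(enum mcast_state s, bool ok)
{
    return ok ? MCAST_MEMBER : s;
}

/* New state after a drop attempt.  A failed drop from a state in which
 * we might have been a member is recorded as UNCERTAIN instead of being
 * assumed to mean "not a member". */
static enum mcast_state mcast_after_leave(enum mcast_state s, bool ok)
{
    if (ok || s == MCAST_NOT_MEMBER)
        return MCAST_NOT_MEMBER;
    return MCAST_UNCERTAIN;
}

/* Should the caller try a drop before the next join? */
static bool mcast_need_leave_before_join(enum mcast_state s)
{
    return s == MCAST_UNCERTAIN;
}
```

The caller would consult mcast_need_leave_before_join() before each
join, attempt the drop if it says so, and feed the result of each call
back through the two transition functions.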

It's also not clear to me why the kernel requires a drop and
then an add.  If we get errno of EADDRINUSE, is that saying we're
already a member and we can simply set the return code to
indicate success?

Looking at the Linux 2.6 kernel source code in
net/ipv4/igmp.c:ip_mc_join_group, it seems to me that the system call
to join should succeed if we are already a member (it just increments
a reference counter when the interface is specified by the ifindex).
What's the story in FreeBSD?

I guess the question is whether the patch belongs inside ospf_network.c
in the functions that join or leave, or whether some patch is needed
to the higher-level logic in ospf_if_set_multicast.  And I guess
there's another question of whether the
lib/sockopt.c:setsockopt_multicast_ipv4 code is working optimally.  On
Linux, it uses the ifindex instead of the address.  For non-Linux, it
looks like the behavior depends on whether HAVE_BSD_STRUCT_IP_MREQ_HACK
is set (if yes, then the ifindex is used).
The logic in configure.ac looks like this:

AC_MSG_CHECKING([for BSD struct ip_mreq hack])
AC_TRY_COMPILE([#ifdef HAVE_SYS_PARAM_H
#include <sys/param.h>
#endif],[#if (defined(__FreeBSD__) && (__FreeBSD_version >= 500022 || (__FreeBSD_version < 500000 && __FreeBSD_version >= 440000))) || (defined(__NetBSD__) && defined(__NetBSD_Version__) && __NetBSD_Version__ >= 106010000)
  return (0);
#else
  #error No support for BSD struct ip_mreq hack detected
#endif],[AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BSD_STRUCT_IP_MREQ_HACK,,[Can pass ifindex in struct ip_mreq])],
AC_MSG_RESULT(no))

Is it clear whether the ifindex is being used on the platforms
exhibiting the problem?

Regards,
Andy


