[quagga-dev 642] Re: kernel routing socket handling problems

Greg Troxel gdt at ir.bbn.com
Wed Jan 7 21:29:16 GMT 2004

  kernel_socket.c:870: warning: declaration does not declare anything

Indeed, the line I added is wrong: it has no variable name.  I'm
stunned that this isn't a syntax error, but I suppose it is an
allowable form of forward declaration, so that one can later use a
pointer to an undefined structure.
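
To illustrate the C rule I am supposing here, a minimal sketch (the
names struct example_dl and example_dl_len are made up for the
example and are not from kernel_socket.c).  A bare "struct tag;" is
legal: it introduces an incomplete type that pointers may refer to,
but it declares no object and reserves no storage.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: introduces the tag as an incomplete type.
 * No variable is declared and no storage is reserved. */
struct example_dl;

/* A prototype may take a pointer to the still-incomplete type. */
static unsigned int example_dl_len(const struct example_dl *p);

/* The full definition can follow later (hypothetical members). */
struct example_dl {
    unsigned char len;
    unsigned char family;
};

/* Once the type is complete, its members can be accessed. */
static unsigned int example_dl_len(const struct example_dl *p)
{
    assert(p != NULL);
    return p->len;
}
```

Since such a line reserves nothing, it cannot make a buffer any
bigger, which is what I wanted the added line to do.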

  Adding a sockaddr_dl at line 870 is a hack- it is assuming that the
  kernel is going to always send a sockaddr_dl as the first address 
  following the ifa_msghdr. This assumption would quickly fail if the
  ifm_addrs contained some other flag whose value was less than RTA_IFP. 

The structures following the msghdrs are never actually referenced or
used in any way.  They are just there, as far as I can tell, to make
the buffer big enough.

I did notice that sockaddr_dl is 20 bytes on NetBSD, but have recently
seen 24 byte sockaddr_dls returned (but all the actual data is in the
first 18 bytes).   Since there is one 24-byte sockaddr_dl and no other
sockaddrs, the message fits easily in 7 or 8 * 20 bytes.  But you are
right that the approach is unsound.
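
For comparison, a sketch of the sounder approach: walk the address
bitmask instead of assuming which sockaddr comes first.  This is an
illustration under stated assumptions, not the zebra code.
SK_RTAX_MAX, sk_roundup and sk_find_addr are made-up names; real code
would use the RTAX_* constants and struct sockaddr from the system
headers.  It assumes each sockaddr starts with a one-byte length and
is padded to a multiple of sizeof(long), as on the BSDs.

```c
#include <assert.h>
#include <stddef.h>

#define SK_RTAX_MAX 8   /* stand-in for the platform's RTAX_MAX */

/* Round a sockaddr length up to a multiple of sizeof(long); a zero
 * length still occupies one long, as in the traditional BSD macro. */
static size_t sk_roundup(size_t len)
{
    return len > 0 ? 1 + ((len - 1) | (sizeof(long) - 1)) : sizeof(long);
}

/* buf points at the first sockaddr after the message header and
 * addrs is the header's bitmask (e.g. ifm_addrs).  Returns a pointer
 * to the sockaddr in slot `want`, or NULL if that slot is absent or
 * the buffer is too short. */
static const unsigned char *
sk_find_addr(const unsigned char *buf, size_t buflen, int addrs, int want)
{
    size_t off = 0;
    int i;

    assert(buf != NULL);
    for (i = 0; i < SK_RTAX_MAX; i++) {
        size_t left = buflen - off;
        size_t len, step;

        if (!(addrs & (1 << i)))
            continue;               /* slot not present in this message */
        if (left < 1)
            return NULL;            /* no room for the length byte */
        len = buf[off];             /* first byte is the sockaddr length */
        if (len > left)
            return NULL;            /* truncated sockaddr */
        if (i == want)
            return buf + off;
        step = sk_roundup(len);
        if (step > left)
            return NULL;            /* nothing further can be read */
        off += step;
    }
    return NULL;
}
```

With this, a message whose mask has a lower bit set than the one for
the link-level address is still parsed correctly, and a 24-byte
sockaddr_dl is skipped by its own length rather than by a fixed size.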

Is sockaddr_storage big enough to store sockaddr_dl on Solaris?  It
seems to be on NetBSD, and it's slightly unclear to me if the 'can
store address from any AF' is supposed to include sockaddrs that don't
really have an AF of a real PF.  If it's big enough, using
sockaddr_storage[RTAX_MAX] probably makes sense.  I'm also somewhat
inclined to rip out the whole hairy structure since all that routine
does is look at the type, but I think it's better to leave well enough
alone here, since part of the structure's point seems to be to avoid
casting to the various types (not that union overlays are any safer!).
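
A rough sketch of what the sockaddr_storage variant could look like,
purely as a way of sizing the receive buffer.  The header struct and
the names sk_msghdr, sk_rt_buf, sk_slot_holds, sk_check_layout and
SK_RTAX_MAX are placeholders I made up; the real code would use the
platform's message headers and RTAX_MAX, and would pass
sizeof(struct sockaddr_dl) as the size to check.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define SK_RTAX_MAX 8   /* stand-in for the platform's RTAX_MAX */

/* Placeholder for a routing message header such as ifa_msghdr. */
struct sk_msghdr {
    unsigned short len;
    unsigned char version;
    unsigned char type;
};

/* The address array is never indexed; it only makes the buffer large
 * enough for a header followed by up to SK_RTAX_MAX sockaddrs. */
struct sk_rt_buf {
    struct sk_msghdr hdr;
    struct sockaddr_storage addr[SK_RTAX_MAX];
};

/* Would a sockaddr of sa_size bytes fit in one storage slot? */
static int sk_slot_holds(size_t sa_size)
{
    return sa_size <= sizeof(struct sockaddr_storage);
}

/* Sanity check a daemon could run once at startup, given the size of
 * the largest sockaddr it expects on the routing socket. */
static void sk_check_layout(size_t largest_sa)
{
    assert(sk_slot_holds(largest_sa));
    assert(sizeof(struct sk_rt_buf) >=
           sizeof(struct sk_msghdr) + SK_RTAX_MAX * largest_sa);
}
```

Calling sk_check_layout with the size of sockaddr_dl on each platform
would answer the Solaris question at run time instead of by guessing.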

  So why does BSD use a 2048 sized buffer to receive messages in /sbin/route?
  I think that bsd's /sbin/route uses a 2048 sized buffer because that's
  the size of the buffer from an M_EXT mbuf.

A good point, but I've modified PF_KEY sockets to support much larger
messages.  I think it uses 2048 because that is arguably big enough
for now (probably for the reason you note) and it was easy, not
because it's the only right way.  I'm a BSD fan, yes, but certainly
not everything in the code base is correct or elegant.  Also the
/sbin/route code might predate sockaddr_storage, which I think arose
due to ISO and IPv6 addresses.

  On another note, more info on the interface index issue, and why Solaris
  uses a different index for each incarnation of an interface- 
  turns out that SNMP requires that the stats that the snmp daemons 
  maintain for an interface _must_always_increase.  In Solaris, an 
  'ifconfig <intf> unplumb; ifconfig <intf> plumb'
  pair will break this rule unless the interface index changes on 
  re-plumb.  This is probably not an issue for BSD where there's no 
  equivalent for the "unplumb" and "plumb" operations. 

That's interesting.  So the new interface, once replumbed, is actually
a _different interface_ from the Solaris and SNMP point of view.
So there is merit to zebra treating it as a new entity as well.  This
decision probably has significant ramifications in such a world.

I suppose BSD should perhaps also do this.  While there is no explicit
notion of plumbing, interfaces are destroyed when cardbus cards are
removed, etc., and things like gif can be 'create'd and 'destroy'd.
But, this rule could also be enforced by the SNMP daemon.

  interface index can
  sometimes be an unreliable identifier for identifying an interface
  within zebra, whereas interface names are likely more reliable. 

Well, I think this is a deep issue as to what is meant by an interface
- is it a new entity with the same name, or the same entity again?
The SNMP rules above say it is a new entity.

Does Solaris send messages on the routing socket announcing that
interfaces have been plumbed and unplumbed?  It would be cool for
zebra to be able to not only do 'down' processing (presumably that
notice comes first when you unplumb an up interface) but to tear down
the rest of the interface state when it is unplumbed.

  Having said that, I'll agree that lookup-by-index followed 
  by lookup-by-name is in itself not harmful (except that it's going
  to be a tad slower for Solaris in some cases), and let it slide
  even though I'm not convinced about the comments about ifindices being 
  the "primary handle" for interfaces.

OK, but I don't see it being much slower on balance - index lookup
should be much faster than searching for a name, and these messages
are fairly rare anyway.
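
The lookup order under discussion, as a small self-contained sketch.
The table layout and the names sk_iface, sk_lookup_by_index,
sk_lookup_by_name and sk_lookup are assumptions for illustration and
are not zebra's actual interface code.  Both searches are linear here
for brevity; a real table could be keyed directly by index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sk_iface {
    unsigned int ifindex;
    char name[16];
};

static struct sk_iface *
sk_lookup_by_index(struct sk_iface *tab, size_t n, unsigned int idx)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tab[i].ifindex == idx)
            return &tab[i];
    return NULL;
}

static struct sk_iface *
sk_lookup_by_name(struct sk_iface *tab, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Try the index first; fall back to the name only when the index is
 * unknown, e.g. after a Solaris unplumb/plumb assigned a new index. */
static struct sk_iface *
sk_lookup(struct sk_iface *tab, size_t n, unsigned int idx, const char *name)
{
    struct sk_iface *ifp;

    assert(tab != NULL || n == 0);
    ifp = sk_lookup_by_index(tab, n, idx);
    if (ifp == NULL && name != NULL)
        ifp = sk_lookup_by_name(tab, n, name);
    return ifp;
}
```

Whether a name-only match should then be treated as the same entity
or as a new one is the policy question above; the sketch leaves that
decision to the caller.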
