[quagga-dev 9448] Re: [PATCH] zebra: fix bug #486 (lingering after IP address deletion)

Greg Troxel gdt at ir.bbn.com
Tue Jun 19 16:30:03 BST 2012


David Lamparter <equinox at opensourcerouting.org> writes:

> [More comments on http://marc.info/?l=quagga-dev&m=126504146815246&w=2
> btw.]

> The real problem is that the semantics of if_unset_prefix are not
> defined.  In reality, both the netlink version, as well as the ioctl
> version (the latter even more so!) are fully synchronous -- if they
> return success, the prefix is gone, as should be the route.  Handling
> this whole sequence of events synchronously works on all OSes and has an
> added benefit of simplicity.

It isn't really simpler, as the code has to work when other programs
make changes, and I think there's great merit in listening to the
routing socket/netlink to have zebra track kernel state as the sole
approach.

> However, I did indeed not rememer the secondary-address case and the
> patch may break that on BSD.  Can you test that?

I can test on NetBSD (not sure when I can get to it), but there are
other systems as well.

> It's not a regression in Linux btw, it's rather simply incorrect
> assumptions about the [ill-documented] netlink interface.  Linux always
> behaved this way and the bug has been open since 2008.  Zebra does

Or it's a bug in netlink that's always been there.   If we use the 'my
pid, must be an ack', then the pid should only be set in things that are
really acking what was done, not the things which happen as a consequence.

> actually get a notification of the deletions, but ignores it because
> zebra itself caused them -- it expects that the message is just a
> confirmation of something done elsewhere in the code, and that there's
> nothing more to be done there...
>
>> So I object to this change, at least for now.
>
> The "cheat" workaround for this is to wrap the chunk in an #ifdef
> HAVE_NETLINK, which, for the sake of getting this fixed sooner than
> later, might be viable to push first before fixing this "correctly".

Well, I think it's far better to wrap the change in HAVE_NETLINK,
because there's nothing wrong in the non-HAVE_NETLINK case, and it's
best not to perturb working systems.

My comment about this needing far more comments stands -- this is
complicated, and any code changes for complicated things should result
in comments in the source code (rather than the commit message) that
explain how things are supposed to work.

So if you want to make this HAVE_NETLINK conditional, I don't have any
objections (partly because I believe that you have analyzed it, and
partly because I don't understand the details well enough).

But I object to the patch as is (affecting systems other than Linux),
because I believe that the risk of breakage in cases not thought about
is high.

With your patch, what happens when you add a second address in the same
prefix, and then remove the first one?  Does the route remain?  This is
what worries me - there's an underdefined interface and zebra is
assuming what the OS will do in terms of consequences beyond what is
asked for, and things will get out of sync if that's wrong.

I tried adding two aliases in the same prefix (with ifconfig, not vtysh)
and removing the first one.  Curiously, the second address was not
reported.  But when I removed the first one I got a DELETE/DELADDR and
then a NEWADDR/ADD for the promoted address.

got message of size 80 on Tue Jun 19 11:23:21 2012
RTM_NEWADDR: address being added to iface: len 80, metric 0, flags: <CLONING>
sockaddrs: <NETMASK,IFP,IFA,BRD>
 255.255.255.0 wm1:68.5.ca.xx.xx.xx 10.254.0.1 10.254.0.255
got message of size 120 on Tue Jun 19 11:23:21 2012
RTM_ADD: Add Route: len 120, pid 0, seq 0, errno 0, flags: <UP,CLONING>
locks:  inits: 
sockaddrs: <DST,GATEWAY,NETMASK>
 10.254.0.0  255.255.255.0


got message of size 120 on Tue Jun 19 11:23:49 2012
RTM_DELETE: Delete Route: len 120, pid 0, seq 0, errno 0, flags: <CLONING>
locks:  inits: 
sockaddrs: <DST,GATEWAY,NETMASK>
 10.254.0.0  255.255.255.0
got message of size 80 on Tue Jun 19 11:23:49 2012
RTM_DELADDR: address being removed from iface: len 80, metric 0, flags: <UP,CLONING>
sockaddrs: <NETMASK,IFP,IFA,BRD>
 255.255.255.0 wm1:68.5.ca.xx.xx.xx 10.254.0.1 10.254.0.255
got message of size 80 on Tue Jun 19 11:23:49 2012
RTM_NEWADDR: address being added to iface: len 80, metric 0, flags: <CLONING>
sockaddrs: <NETMASK,IFP,IFA,BRD>
 255.255.255.0 wm1:68.5.ca.xx.xx.xx 10.254.0.2 10.254.0.255
got message of size 120 on Tue Jun 19 11:23:49 2012
RTM_ADD: Add Route: len 120, pid 0, seq 0, errno 0, flags: <UP,CLONING>
locks:  inits: 
sockaddrs: <DST,GATEWAY,NETMASK>
 10.254.0.0  255.255.255.0



It seems to be that the right fix is to process our own (same-pid)
netlink messages and not to make changes because we made a request.
Or to make the change on request (for the request, not also implied
other changes) and to process all messages but suppress the
not-found/dup errors.

But, if you want to make this ifdef NETLINK, I don't object.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 194 bytes
Desc: not available
URL: <http://lists.quagga.net/pipermail/quagga-dev/attachments/20120619/1e50b41a/attachment-0001.sig>


More information about the Quagga-dev mailing list