[quagga-dev 5314] Re: intermittent communication between bgpd and zebra?

Ray Barnes rrb at colo4jax.com
Mon Apr 21 14:02:20 BST 2008

Just a followup after poking around with the code; comments inline:

On 4/18/08, Andrew J. Schorr <aschorr at telemetry-investments.com> wrote:
> Hi Ray,
> On Fri, Apr 18, 2008 at 04:24:40PM -0400, Ray Barnes wrote:
> > Thanks Andy.  The issue is that bgpd is simply not aggressive enough in its
> > communication attempts toward zebra.  If it made an attempt once every 2-3
> > seconds, that'd surely satisfy my requirement for HA/failover using BGP.
> > But even with 'watchquagga', which I understand to simply restart quagga
> > daemons if they fail, that solution is not adequate.  Per my previous
> > message, if zebra is restarted for some reason, it will not receive updates
> > from bgpd until bgpd has something to update.  If bgpd never makes an
> > update, zebra will not receive routes.  I've seen this condition persist
> > over and over again, for several hours at a time in my environment.
> Frankly, I don't see in the code why bgpd would connect to zebra even
> if it had an update.  It looks to me like this connection is attempted
> only by the bgp_init() function, and bgp_init is called once in main
> before entering the main event processing loop.  So there's probably
> something in the code that I'm missing if the behavior is as you say.

You're right - bgpd itself will not try reconnecting - this is handled by
lib/zclient.c.  For example, when something invokes
lib/zclient.c:zclient_send_message, this function checks connectivity and if
it cannot write to the zebra socket, it will 'return
zclient_failed(zclient);'.  But in practicality, the way this gets called,
i.e. bgpd:bgp_zebra.c:bgp_zebra_announce -> lib/zclient.c:zapi_ipv4_route ->
lib/zclient.c:zclient_send_message, does not report an error back to bgpd.
Even if it did, zclient is still responsible for establishing a new
connection into zebra when the existing one is interrupted.  The problem as
I see it, is that if the zebra connection is interrupted for any reason
(someone breaks iptables rules on the box, or zebra dies, etc) it does not
notify bgpd, and thus, bgpd will not resend all of its routes.

I attempted to resolve this by creating a new message type in zebra.h like
'#define ZEBRA_IDLE 99', adding the type to the zebra daemon as a throwaway
case, and writing new functions in bgp_zebra.c, invoked by the bgp scanner
in bgp_nexthop.c, to send ZEBRA_IDLE to the zebra daemon every scan
interval; if the send fails and the idler function detects that it has
reconnected to zebra, it reruns a modified version of bgp_init().  That
works fine, but it doesn't resend the prefixes from the bgpd RIB to zebra.
Not having touched C in the last 5 years, I'm simply not experienced enough
(and lack the time) to rectify this, even with a stop-gap solution like the
one I've already attempted.  But hopefully I've shed enough light that
someone who already has a good handle on the project's internals can
eventually address this.  It seems to me that a major rewrite of the bgpd
interface to zebra is necessary so that it remains cognizant of zebra's
status at all times - both in terms of connectivity, and in being able to
look up zebra's routes to make sure zebra has everything in bgpd's RIB.

> I'm not a heavy bgpd user, so I'm not 100% clear on the desirability
> of having bgpd attempt to connect to zebra every few seconds.  The patch
> to do this would not be difficult, but I don't know that this would always
> be desirable behavior for all users.

From my perspective, it's very much desirable for all users.  In the status
quo as I've pointed out, a reload of both daemons would be required if
communication fails between bgpd and zebra for any reason.  From what I
understand, this is precisely one of the motivating factors for Cisco to
gravitate toward modular IOS.  Needless to say, reloading your entire router
because one process develops a problem is much less than ideal in a
production environment.

> > In fact, the thing that prompted all of this digging on my part, was that
> > my box lost its default route in zebra.  Although bgpd had defaults from
> > both peers, the route simply *fell out* of zebra and the kernel.  That's
> > a bug which will definitely preclude my use of quagga as currently
> > deployed, and maybe even overall.
> It sounds like this issue requires some debugging.  Could you please open
> a bugzilla item on this?

Unfortunately I won't be using quagga in the capacity I had originally
slated, so I won't be of much use pertaining to bug reports on this
particular issue.  All it'll be doing now is route announcement, as I can't
rely on it to populate my FIB with the gaping holes I've outlined herein.  I
can tell you, however, that the route didn't exactly "fall out" as I'd
previously stated.  I checked the screenshot I took at the time (I'll send
you a copy off-list if you like), and it merely says "incomplete" next to
the default route via .17 from bgpd (in reference to the previous bgpd
config I posted).  When I went into bgpd, the .17 route was valid and best.
The only correlation I could come up with is that the time next to the
route in zebra, 15:something, matched the uptime of my peering session to
the .129 router (whereas the peering session to .17 had been up much
longer).  So perhaps bgpd choked, lost its
connection into zebra and into one of my peers, then came back.  If true,
this would point to the same phenomenon I'd mentioned above, about the lack
of synchronicity between the two.  Hope that helps.

