[quagga-dev 5309] Re: intermittent communication between bgpd and zebra?

Andrew J. Schorr aschorr at telemetry-investments.com
Fri Apr 18 21:49:21 BST 2008


Hi Ray,

On Fri, Apr 18, 2008 at 04:24:40PM -0400, Ray Barnes wrote:
> Thanks Andy.  The issue is that bgpd is simply not aggressive enough in its
> communication attempts toward zebra.  If it made an attempt once every 2-3
> seconds, that'd surely satisfy my requirement for HA/failover using BGP.
> But even with 'watchquagga' which I understand to simply restart quagga
> daemons if they fail, that solution is not adequate.  Per my previous
> message, if zebra is restarted for some reason, it will not receive updates
> from bgpd until bgpd has something to update.  If bgpd never makes an
> update, zebra will not receive routes.  I've seen this condition persist
> over and over again, for several hours at a time in my environment.

Frankly, I don't see in the code why bgpd would connect to zebra even
if it had an update.  It looks to me like this connection is attempted
only by the bgp_init() function, and bgp_init() is called once in main()
before entering the main event processing loop.  So there's probably
something in the code that I'm missing if the behavior is as you say.

I'm not a heavy bgpd user, so I'm not 100% clear on the desirability
of having bgpd attempt to connect to zebra every few seconds.  The patch
to do this would not be difficult, but I don't know that this would always
be desirable behavior for all users.

Two thoughts spring to mind:

   1. A command-line option could be added to turn on the behavior that you
   desire.

   2. An explicit command could be added to the telnet/vtysh interface that
   would allow you to instruct bgpd to try to connect to zebra.  That way,
   if you have some event that causes you to start up zebra, you could
   then simply tell bgpd to connect.

If you were to submit a patch for either of these approaches, that would
be the best way for you to get this functionality added.
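
To make the two ideas above a bit more concrete, here is a rough sketch
of the retry logic such a patch might need.  It is NOT taken from the
Quagga source: struct zreconnect, zreconnect_tick(), zreconnect_force()
and fake_connect() are all hypothetical names, and a real patch would
presumably hook into the existing thread/timer and zclient code rather
than being polled like this.  The "enabled" flag stands in for the
command-line option (1), and zreconnect_force() for the explicit
command (2).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reconnect state; none of these names exist in Quagga. */
struct zreconnect {
    bool enabled;       /* periodic retry on? (imagined command-line option) */
    bool connected;     /* do we currently believe we are connected to zebra? */
    int  interval;      /* seconds to wait between attempts */
    long next_attempt;  /* earliest time at which we may try again */
    int  attempts;      /* total connection attempts made so far */
};

/* Callback that performs one connection attempt; returns true on success. */
typedef bool (*connect_fn)(void *arg);

void zreconnect_init(struct zreconnect *zr, bool enabled, int interval, long now)
{
    assert(interval > 0);
    zr->enabled = enabled;
    zr->connected = false;
    zr->interval = interval;
    zr->next_attempt = now;   /* the first attempt may happen immediately */
    zr->attempts = 0;
}

/* Call from the event loop.  Returns true if an attempt was made. */
bool zreconnect_tick(struct zreconnect *zr, long now,
                     connect_fn try_connect, void *arg)
{
    if (zr->connected || now < zr->next_attempt)
        return false;
    /* With retry disabled, only the single startup attempt is made. */
    if (!zr->enabled && zr->attempts > 0)
        return false;
    zr->attempts++;
    zr->connected = try_connect(arg);
    zr->next_attempt = now + zr->interval;
    return true;
}

/* Explicit "connect now" request (imagined vty command).  Ignores both
 * the enabled flag and the timer.  Returns true if an attempt was made. */
bool zreconnect_force(struct zreconnect *zr, long now,
                      connect_fn try_connect, void *arg)
{
    if (zr->connected)
        return false;
    zr->attempts++;
    zr->connected = try_connect(arg);
    zr->next_attempt = now + zr->interval;
    return true;
}

/* Call when the connection to zebra is detected as lost. */
void zreconnect_lost(struct zreconnect *zr, long now)
{
    zr->connected = false;
    zr->next_attempt = now;   /* allow an immediate retry */
}

/* Demo stand-in for a real connect call: fails *arg times, then succeeds. */
bool fake_connect(void *arg)
{
    int *failures_left = arg;
    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

With enabled set and an interval of 2 or 3 seconds, this would give the
"retry every few seconds" behavior; with it cleared, the default stays
as a single attempt at startup, and the operator can still trigger a
reconnect by hand after restarting zebra.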

> In fact, the thing that prompted all of this digging on my part, was that my
> box lost its default route in zebra.  Although bgpd had defaults from both
> peers, the route simply *fell out* of zebra and the kernel.  That's a bug
> which will definitely preclude my use of quagga as currently deployed, and
> maybe even overall.

It sounds like this issue requires some debugging.  Could you please open
a Bugzilla item on this?

Regards,
Andy


