[quagga-users 14482] Re: Issues with Routes in FreeBSD / PfSense New to Release 1.0

Sun Oct 30 07:08:44 GMT 2016

Here are the responses from the pfSense team and another user in the forum:



I looked at the changelog too, and didn't see anything that would fix this.  The main problem is that when Quagga restarts, it doesn't recognize the routes that it previously put in there, so it pulls them in as "kernel" routes and they will always take precedence.  That's why it works fine until Quagga is restarted (which is basically kill & start, there is no graceful restart in Quagga).  Since the rib_sweep_table() function isn't used anymore, when it starts up it doesn't remove routes from the list of kernel routes that it previously put there (which it flags as RTF_PROTO1, or "1" in netstat -r).  I don't see how they aren't having more issues with this, unless the common scenario is that Quagga never gets restarted unless the whole OS is restarted.

I don't see why kill -9 matters here, because it worked fine before v1.0, and there is no graceful restart capability in Quagga.  Ideally pfSense could use the Quagga VTY to make changes live without restarting, and then write changes to the config files for the next time it starts up, but I doubt anyone wants to take on a project like that.

If you want more details let me know, but it would probably make more sense to discuss on the Quagga list instead of here.


Jimp from pfsense team:

That sounds like the issue. Preventing it from restarting is a hackish workaround no matter what signal is used. It will get restarted at some point and failing to recover gracefully is a regression in quagga's behavior in 1.x.

It needs to recognize the flags it sets on routes in the table, and it isn't. Hopefully someone at Quagga can pick up and run with that on their list.

Can anybody comment? This bug has been in the code for 8 months now, or more.

Per Martin, what is likely triggering the issue is the use of kill -9 to terminate the quagga process in the pfSense rc scripts. I did submit the debug logs to Martin; not sure if you need more. And no, I have not yet tested the routers with -9 removed from the pfSense scripts.

Martin: Please see below response from a person in pfsense forum:

I see Martin's reply to you on Oct. 10, but I don't see anything after that.  Are you emailing him off-list?

I was looking through the Quagga code last night and found something that might be the problem.  Quagga (zebra daemon) puts routes into the kernel with flag "1" (RTF_PROTO1, see netstat man page).  When zebra starts up it's supposed to ignore (filter out) any kernel routes with flag "1" because it should assume it put those there to begin with.  I think this was working before Quagga version 1, and in version >= 1 it pulls those kernel routes into the zebra RIB.
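
To make the filtering described above concrete, here is a minimal sketch in C. It is not zebra's code: the struct, the function names, and the `SKETCH_RTF_PROTO1` constant are all hypothetical stand-ins (on FreeBSD the real flag is RTF_PROTO1 from <net/route.h>; the value is assumed here so the sketch compiles anywhere). It only shows the idea that routes carrying the daemon's own marker should not be imported as "K" routes.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for FreeBSD's RTF_PROTO1; the value is an assumption made
 * so that this sketch is self-contained. */
#define SKETCH_RTF_PROTO1 0x8000u

/* Hypothetical, simplified view of a route read back from the kernel. */
struct sketch_kroute {
    const char *prefix;
    unsigned int flags;
};

/* Nonzero when the route carries the marker the daemon sets on its own installs. */
static int sketch_is_self_installed(const struct sketch_kroute *r)
{
    assert(r != NULL);
    return (r->flags & SKETCH_RTF_PROTO1) != 0;
}

/* Count the routes that should be imported as genuine kernel routes,
 * i.e. those NOT marked as installed by a previous run of the daemon. */
static size_t sketch_count_importable(const struct sketch_kroute *routes, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!sketch_is_self_installed(&routes[i]))
            count++;
    }
    return count;
}
```

If the check in `sketch_is_self_installed` were skipped, every previously installed route would be counted as a kernel route, which matches the symptom of one K route per OSPF route after a restart.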

If I reboot a firewall and go to OSPF -> Status -> Zebra routes, I see a bunch of OSPF routes but barely any K (kernel) routes.  If I make any change on the Global Settings or Interface Settings tab quagga restarts, and then when looking at the zebra routes it is filled with kernel routes (one for each OSPF route).

Can you ask Martin to look at this:
Commit: https://github.com/Quagga/quagga/commit/0d0686f98e64017415071e590bde262f0ab5a4c9

File: zebra/zebra_rib.c
Function: rib_sweep_table

This function is commented out starting in version 1, but it was used in version 0.99.24.  There is a block of code in it:

if (rib->type == ZEBRA_ROUTE_KERNEL &&
    CHECK_FLAG (rib->flags, ZEBRA_FLAG_SELFROUTE))
  {
    ret = rib_uninstall_kernel (rn, rib);
    if (! ret)
      rib_delnode (rn, rib);
  }
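
As a rough illustration of what a sweep like this accomplishes, here is a self-contained toy version in C. Everything in it is hypothetical (the `sketch_` types and functions, and the `selfroute` field standing in for whatever marker zebra uses to recognize its own routes); it is not the Quagga implementation, only a sketch of the pattern: walk the table, and for each kernel-type entry the daemon itself installed earlier, uninstall it from the kernel and drop it from the table if the uninstall succeeds.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_route_type { SKETCH_ROUTE_KERNEL, SKETCH_ROUTE_OSPF };

/* Hypothetical, flattened stand-in for a RIB entry. */
struct sketch_rib {
    enum sketch_route_type type;
    int selfroute;  /* 1 if marked as installed by a previous daemon run */
    int in_kernel;  /* 1 while the route is present in the kernel table */
    int removed;    /* 1 once dropped from the in-memory table */
};

/* Pretend kernel uninstall; returns 0 on success. */
static int sketch_uninstall_kernel(struct sketch_rib *rib)
{
    assert(rib != NULL);
    rib->in_kernel = 0;
    return 0;
}

/* Remove stale self-installed kernel routes; return how many were swept. */
static size_t sketch_sweep(struct sketch_rib *table, size_t n)
{
    size_t swept = 0;
    for (size_t i = 0; i < n; i++) {
        struct sketch_rib *rib = &table[i];
        if (rib->removed)
            continue;
        if (rib->type == SKETCH_ROUTE_KERNEL && rib->selfroute) {
            /* Only drop the entry when the kernel uninstall succeeded. */
            if (sketch_uninstall_kernel(rib) == 0) {
                rib->removed = 1;
                swept++;
            }
        }
    }
    return swept;
}
```

Running the sweep once at startup leaves ordinary kernel routes and freshly learned OSPF routes alone and clears only the leftovers from the previous run; a second pass finds nothing to do.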

The rib_weed_tables function that is still being used doesn't seem to do this same thing, from what I can tell.  This URL shows them side-by-side: https://fossies.org/diffs/quagga/

If you can point me to the thread where you are discussing this with Martin, I can pass this along to him if you prefer.

Just seeing this now...

On 3 Oct 2016, at 18:13, Reqlez Guy wrote:

> https://forum.pfsense.org/index.php?topic=111108.0
> When triggering failover ... the failover link does not work with
> version 1  .... reverting back to .99 no problems. Pfsense Team seems
> to think it's something regarding Zebra restart... Several users have
> confirmed this issue. See thread for further info.

What is the "normal" version of Quagga in pfSense (i.e. the output of
"zebra -v")?
Is this 1.0.20160315?

Can you post the output of "zebra -v" from it? (It should give some
compile options as well.)
And what is the base OS? FreeBSD 10?

We are just about to get a new version out. If you are able to
compile your own version, then it might be worthwhile to download
and build from the latest git master.

- Martin Winter

