[quagga-dev 7312] Re: isisd: stability and functionality patches
Rishi.Srivatsavai at Sun.COM
Wed Oct 28 23:13:45 GMT 2009
I was requested to review your patches. My comments in-line below.
On 04/13/09 08:31, Peter Szilagyi wrote:
> I have been playing around with the IS-IS implementation (isisd)
> included in quagga. In my experience, isisd crashes like hell when it
> comes to a link failure, and even the basic SPF calculation is buggy,
> not to mention other epic failures here and there. To improve the
> current situation, I created four patches targeting different
> stability and functionality problems. The patches are against the
> latest stable release, quagga 0.99.11. The description of the patches
> #1 Header file inclusion
> Inclusion of the source file hash.c instead of the proper hash.h header file.
> #2 Circuit state machine
> isisd has a so-called circuit state machine that takes care about
> the interface state changes, such as initializing, down, up. When
> an interface was brought down by a link failure, the interface
> information was deleted and set to NULL. When the link was
> restored later, the interface was looked up by the old pointer,
> but since it was cleared, it was never found again, resulting in
> an interface never entering the up state again. Also, the program
> regularly crashed because of a deleted pointer in the same
> context which was later accessed without any further
I don't follow why isis_delete_adj was replaced by listnode_delete
in isis_adjacency.c, our work on this issue is tracked by:
Rather than modify the send_* function calls we fixed the lack
of callers to isis_circuit_down.
> #3 Pseudo-node LSP (re)generation
> After an IS has been elected as the Designated IS for a LAN, it
> did not refresh the content of the pseudo-node after a new node
> has been connected to the same LAN. Instead, the periodically
> reoriginated pseudo-node LSP still contained only those IS
> neighbors that were already present when the DIS election process
> was commenced. The fix for the problem schedules an LSP
> regeneration rather than just reoriginating the same LSP with the
> old content.
I believe the fix should be in adjacency state handling code.
Moreover it only does level 1 call. The fix should be in
isis_adjacency.c (isis_adj_state_change function) and is
fixed in our tree:
> #4 Wrong next-hops from SPF
> The forwarding table was filled with wrong next-hops, and which
> is even worse, it was done in a totally non-deterministic
> way. The next-hop set for an IP prefix by isisd was the neighbor
> IS from which the flooded LSP about the IP prefix was
> arrived. So, if an IS received all the LSPs through its, say,
> eth0 interface, all entries in the forwarding table contained the
> next IS reachable via eth0 as the next-hop. The solution is to
> propagate the correct next-hop further from node to node as the
> SPF algorithm traverses the graph and selects the next node to be
> added to the set of already covered nodes.
> Also, the construction of the tentative node list (the nodes
> where the shortest path is not known yet) was buggy: if a node
> was already a member of this list with a certain path cost, and
> an alternative path was found to it with a lower cost while
> processing a pseudo-node LSP, it was not added to the list. This
> way, the path selected by isisd for a certain prefix was the
> first one it encountered during the LSDB processing.
Looks good. My only comment is the lsp->adj info can be useful and
it might be a good idea to maintain it. Adding more comments would
be appreciated as well.
BTW you can check out our hg repo here and we welcome any feedback
preferably at rbridges-dev at opensolaris.org before we submit our
More information about the Quagga-dev