[quagga-dev 7146] Re: ospfd leaves stale OSPF routes
Roman Hoog Antink
rha at open.ch
Thu Aug 13 15:59:59 BST 2009
A simple and nicer workaround would be increasing the send buffer size
in zebra/zserv.c:zebra_serv_un():1435 by means of setsockopt SO_SNDBUF.
I tested this successfully by increasing /proc/sys/net/core/wmem_default
(was 110592) to /proc/sys/net/core/wmem_max (=8388608).
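For illustration, the per-socket variant could look roughly like the sketch below. The helper name grow_send_buffer is made up and is not part of zebra or lib/zclient.c; which descriptor it should be applied to is left open here. Note that Linux caps the requested value at wmem_max and reports back double the value it accepted, so the helper returns what the kernel actually granted.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not a Quagga function): request a larger send
 * buffer on an existing socket and return the size the kernel reports
 * afterwards, or -1 on error. */
int grow_send_buffer(int fd, int wanted_bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    /* Ask for the new size; the kernel may silently clamp it. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &wanted_bytes, sizeof(wanted_bytes)) < 0)
        return -1;

    /* Read back what was really granted. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        return -1;

    return actual;
}
```

This only makes the buffer larger; with enough routes it can still fill up, so it remains a workaround rather than a fix.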
Roman Hoog Antink wrote:
> Hi there
> I have quagga 0.99.14 under linux 2.6.29, learning more than 500 routes
> from an OSPF peer. When sending SIGTERM to ospfd, 150 of the learned
> routes remain in zebra and are marked with a * in "show ip route ospf",
> meaning these are still active in the kernel.
> I think I found the reason why ospfd fails to clean up all learned
> routes in the kernel during shutdown. Let me guide you through the
> shutdown process (as far as I have dug into it).
> The signal handler of SIGTERM finally leads to
> ospfd/ospfd.c:ospf_finish_final(). That function first terminates all
> timers (line 472) and then in line 525...
> ospf_route_delete (ospf->old_external_route);
> ...zebra is being told to delete all external OSPF routes.
> The communication to zebra is a nonblocking write over a unix socket. If
> ospfd has to delete many routes (e.g. more than 500), the socket buffer
> fills up, because zebra can't delete the routes as fast as they come
> in, and a timer in ospfd should retry later. You can find the timer
> being set up here: lib/zclient.c:262:zclient_send_message()
> case BUFFER_PENDING:
> THREAD_WRITE_ON(master, zclient->t_write,
> zclient_flush_data, zclient, zclient->sock);
> But as we are in the shutdown process already, that timer never gets its
> You can find a patch in the attachment, which superficially deals with
> the problem. But it is ugly, as it reverts all zebra communication to
> blocking writes. Furthermore it uses usleep to avoid excessive CPU usage
> during retries. That might lead to trouble, as usleep uses SIGALRM and
> might interfere with ospfd's own timers.
> In order to solve this problem nicely, a developer who has the correct
> semantics of ospfd's timers in mind should think about it. Maybe we
> should flush the buffers before exit somehow, without using a timer
> Roman Hoog Antink