[quagga-users 12917] Watchquagga: unexpected restart

Sim simvirus at gmail.com
Thu Jun 14 16:40:07 BST 2012


Hello!

Today my watchquagga restarted zebra and bgpd by mistake ;-)

This is what the normal scan looks like (it runs every minute):

2012/06/14 10:30:10 debugging: BGP: Performing BGP general scanning
2012/06/14 10:30:11 debugging: BGP: scanning IPv4 Unicast routing tables


This is the output from when the problem occurred.
The "Performing" and "scanning" lines are about 29 seconds apart,
probably because of a large global update or high CPU load on the server...


Jun 14 20:34:21 bgp watchquagga[2411]: bgpd state -> unresponsive : no response yet to ping sent 10 seconds ago
Jun 14 20:34:21 bgp watchquagga[2411]: Forked background command [pid 12183]: /usr/bin/service quagga restart
Jun 14 20:34:41 bgp watchquagga[2411]: Warning: restart all child process 12183 still running after 20 seconds, sending signal 15
Jun 14 20:34:41 bgp watchquagga[2411]: restart all process 12183 terminated due to signal 15
Jun 14 20:35:12 bgp watchquagga[2411]: bgpd state -> down : unexpected read error: Connection reset by peer
Jun 14 20:35:42 bgp watchquagga[2411]: Forked background command [pid 12580]: /usr/bin/service quagga restart
Jun 14 20:35:57 bgp watchquagga[2411]: zebra state -> unresponsive : no response yet to ping sent 10 seconds ago
Jun 14 20:36:02 bgp watchquagga[2411]: Warning: restart all child process 12580 still running after 20 seconds, sending signal 15
Jun 14 20:36:02 bgp watchquagga[2411]: restart all process 12580 terminated due to signal 15
Jun 14 20:36:11 bgp watchquagga[2411]: zebra state -> down : unexpected read error: Connection reset by peer
Jun 14 20:38:06 bgp watchquagga[2411]: Forked background command [pid 12624]: /usr/bin/service quagga restart
Jun 14 20:38:11 bgp watchquagga[2411]: zebra state -> up : connect succeeded
Jun 14 20:38:12 bgp watchquagga[2411]: bgpd state -> up : connect succeeded


2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 8.25.195.0/24
2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 8.25.197.0/24
2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 8.25.194.0/24
2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 109.161.64.0/19
2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 117.218.240.0/20
2012/06/14 20:34:09 debugging: BGP: 1.2.3.4 send UPDATE 186.190.184.0/24
2012/06/14 20:34:10 debugging: BGP: Performing BGP general scanning
2012/06/14 20:34:39 debugging: BGP: scanning IPv4 Unicast routing tables
2012/06/14 20:34:40 warnings: BGP: SLOW THREAD: task bgp_scan_timer (d3dc1e) ran for 29828ms (cpu time 2056ms)
2012/06/14 20:34:40 notifications: BGP: Terminating on signal
2012/06/14 20:34:40 informational: BGP: Notification sent to neighbor 1.2.3.4: type 6/3
2012/06/14 20:38:07 notifications: BGP: BGPd 0.99.20 starting: vty at 2605, bgp@<all>:179
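
To put a number on how far apart the two scan lines are, here is a minimal
Python sketch. It is not part of Quagga: the function name scan_gaps and the
sample list are hypothetical, and the 10-second value is only the
unresponsiveness timeout that shows up in the watchquagga log above ("ping
sent 10 seconds ago"). It pairs each "Performing BGP general scanning" line
with the next "scanning IPv4 Unicast routing tables" line and reports the gap.

```python
from datetime import datetime

# Assumption: 10 s is the timeout seen in the watchquagga log
# ("no response yet to ping sent 10 seconds ago").
UNRESPONSIVE_TIMEOUT = 10

START_MARK = "Performing BGP general scanning"
END_MARK = "scanning IPv4 Unicast routing tables"


def scan_gaps(lines):
    """Return (start_time, gap_seconds) for each start/end pair of scan lines."""
    gaps = []
    start = None
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1],
                                      "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or a different log format
        if START_MARK in line:
            start = stamp
        elif END_MARK in line and start is not None:
            gaps.append((start, (stamp - start).total_seconds()))
            start = None
    return gaps


# Sample lines copied from the bgpd log in this message.
sample = [
    "2012/06/14 10:30:10 debugging: BGP: Performing BGP general scanning",
    "2012/06/14 10:30:11 debugging: BGP: scanning IPv4 Unicast routing tables",
    "2012/06/14 20:34:10 debugging: BGP: Performing BGP general scanning",
    "2012/06/14 20:34:39 debugging: BGP: scanning IPv4 Unicast routing tables",
]

for started, seconds in scan_gaps(sample):
    verdict = "exceeds" if seconds > UNRESPONSIVE_TIMEOUT else "within"
    print(f"{started:%H:%M:%S} scan gap {seconds:.0f}s "
          f"({verdict} {UNRESPONSIVE_TIMEOUT}s timeout)")
```

On the sample lines the normal scan shows a 1-second gap and the problem scan
a 29-second gap, which matches the 29828ms reported by the SLOW THREAD warning
and is well beyond 10 seconds.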


Is there a way to increase the timeout?

This is my current config:

/usr/sbin/watchquagga -dz -R /usr/bin/service quagga restart zebra bgpd


Which is the correct parameter to keep this from happening again?

-m, --min-restart-interval
                Set the minimum seconds to wait between invocations of daemon
                restart commands (default is 60).
-M, --max-restart-interval
                Set the maximum seconds to wait between invocations of daemon
                restart commands (default is 600).
-i, --interval  Set the status polling interval in seconds (default is 5)
-t, --timeout   Set the unresponsiveness timeout in seconds (default is 10)
-T, --restart-timeout
                Set the restart (kill) timeout in seconds (default is 20).
                If any background jobs are still running after this much

Thanks!

