[quagga-dev 3481] Re: [PATCH] non-blocking I/O from client daemons to zebra

Paul Jakma paul at clubi.ie
Fri Jun 3 08:12:08 BST 2005

Hi Simon,

On Fri, 3 Jun 2005, Simon Talbot wrote:

> Bad news Paul -- During our maintenance window this evening, we had 
> to do a cold re-start of the routers upon which I am running the 
> work-queue version patch etc. During start-up (remember two transit 
> routers + 81 peers to come up) I ended up with a kernel panic (IRQ 
> not syncing etc.). I only had a very limited amount of time to 
> investigate this, but did reproduce it fairly reliably by cold 
> starting quagga: not every time, but about 70% of the time. I also 
> recreated it on two routers. Both routers have now been rolled back to 0.99.0 
> it on two routers. Both routers have now been rolled back to 0.99.0 
> and are stable through re-boots etc.

Hmm, that is bad news.

However, that sounds like a kernel or hardware problem.

> Also noticed that with the work queue patch, the routers were 
> considerably slower at bringing all sessions up (when they did not 
> kernel panic)


> I am sorry, in diagnostic terms I have very little for you as I 
> could not get the details of the kernel panic (I was working 
> through a 2 line by 16 character LCD -- No laptop with me, was 
> meant to be an easy changeover of a couple of cards !)


> It is going to be very hard for me to do further testing on this, 
> as it causes route flap with our peers each time I carry out a 
> test, and they start to get a little prickly !


Having additional test boxes really helps. Also, one of the 
advantages of Quagga is that you should have saved enough money to be 
able to install a second router. The trend towards Ethernet 
interconnects helps too, as they can be shared more easily, without 
fancy hardware (compared to STM/Ex).

> I am guessing, but under extremely heavy load bgpd is probably 
> running the kernel out of buffers/memory, hence the panic, or the 
> queues are going haywire.

You shouldn't get a panic. The queueing bgpd will use memory 
differently (more of it, and more intensively), but it shouldn't cause 
problems as long as there is enough memory. It very much sounds like 
a kernel or hardware problem.

> Sorry I can't be too much more help -- was a bad night !

Sorry to hear that :(

> Simon

Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
The secret of healthy hitchhiking is to eat junk food.

