[quagga-dev 3483] Re: [PATCH] non-blocking I/O from client daemons to zebra

Paul Jakma paul at clubi.ie
Fri Jun 3 10:01:04 BST 2005


On Fri, 3 Jun 2005, Simon Talbot wrote:

> The routers in question are already a cluster, two identical 
> hardware units using Linux-HA techniques for failover.

Phew :)

> With the work queues installed, both behaved in exactly the same 
> way and kernel panicked; once rolled back to a known good version 
> they were fine. They are routers of which we have about 20 
> throughout the business, all running the exact same kernel build 
> (our own custom wrapping of 2.4.22) and are (touch wood) extremely 
> stable -- I cannot remember the last time any of them 
> re-booted or kernel panicked. For two of them to behave in 
> the same way (kernel panic) with a software version change and then 
> behave properly again when the version is rolled back would 
> definitely lead me to the software as the root cause -- even if it 
> is actually memory starvation which is causing the panic.

Highly odd. Even excessive memory usage by an application still 
shouldn't cause a panic.

> The routers are both 512MB Ram with no swap disk -- well they have no
> disk at all -- so when they run out of RAM, they really run out !

Hehe. Still shouldn't cause a panic though.

> The following is the memory stats of one of the routers, when stable
> running 0.99.0
>
>              total       used       free     shared    buffers     cached
> Mem:        506296     255988     250308          0        336      84372
> -/+ buffers/cache:     171280     335016
> Swap:            0          0          0
>
> I have more of these routers, but I think the fault is load 
> related, which makes it very tough to re-create outside of the 
> production environment due to the large number of peers etc.

Hmm, indeed.
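
For reference, the "-/+ buffers/cache" row in the quoted output follows directly from the "Mem:" row: buffers and page cache are treated as reclaimable, so they are subtracted from "used" and added to "free". A minimal sketch of that arithmetic is below; the function name is only illustrative, and the figures are the kB values quoted above.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Return (used, free) with buffers and page cache counted as reclaimable."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable

# Values in kB taken from the "Mem:" row quoted above.
used_adj, free_adj = buffers_cache_line(
    used=255988, free=250308, buffers=336, cached=84372
)
print(used_adj, free_adj)  # 171280 335016
```

On these numbers roughly 335 MB of the 506 MB is effectively available while the router is stable on 0.99.0.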

> I will put my brain to possible ways to test this, but 
> unfortunately logging is carried out to RAM disk, so once the 
> routers kernel panic and then re-boot (by the watchdog) the 
> evidence has gone !

I heartily recommend syslog servers.

> If you have any thoughts as to how/what I can do to diagnose this 
> further, please let me know,

The details of the panic could be interesting.

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
Fortune:
Law of the Jungle:
 	He who hesitates is lunch.


