[quagga-users 9207] Re: what's happening? was:(Re: quagga ospf trouble)

kiru kirusz at freemail.hu
Fri Dec 7 15:58:55 GMT 2007


paul at clubi.ie wrote:
> On Wed, 5 Dec 2007, kiru wrote:
> 
>>>> eth1 is up
>>>>  ifindex 4, MTU 1500 bytes, BW 0 Kbit <UP,BROADCAST,RUNNING,MULTICAST>
>>>>  Internet Address 10.1.2.1/24, Broadcast 10.1.2.255, Area 10.1.0.0
>>>>  MTU mismatch detection:enabled
>>>>  Router ID 10.1.2.1, Network Type BROADCAST, Cost: 10
>>>>  Transmit Delay is 1 sec, State DR, Priority 1
>>>>  Designated Router (ID) 10.1.2.1, Interface Address 10.1.2.1
>>>>  No backup designated router on this network
>>>>  Multicast group memberships: OSPFAllRouters OSPFDesignatedRouters
>>>>  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
>>>>    Hello due in 3.898s
>>>>  Neighbor Count is 1, Adjacent neighbor count is 0
> 
>> If it has Hello packet received in 3.898s why change 10.1.2.9's status 
>> to init? This is why quagga lost ospf routes.
> 
> The status gets changed, despite Hello having been received, as there is 
> evidently no bi-directional connectivity - the received Hello does not 
> list the router's own link IP, therefore the remote router isn't 
> receiving /our/ Hellos.
> 
> The Hello protocol is doing what it's supposed to do here - detect 
> connectivity problems (in either direction).
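
The two-way check described above can be sketched roughly as follows. This is a simplified illustration, not ospfd's actual code: the `Hello` class and `state_after_hello` function are hypothetical names, and the real neighbor state machine has more states and events. In OSPF the Hello carries the Router IDs of the neighbors the sender has recently heard from; on this router the Router ID happens to equal the link IP (10.1.2.1).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hello:
    """Only the Hello fields this sketch needs (hypothetical container)."""
    router_id: str                                       # sender's Router ID
    neighbors: List[str] = field(default_factory=list)   # Router IDs the sender has heard from


def state_after_hello(own_router_id: str, hello: Hello) -> str:
    """Return the neighbor state implied by one received Hello.

    If our own Router ID is in the sender's neighbor list, the sender
    is hearing our Hellos, so the link is bidirectional (2-Way, or a
    higher state in a real implementation). Otherwise only the receive
    direction is known to work and the neighbor is held in Init.
    """
    if own_router_id in hello.neighbors:
        return "2-Way"
    return "Init"


# 10.1.2.9 keeps sending Hellos but never lists 10.1.2.1:
print(state_after_hello("10.1.2.1", Hello("10.1.2.9", [])))            # Init
# Once our Hellos arrive, 10.1.2.9 starts listing us:
print(state_after_hello("10.1.2.1", Hello("10.1.2.9", ["10.1.2.1"])))  # 2-Way
```

So a neighbor can drop back to Init even though its Hellos keep arriving on time.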
> 
>> 2007/12/05 16:17:22 OSPF: Packet[DD] [Slave]: Neighbor 10.1.2.9 packet 
>> duplicated.
> 
>> The "Neighbor 10.1.2.9 packet duplicated" messages continued until I 
>> restart the computer.
> 
> How long did they continue for? Presumably for less than the dead-time.

I tested it. If ospfd is working properly and I restart it, there is no problem.
If it is already in the failed state when I restart it, it logs the "packet duplicated" messages indefinitely.


>> If I restart it with the "/etc/init.d/quagga restart" command, the 
>> "packet duplicated" error continues. But if I restart the whole 
>> computer, then everything is ok.
> 
> Now that's very odd. Wouldn't this make you suspect an OS or hardware 
> problem?

No. I can ping the other host (10.1.2.9). Could there be a hardware fault that affects only multicast packets?
I don't think so.
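
For reference, a ping and an OSPF Hello do not travel the same way on Ethernet. A ping is unicast, sent to the neighbor's own MAC address, while Hellos on a broadcast network go to the AllSPFRouters group 224.0.0.5 (and DR traffic to 224.0.0.6), which map to dedicated multicast MAC addresses. The sketch below only shows that standard mapping; the function name `multicast_mac` is made up for illustration, and whether the multicast path is actually at fault here is not established.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet destination MAC.

    The MAC is the fixed prefix 01:00:5e followed by the low 23 bits
    of the group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


print(multicast_mac("224.0.0.5"))  # AllSPFRouters -> 01:00:5e:00:00:05
print(multicast_mac("224.0.0.6"))  # AllDRouters   -> 01:00:5e:00:00:06
```

A working ping therefore confirms unicast connectivity only; it says nothing about frames addressed to 01:00:5e:00:00:05.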

>> I tried with stop - start and it has the same result.
> 
> So it really does require a full restart of the OS to stop the problem 
> from occurring? This suggests the fault may lie in the kernel you're 
> using (resource leak of some kind?).
> 
>> No, it's ok. I changed the network card to exclude it from the 
>> possible hardware issues, and I can ping the other computer when ospf 
>> isn't working.
> 
> And yet, ospfd is unable to send packets for some reason - and can not 
> do so reliably until you completely reboot the computer. When you 
> changed the network card, did you continue to use the same driver? What 
> driver is it? (This is wireless, isn't it?)

I changed it a week ago. It's an Ethernet card with a Realtek chipset; the driver is 8139too.
The previous card also used the 8139too driver. Maybe I will try a card with a different chipset.

> - Is ospfd sending packets?
> 
>   - use truss/strace to see how ospfd is interacting with your OS, in
>     particular, whether it is issuing 'sendmsg' system calls (and
>     what's the return code from the OS?)

This is the strace output while ospfd is in trouble:
http://kiru.mikroweb.hu/ospfd-strace.log
It doesn't send any packets.

And this is after I restarted the Quagga suite without rebooting the computer (the "packet duplicated" case):
http://kiru.mikroweb.hu/ospfd-strace2.log
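
To check a log like these for `sendmsg` calls and their return codes without reading it line by line, something like the following could be used. It is only a sketch: `summarize_sendmsg` is a made-up helper, the sample lines are invented for illustration and are not taken from the logs above, and split `<unfinished ...>` / `resumed` lines from `strace -f` are not handled.

```python
import re
from collections import Counter

# Matches strace lines of the general form:
#   sendmsg(8, {...}, MSG_DONTWAIT) = 64
#   sendmsg(8, {...}, 0) = -1 ENOBUFS (No buffer space available)
# A leading PID or timestamp column is tolerated because we use search().
SENDMSG_RE = re.compile(r"\bsendmsg\(.*\)\s*=\s*(-?\d+)(?:\s+(E[A-Z0-9]+))?")


def summarize_sendmsg(lines):
    """Count successful sendmsg calls and failures grouped by errno name."""
    ok = 0
    errors = Counter()
    for line in lines:
        m = SENDMSG_RE.search(line)
        if not m:
            continue  # some other system call
        if int(m.group(1)) >= 0:
            ok += 1
        else:
            errors[m.group(2) or "unknown"] += 1
    return ok, dict(errors)


# Invented sample input, for illustration only.
sample = [
    "select(9, [5 6 8], [], NULL, {3, 898000}) = 1 (in [8], left {2, 1})",
    "sendmsg(8, {msg_name(16)={sa_family=AF_INET}}, MSG_DONTWAIT) = 64",
    "sendmsg(8, {msg_name(16)={sa_family=AF_INET}}, MSG_DONTWAIT) = -1 ENOBUFS (No buffer space available)",
]
print(summarize_sendmsg(sample))
```

With a real log, `summarize_sendmsg(open("ospfd-strace.log"))` would give the same kind of summary; a result of zero successes and zero errors would mean ospfd never called `sendmsg` at all during the trace.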

> - Does your OS think it is sending the packets to the hardware?
> 
>   - use tcpdump on the problem computer. If the previous
>     tcpdump, which you sent to me, was taken on that computer, you
>     already know the answer.

Yes, the previous tcpdump was also taken on 10.1.2.1.

> - Do the packets actually make it on to medium?
> 
>   - use tcpdump on another host, as "close" to the host as possible.

This is the log from 10.1.2.9:
http://kiru.mikroweb.hu/mikrotik.txt

I don't see any packets from 10.1.2.1 there either.

> I strongly suggest you look more closely into whether it is your OS 
> and/or hardware that is causing the problem. Everything you have told us 
> so far seems to suggest it is.
> 
> E.g. if you swapped the network card for one which uses the same driver, 
> try changing it for hardware with a different driver (in particular, I 
> know you're using wireless, and there's quite a difference in the 
> quality of wireless drivers and the firmware in the hardware for 
> different brands - this seems to be true even across OSes).
> 
> You've had this problem for quite a while now, and I can only re-iterate 
> (again), given there is no evidence so far to suggest the problem lies 
> anywhere but under ospfd, that it must be some kind of connectivity 
> problem (apparently local to the system).
> 
> regards,

So all the logs show that ospfd isn't sending packets. Could it be a hardware error?
If so, why can I still ping the other machine?


regards,
kiru

