[quagga-users 9207] Re: what's happening? was:(Re: quagga ospf
kirusz at freemail.hu
Fri Dec 7 15:58:55 GMT 2007
paul at clubi.ie wrote:
> On Wed, 5 Dec 2007, kiru wrote:
>>> eth1 is up
>>>> ifindex 4, MTU 1500 bytes, BW 0 Kbit <UP,BROADCAST,RUNNING,MULTICAST>
>>>> Internet Address 10.1.2.1/24, Broadcast 10.1.2.255, Area 10.1.0.0
>>>> MTU mismatch detection:enabled
>>>> Router ID 10.1.2.1, Network Type BROADCAST, Cost: 10
>>>> Transmit Delay is 1 sec, State DR, Priority 1
>>>> Designated Router (ID) 10.1.2.1, Interface Address 10.1.2.1
>>>> No backup designated router on this network
>>>> Multicast group memberships: OSPFAllRouters OSPFDesignatedRouters
>>>> Timer intervals configured, Hello 10s, Dead 40s, Wait 40s,
>>>> Retransmit 5
>>>> Hello due in 3.898s
>>>> Neighbor Count is 1, Adjacent neighbor count is 0
>> If a Hello packet was received within 3.898s, why change 10.1.2.9's status
>> to Init? This is why quagga lost the OSPF routes.
> The status gets changed, despite Hello having been received, as there is
> evidently no bi-directional connectivity - the received Hello does not
> list the router's own link IP, therefore the remote router isn't
> receiving /our/ Hellos.
> The Hello protocol is doing what it's supposed to do here - detect
> connectivity problems (in either direction).
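The bidirectional check described above can be sketched roughly as follows. This is a simplified illustration, not Quagga's code: the `Hello` class and `neighbor_state_after_hello` function are hypothetical names, and the real neighbor state machine has more states (a two-way neighbor may go on to ExStart and beyond). In the OSPF specification the Hello's neighbor list carries Router IDs; in the output quoted earlier the Router ID and the interface address are both 10.1.2.1.

```python
from dataclasses import dataclass, field


@dataclass
class Hello:
    """Hypothetical, minimal stand-in for a received OSPF Hello packet."""
    router_id: str                                  # Router ID of the sender
    neighbors: list = field(default_factory=list)   # Router IDs the sender has heard from


def neighbor_state_after_hello(my_router_id: str, hello: Hello) -> str:
    """Return the (simplified) neighbor state after processing one Hello.

    If we are listed in the sender's neighbor list, the sender is receiving
    our Hellos, so connectivity is bidirectional. Otherwise only one
    direction is known to work and the neighbor drops back to Init.
    """
    if my_router_id in hello.neighbors:
        return "2-Way"
    return "Init"


# 10.1.2.9 sends a Hello that does not list 10.1.2.1: one-way only.
print(neighbor_state_after_hello("10.1.2.1", Hello("10.1.2.9", [])))
# 10.1.2.9 sends a Hello that lists 10.1.2.1: bidirectional.
print(neighbor_state_after_hello("10.1.2.1", Hello("10.1.2.9", ["10.1.2.1"])))
```

So a neighbor can sit in Init even while its Hellos keep arriving on time, which matches "Neighbor Count is 1, Adjacent neighbor count is 0".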
>> 2007/12/05 16:17:22 OSPF: Packet[DD] [Slave]: Neighbor 10.1.2.9 packet
>> The "Neighbor 10.1.2.9 packet duplicated" messages continued until I
>> restarted the computer.
> How long did they continue for? Presumably for less than the dead-time.
I tested it. If ospfd is working properly and I restart it, there is no problem.
If it is already failing when I restart it, it logs the "packet duplicated" messages forever.
>> If I restart it with the "/etc/init.d/quagga restart" command, the
>> "packet duplicated" error continues. But if I restart the whole
>> computer, then everything is ok.
> Now that's very odd. Wouldn't this make you suspect an OS or hardware
No. I can ping the other host (10.1.2.9). Is there a hardware error that affects only multicast packets?
I don't think so.
>> I tried with stop - start and it has the same result.
> So it really does require a full restart of the OS to stop the problem
> from occurring? This suggests the fault may lie in the kernel you're
> using (resource leak of some kind?).
>> No, it's ok. I changed the network card to exclude it from the
>> possible hardware issues, and I can ping the other computer when ospf
>> isn't working.
> And yet, ospfd is unable to send packets for some reason - and can not
> do so reliably until you completely reboot the computer. When you
> changed the network card, did you continue to use the same driver? What
> driver is it? (This is wireless, isn't it?)
A week ago. It's a Realtek-chipset Ethernet card. The driver is 8139too.
The previous card used the 8139too driver as well. Maybe I will try a card with a different chipset.
> - Is ospfd sending packets?
> - use truss/strace to see how ospfd is interacting with your OS, in
> particular, whether it is issuing 'sendmsg' system calls (and
> what's the return code from the OS?)
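One way to act on the strace suggestion is to capture the trace to a file (for example with something like `strace -f -e trace=sendmsg -p <ospfd pid> -o trace.txt`) and then summarise the return codes. The sketch below assumes the usual strace line layout, `call(args) = ret [ERRNO (text)]`; the function name `summarize_sendmsg` is hypothetical and the sample lines are made up for illustration, not taken from this thread.

```python
import re
from collections import Counter

# Matches the tail of an strace line such as
#   sendmsg(9, {...}, 0) = 64
#   sendmsg(9, {...}, 0) = -1 ENOBUFS (No buffer space available)
SENDMSG_RE = re.compile(r"\bsendmsg\(.*\)\s+=\s+(-?\d+)(?:\s+([A-Z0-9]+))?")


def summarize_sendmsg(lines):
    """Count successful and failed sendmsg calls found in strace output."""
    summary = Counter()
    for line in lines:
        match = SENDMSG_RE.search(line)
        if not match:
            continue  # not a completed sendmsg call
        ret, errno_name = int(match.group(1)), match.group(2)
        if ret < 0:
            summary[errno_name or "error"] += 1   # keyed by errno name
        else:
            summary["ok"] += 1
    return summary


# Illustrative input only (hypothetical lines, not from the real system).
sample = [
    "sendmsg(9, {msg_name={sa_family=AF_INET}, msg_iovlen=1}, 0) = 64",
    "sendmsg(9, {msg_name={sa_family=AF_INET}, msg_iovlen=1}, 0) = -1 ENOBUFS (No buffer space available)",
    "recvmsg(9, {msg_iovlen=1}, 0) = 48",
]
print(dict(summarize_sendmsg(sample)))
```

An empty summary would mean ospfd never calls sendmsg at all; a summary full of errors would mean it tries and the OS refuses.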
This is when ospfd is in trouble:
It doesn't send any packets.
And this is after I restarted the Quagga suite without rebooting the computer (packet duplicated):
> - Does your OS think it is sending the packets to the hardware?
> - use tcpdump on the problem computer. If the previous
> tcpdump, which you sent to me, was taken on that computer, you
> already know the answer.
Yes, the previous tcpdump was taken on 10.1.2.1 too.
> - Do the packets actually make it on to medium?
> - use tcpdump on another host, as "close" to the host as possible.
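When looking at a capture for this purpose, the interesting packets are those with IP protocol number 89 (OSPF) and the problem router as source; in tcpdump the filter expression `ip proto 89` selects them. As a rough sketch of the same test on raw bytes (the function name `ospf_source` is hypothetical, and it looks only at a plain IPv4 header):

```python
import socket
import struct

OSPF_PROTO = 89  # IP protocol number assigned to OSPF


def ospf_source(ip_packet: bytes):
    """Return the source address if this IPv4 packet carries OSPF, else None."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return None                      # too short, or not IPv4
    if ip_packet[9] != OSPF_PROTO:       # byte 9 of the header is the protocol
        return None
    return socket.inet_ntoa(ip_packet[12:16])  # bytes 12..15 are the source


def make_header(proto: int, src: str, dst: str) -> bytes:
    """Build a bare 20-byte IPv4 header for illustration (checksum left 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 1, proto, 0,
                       socket.inet_aton(src), socket.inet_aton(dst))


print(ospf_source(make_header(89, "10.1.2.1", "224.0.0.5")))  # OSPF Hello
print(ospf_source(make_header(1, "10.1.2.1", "10.1.2.9")))    # ICMP (ping)
```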
This is 10.1.2.9's log:
I don't see any packets from 2.1 there either.
> I strongly suggest you look more closely into whether it is your OS
> and/or hardware that is causing the problem. Everything you have told us
> so far seems to suggest it is.
> E.g. if you swapped the network card for one which uses the same driver,
> try changing it for hardware with a different driver (in particular, I
> know you're using wireless, and there's quite a difference in the
> quality of wireless drivers and the firmware in the hardware for
> different brands - this seems to be true even across OSes).
> You've had this problem for quite a while now, and I can only re-iterate
> (again), given there is no evidence so far to suggest the problem lies
> anywhere but under ospfd, that it must be some kind of connectivity
> problem (apparently local to the system).
So I can see in all the logs that ospfd doesn't send packets. Could it be a hardware error?
Then why can I ping the other machine?
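One possible explanation for the ping question, which nothing in this thread establishes, is that ping and OSPF exercise different paths: ping sends unicast ICMP to 10.1.2.9, while OSPF Hellos on a broadcast network go to the multicast group 224.0.0.5 (224.0.0.6 is the group for designated routers). A small sketch of the difference, including the standard mapping of an IPv4 multicast group to an Ethernet address (the function name `multicast_mac` is hypothetical):

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC address.

    The MAC is the fixed prefix 01:00:5e followed by the low 23 bits
    of the group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(ipaddress.IPv4Address("10.1.2.9").is_multicast)   # ping target: unicast
print(ipaddress.IPv4Address("224.0.0.5").is_multicast)  # OSPF Hello destination
print(multicast_mac("224.0.0.5"))
print(multicast_mac("224.0.0.6"))
```

So a working ping shows that unicast IP works, but says little about whether multicast frames are leaving or reaching the card.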