[quagga-users 9950] OSPF route flip-flap after subnet extension

Nick King mr_king at optilian.com
Mon Sep 29 15:25:34 IST 2008


Hi all,

--- Introduction and problem description ---

I have an OSPF-routed network with several areas. Following a change
in the logical network topology (a /30 became a /29), I have a problem
that I am unable to solve.

NOTE: Real IP addresses have been replaced with fake ones since this is
a production network.

Here is a small diagram that may help to picture the setup:
[area 0] 192.168.250.0/26 <--- router-id 192.168.250.37 ---> [area 192.168.250.200] 192.168.250.200/29

The problem is that the announcement of area 192.168.250.200 (now
192.168.250.200/29 but previously 192.168.250.200/30) into area 0 is
inconsistent. Basically, its "Link ID" and "Route" change from time to
time, which causes the route to flap for machines on the backbone
(area 0) trying to reach this network.
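
To make the address change concrete, here is a minimal Python sketch
using only the standard ipaddress module (the variable names are just
for illustration). It shows that the old /30 and the new /29 start at
the same network address and that the /30 lies entirely inside the /29.

```python
import ipaddress

old_net = ipaddress.ip_network("192.168.250.200/30")  # before the change
new_net = ipaddress.ip_network("192.168.250.200/29")  # after the change

# Both prefixes start at the same network address.
same_start = old_net.network_address == new_net.network_address

# The old /30 is fully contained in the new /29.
contained = old_net.subnet_of(new_net)

print("same network address:", same_start)
print("old /30 inside new /29:", contained)
print("broadcast addresses:", old_net.broadcast_address, new_net.broadcast_address)
```

So the two prefixes differ only in their mask, not in their network
address.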

Last but not least, I'm using Quagga 0.99.5 (Debian Etch version) on all
member routers of area 192.168.250.200. In area 0 there is a mix of
0.98.3 and 0.99.5, but since this is not where the problem is coming
from, I think we can concentrate on the newer release.

Here are the interesting bits and pieces of the config:

--- zebra # show ip ospf interface eth1
eth1 is up
  ifindex 5, MTU 1500 bytes, BW 0 Kbit <UP,BROADCAST,RUNNING,MULTICAST>
  Internet Address 192.168.250.201/29, Broadcast 192.168.250.207, Area 192.168.250.200
  MTU mismatch detection:enabled
  Router ID 192.168.250.37, Network Type BROADCAST, Cost: 10
  Transmit Delay is 1 sec, State DROther, Priority 1
  Designated Router (ID) 192.168.250.204, Interface Address 192.168.250.204
  Backup Designated Router (ID) 192.168.250.202, Interface Address 192.168.250.202
  Multicast group memberships: OSPFAllRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 4.074s
  Neighbor Count is 2, Adjacent neighbor count is 2

--- ospfd.conf

router ospf
ospf router-id 192.168.250.37
redistribute connected route-map OPTLN
redistribute static route-map CERT
passive-interface eth4
passive-interface ppp0
network 192.168.250.0/26 area 0.0.0.0
network 192.168.250.136/29 area 192.168.250.136
network 192.168.250.144/28 area 192.168.250.144
network 192.168.250.160/29 area 192.168.250.160
network 192.168.250.200/29 area 192.168.250.200
!
access-list OPTLN permit 192.168.250.0/23
access-list OPTLN deny any
!
route-map OPTLN permit 5
match ip address OPTLN
!

--- interesting info from "show ip ospf database self-originate"

       OSPF Router with ID (192.168.250.37)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age  Seq#       CkSum  Link count
192.168.250.37   192.168.250.37     68 0x80001ab6 0x14e6 1

                Summary Link States (Area 0.0.0.0)

Link ID         ADV Router      Age  Seq#       CkSum  Route

192.168.250.136  192.168.250.37   1154 0x8000196b 0x839f 192.168.250.136/29
192.168.250.144  192.168.250.37    224 0x8000196b 0x0320 192.168.250.144/28
192.168.250.160  192.168.250.37   1524 0x80001a94 0x3da3 192.168.250.160/29
192.168.250.207  192.168.250.37    124 0x80000e8c 0x992c 192.168.250.200/29

                Summary Link States (Area 192.168.250.136)

Link ID         ADV Router      Age  Seq#       CkSum  Route
192.168.250.144  192.168.250.37   1116 0x80000021 0xe2a3 192.168.250.144/28
192.168.250.160  192.168.250.37   1266 0x80000021 0x72fb 192.168.250.160/29
192.168.250.200  192.168.250.37   1106 0x80000020 0xe264 192.168.250.200/29

                Summary Link States (Area 192.168.250.160)

Link ID         ADV Router      Age  Seq#       CkSum  Route
192.168.250.136  192.168.250.37    403 0x80000081 0xa283 192.168.250.136/29
192.168.250.144  192.168.250.37   1213 0x80000022 0xe0a4 192.168.250.144/28
192.168.250.200  192.168.250.37   1533 0x80000021 0xe065 192.168.250.200/29

--- Comments on last command
Within area 0, the Link ID is 192.168.250.207 with a route of
192.168.250.200/29; notice the incorrect Link ID "192.168.250.207".
Within the other areas, the Link ID is 192.168.250.200 with a route of
192.168.250.200/29, which is correct.

Sometimes the announcement in area 0 has other values, such as:
192.168.250.200  192.168.250.37   3600 0x80001fc2 0x5829 192.168.250.200/30
or even a correct one:
192.168.250.200  192.168.250.37    233 0x80000da0 0x743e 192.168.250.200/29
but they do not stay long, and the incorrect one shown in the first
example comes back:
192.168.250.207  192.168.250.37    124 0x80000e8c 0x992c 192.168.250.200/29
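
One observation on these rows: 192.168.250.207 is the network address
192.168.250.200 with all host bits of the /29 set, i.e. the /29
broadcast address (the same value eth1 reports as Broadcast). RFC 2328
Appendix E describes assigning a Link State ID with the host bits set
when a router advertises two networks that share an address but differ
in mask, so this pattern may point to the old /30 and the new /29 being
tracked at the same time. That is only a hypothesis, not a confirmed
diagnosis. The sketch below is plain Python (standard library only);
classify_link_id is a made-up helper for illustration, not part of
Quagga.

```python
import ipaddress

def classify_link_id(link_id: str, route: str) -> str:
    """Describe how a summary-LSA Link ID relates to its advertised route."""
    net = ipaddress.ip_network(route)
    addr = ipaddress.ip_address(link_id)
    if addr == net.network_address:
        return "network address"
    if addr == net.broadcast_address:
        return "host bits set"
    return "unrelated"

# (Link ID, Route) pairs copied from the area 0 rows quoted in this post.
rows = [
    ("192.168.250.207", "192.168.250.200/29"),
    ("192.168.250.200", "192.168.250.200/30"),
    ("192.168.250.200", "192.168.250.200/29"),
]

results = [classify_link_id(link_id, route) for link_id, route in rows]
for (link_id, route), kind in zip(rows, results):
    print(f"{link_id:<16} {route:<20} {kind}")
```

Only the first row, the one that keeps coming back, uses the
host-bits-set form; the other two use the plain network address.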

What I see very often in the OSPF logs (with minimal debugging) of all
area 0 routers are lines like the following:
2008/09/25 16:33:39 OSPF: Link State Update[Type3,id(192.168.250.200),ar(192.168.250.37)]: LS age is equal to MaxAge.
2008/09/25 16:38:16 OSPF: Link State Update[Type3,id(192.168.250.207),ar(192.168.250.37)]: LS age is equal to MaxAge.

All this causes the route to the 192.168.250.200/29 network to be
removed for a short time on all area 0 routers, and only in this area;
in the other areas everything works fine.
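
To see how often each Link ID is flushed, the MaxAge messages can be
counted per LSA. MaxAge is 3600 seconds in RFC 2328, which matches the
Age of 3600 on the /30 row quoted earlier. The following is a minimal
Python sketch, assuming the log format is exactly as in the excerpt
above; count_maxage and the regular expression are illustrative, not
part of Quagga.

```python
import re
from collections import Counter

# Sample lines from the log excerpt in this post. The pattern below
# tolerates a log line being wrapped across several lines.
LOG = """\
2008/09/25 16:33:39 OSPF: Link State
Update[Type3,id(192.168.250.200),ar(192.168.250.37)]: LS age is equal to
MaxAge.
2008/09/25 16:38:16 OSPF: Link State
Update[Type3,id(192.168.250.207),ar(192.168.250.37)]: LS age is equal to
MaxAge.
"""

PATTERN = re.compile(
    r"Update\[Type(\d+),id\(([\d.]+)\),ar\(([\d.]+)\)\]:"
    r"\s+LS\s+age\s+is\s+equal\s+to\s+MaxAge"
)

def count_maxage(text: str) -> Counter:
    """Count MaxAge updates per (LSA type, Link ID, advertising router)."""
    return Counter(
        (int(lsa_type), link_id, adv_router)
        for lsa_type, link_id, adv_router in PATTERN.findall(text)
    )

counts = count_maxage(LOG)
for (lsa_type, link_id, adv_router), n in sorted(counts.items()):
    print(f"type {lsa_type} id {link_id} adv {adv_router}: {n}")
```

Run against a full ospfd log, the per-ID counts would show whether the
two Link IDs are flushed alternately.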

--- Self Analysis and Conclusion ---

For some reason, there seem to be several overlapping pieces of
information about the 192.168.250.200/29 network in OSPF router
192.168.250.37. I have tried removing the "network 192.168.250.200/29
area 192.168.250.200" statement from all routers of this area and
putting it back, and I have tried disabling the interface in zebra and
re-enabling it, but nothing has changed; I still have the problem.

I would be glad if someone could help me find out what is wrong. Maybe
there is a bug involved, but my OSPF skills are not strong enough to
say so with confidence.

Best regards, Nick.





