[quagga-dev 3507] Re: IPv6 and netlink problems revisited

Paul Jakma paul at clubi.ie
Thu Jun 9 14:57:58 BST 2005


On Thu, 9 Jun 2005, Hasso Tepper wrote:

> if (rtm->rtm_protocol == RTPROT_ZEBRA && h->nlmsg_type == RTM_NEWROUTE)
>  return 0;
>
> If a route is flagged as a zebra route and it's new, we drop the 
> message. But we can't drop RTM_DELROUTE messages that way. A route 
> flagged as a zebra route can be deleted by the zebra daemon, but by 
> the user as well. If the zebra daemon deleted it, we should do 
> nothing - we know all the details already. But if the user deleted 
> it, we must pass it to the rib processing. So we need a way to find 
> out what caused the delete. For IPv4 routes it turns out that this 
> works:
>
> /* skip unsolicited messages originating from command socket */
> if (nl != &netlink_cmd && h->nlmsg_pid == netlink_cmd.snl.nl_pid)
>  {
>    if (IS_ZEBRA_DEBUG_KERNEL)
>      zlog_debug ("netlink_parse_info: %s packet comes from %s",
>                   nl->name, netlink_cmd.name);
>    continue;
>  }
>
> Note that the order of arguments is actually wrong in the 
> zlog_debug() call ;P. Anyway, this way we know that if we received 
> a message via the netlink-listen socket, but the pid is the pid of 
> the netlink_cmd socket, it was originated by the zebra daemon and 
> must be dropped (btw, the seq number is also the number we sent the 
> message with).
>
> But because of a buggy kernel it works for IPv4 only; IPv6 route 
> messages have pid 0 (originated by kernel) and seq number 0 as well.

Oh dear.
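
To make the failure concrete, the pid check can be reduced to a small 
predicate. This is only a sketch, not zebra's code: struct toy_msg, 
is_own_echo() and the pid value 4242 are hypothetical stand-ins for 
struct nlmsghdr, the `nl != &netlink_cmd` test and a real socket pid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two nlmsghdr fields discussed above. */
struct toy_msg
{
  uint32_t pid;   /* sender port id, as in nlmsg_pid */
  uint32_t seq;   /* sequence number, as in nlmsg_seq */
};

/* True when a message read from the listen socket carries the pid of
 * our own command socket, i.e. it is the echo of a request we sent. */
bool
is_own_echo (bool from_listen_socket, const struct toy_msg *m,
             uint32_t cmd_pid)
{
  return from_listen_socket && m->pid == cmd_pid;
}

/* Illustrative check: an IPv4-style echo is filtered, while an
 * IPv6-style echo (pid 0, seq 0, as reported above) is not. */
void
echo_filter_demo (void)
{
  const uint32_t cmd_pid = 4242;            /* assumed value */
  struct toy_msg v4_echo = { cmd_pid, 7 };
  struct toy_msg v6_echo = { 0, 0 };

  assert (is_own_echo (true, &v4_echo, cmd_pid));
  assert (!is_own_echo (true, &v6_echo, cmd_pid));
}
```

With pid 0 the IPv6 echo is indistinguishable from a delete done by 
anyone else, so it falls through to the rib processing.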

> One side note on why it's fatal ... Two commands are enough to reproduce it:
>
> ipv6 route dead::/64 x:x:x::1 20
> ipv6 route dead::/64 x:x:x::2


> After the second command we delete the first route from the kernel. 
> We receive an RTM_DELROUTE message from the kernel about the route 
> we just deleted and pass it to rib_delete_ipv6(). The prefix is 
> looked up and a fib route is found to exist (btw, why don't we 
> check the nexthop if it isn't ZEBRA_ROUTE_CONNECT),

Good question.

> but it isn't the same one => unset the FIB flag on all nexthops 
> and unset the active flag.

Possible solution:

- don't remove it from the zebra RIB
- let the DELROUTE 'echo' remove it

Would have to be netlink specific though.

> Process rib -> the same route we already have in the 
> kernel will be selected for FIB, but because it already exists in 
> the kernel, adding fails. As a result we have two dead::/64 in the 
> RIB, none of them with the FIB flag, but the kernel has one of them 
> in FIB => we are already fucked up.


> The RIB code contains a lot of other questionable code as well, but 
> it isn't fatal. For example, we don't check the return value of 
> kernel_delete_ipv6() in rib_uninstall_kernel() etc.
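
One possible shape for such a return-value check, as a toy sketch only: 
the names below are hypothetical, and the convention that the kernel 
delete returns 0 on success and a negative value on failure is an 
assumption, not something taken from the zebra source.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy route: only the "installed in FIB" bit matters here. */
struct toy_route
{
  bool fib;
};

/* Hypothetical stand-ins for kernel_delete_ipv6(); assumed to return 0
 * on success and a negative value on failure. */
int toy_kernel_delete_ok (struct toy_route *r) { (void) r; return 0; }
int toy_kernel_delete_fail (struct toy_route *r) { (void) r; return -1; }

typedef int (*toy_kernel_delete_fn) (struct toy_route *);

/* Clear the FIB flag only when the kernel confirmed the delete, so the
 * RIB does not claim a state the kernel never reached. */
int
toy_uninstall_kernel (struct toy_route *r, toy_kernel_delete_fn del)
{
  int ret = del (r);

  if (ret < 0)
    return ret;       /* kernel still has the route: keep the flag */
  r->fib = false;
  return 0;
}

/* Illustrative check of both outcomes. */
void
toy_uninstall_demo (void)
{
  struct toy_route r = { true };

  assert (toy_uninstall_kernel (&r, toy_kernel_delete_fail) < 0);
  assert (r.fib);
  assert (toy_uninstall_kernel (&r, toy_kernel_delete_ok) == 0);
  assert (!r.fib);
}
```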

> Solution
> ********
>
> Fix the kernel. There can't be a better solution. If anyone has the 
> knowledge and time and can come up with a patch, it would be really 
> welcome. If no one comes up with a patch, I will bug the kernel 
> developers at the weekend; I will be away for the next few days. 
> Therefore don't expect any answers from me to this mail either. I 
> think I made it quite clear where the problem is and why ;).

Can we detect this in the RIB?

We could do it this way:

rib_delete_ipv6 tests a flag (REMOVING)
 	if set -> remove the route
 	else -> set the flag

Then the initial rib_delete_ipv6 call will just set the flag; when 
it's called again because of the reflected DELROUTE, it will actually 
remove the route.

This will break if the user removes a zebra route outside of zebra, I 
think. But hey.

Something along the lines of the (completely untested) attached 
patch.
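
Stripped of the zebra details, the two-phase delete is roughly the 
following. This is a toy model, not the patch: struct toy_route, 
toy_rib_delete() and TOY_FLAG_REMOVING are hypothetical names that only 
mirror the idea of the REMOVING flag.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FLAG_REMOVING (1 << 3)   /* mirrors the proposed flag bit */

/* Toy RIB entry: a flags byte plus "still present in the RIB". */
struct toy_route
{
  unsigned char flags;
  bool in_rib;
};

/* Returns true if the route was actually removed on this call.
 * First call: only mark the route.  Second call (the reflected
 * DELROUTE): remove it for real. */
bool
toy_rib_delete (struct toy_route *r)
{
  if (!r->in_rib)
    return false;

  if (!(r->flags & TOY_FLAG_REMOVING))
    {
      r->flags |= TOY_FLAG_REMOVING;
      return false;
    }

  r->in_rib = false;
  return true;
}

/* Illustrative check: zebra's own delete marks, the echo removes. */
void
toy_delete_demo (void)
{
  struct toy_route r = { 0, true };

  assert (!toy_rib_delete (&r));   /* zebra's own delete: flag only */
  assert (r.in_rib);
  assert (toy_rib_delete (&r));    /* kernel echo: real removal */
  assert (!r.in_rib);
}
```

The weak spot is visible in the model too: a delete done outside of 
zebra produces only one call, so the route stays in the RIB with the 
flag set.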

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
Fortune:
For every bloke who makes his mark, there's half a dozen waiting to rub it out.
 		-- Andy Capp
-------------- next part --------------
Index: rib.h
===================================================================
RCS file: /var/cvsroot/quagga/zebra/rib.h,v
retrieving revision 1.6
diff -u -p -r1.6 rib.h
--- rib.h	28 Apr 2005 17:35:14 -0000	1.6
+++ rib.h	9 Jun 2005 13:56:40 -0000
@@ -151,6 +151,7 @@ struct nexthop
 #define NEXTHOP_FLAG_ACTIVE     (1 << 0) /* This nexthop is alive. */
 #define NEXTHOP_FLAG_FIB        (1 << 1) /* FIB nexthop. */
 #define NEXTHOP_FLAG_RECURSIVE  (1 << 2) /* Recursive nexthop. */
+#define NEXTHOP_FLAG_REMOVING	(1 << 3) /* nexthop is being removed */
 
   /* Interface index. */
   unsigned int ifindex;
Index: zebra_rib.c
===================================================================
RCS file: /var/cvsroot/quagga/zebra/zebra_rib.c,v
retrieving revision 1.20
diff -u -p -r1.20 zebra_rib.c
--- zebra_rib.c	28 Apr 2005 17:35:14 -0000	1.20
+++ zebra_rib.c	9 Jun 2005 13:56:40 -0000
@@ -1956,11 +1956,37 @@ rib_delete_ipv6 (int type, int flags, st
 	}
     }
 
-  /* Process changes. */
-  rib_queue_add (&zebrad, rn, same);
 
   if (same)
-    rib_delnode (rn, same);
+    {
+#ifdef HAVE_NETLINK
+      /* We have to deal with an IPv6 netlink bug: Linux doesn't set the
+       * netlink pid correctly, so when we delete a route we are not able to
+       * filter out the reflected DELROUTE which we will receive on the
+       * netlink-listen socket.
+       *
+       * To deal with this, we /don't/ delete the zebra route initially,
+       * instead we set a REMOVING flag, remove from kernel and then let the
+       * receipt of the kernel's echo of the DELROUTE get us here again,
+       * where we remove the route for real.
+       *
+       * Once Linux is fixed for IPv6 netlink, this hack should be removed.
+       */
+      if ( !CHECK_FLAG (rib->nexthop->flags, NEXTHOP_FLAG_REMOVING) )
+        SET_FLAG (rib->nexthop->flags, NEXTHOP_FLAG_REMOVING);
+      else
+#endif /* HAVE_NETLINK */
+        {
+          rib_delnode (rn, same);
+          /* Process changes. */
+          rib_queue_add (&zebrad, rn, same);
+        }
+    }
+  else
+    {
+      /* Process changes. */
+      rib_queue_add (&zebrad, rn, same);
+    }
   
   route_unlock_node (rn);
   return 0;

