[quagga-dev 12073] Re: Advise on implementation

David Lamparter equinox at opensourcerouting.org
Mon Mar 2 07:18:05 GMT 2015

On Wed, Feb 18, 2015 at 12:58:22PM +0100, Olivier Dugeon wrote:
> To complement the TE work we have already submitted, we would like to
> implement the BGP Link State extension (see
> https://datatracker.ietf.org/doc/draft-ietf-idr-ls-distribution). For
> that purpose, we need inter-process communication with the OSPFd and
> ISISd processes.
> The same is needed to implement a Path Computation Element
> (PCE - RFC 4655). The primary goal is to exchange database contents,
> in particular OSPF LSAs and IS-IS LSPs including TE information.

*sigh*  this has been coming for a long time - the IPC protocol between
zebra and the daemons has needed to be extended (or even overhauled?)
for a long time.

Let me try to pull together a list of things that can't be done with the
current ZAPI socket protocol:
- LS distribution & PCE
- BFD peer status signalling & automatic session creation
- exchanging MPLS labels
- exchanging VPN route information (both intra- and inter-VRF)
- matching on route properties from another daemon when redistributing
- ... probably even more stuff I forgot

Some of these can probably be added into the existing protocol, but in
general what we have now can be described as anything but extensible.

I'm not saying you need to support all of these - I'm saying we need to
address extensibility.
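
One common way to get this kind of extensibility is type-length-value
(TLV) framing, where a receiver skips record types it does not know.
The following is a minimal sketch under that assumption. It is not the
ZAPI wire format; the record types and the names tlv_put and tlv_find
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record types for illustration; not part of ZAPI. */
enum { TLV_ROUTE = 1, TLV_MPLS_LABEL = 2, TLV_BFD_STATUS = 3 };

/* Append one record: 2-byte type, 2-byte length (big-endian), value.
 * Returns the new write offset, or 0 if the record does not fit. */
static size_t tlv_put(uint8_t *buf, size_t off, size_t cap,
                      uint16_t type, const void *val, uint16_t len)
{
    if (off > cap || cap - off < (size_t)len + 4)
        return 0;
    buf[off + 0] = (uint8_t)(type >> 8);
    buf[off + 1] = (uint8_t)(type & 0xff);
    buf[off + 2] = (uint8_t)(len >> 8);
    buf[off + 3] = (uint8_t)(len & 0xff);
    if (len)
        memcpy(buf + off + 4, val, len);
    return off + 4 + len;
}

/* Find the first record of the wanted type, skipping all others.
 * Returns a pointer to the value and sets *len, or NULL if the type
 * is absent or the buffer is truncated. */
static const uint8_t *tlv_find(const uint8_t *buf, size_t size,
                               uint16_t wanted, uint16_t *len)
{
    size_t off = 0;

    while (size - off >= 4) {
        uint16_t type = (uint16_t)((buf[off] << 8) | buf[off + 1]);
        uint16_t l = (uint16_t)((buf[off + 2] << 8) | buf[off + 3]);

        off += 4;
        if (size - off < l)
            return NULL;        /* truncated record */
        if (type == wanted) {
            *len = l;
            return buf + off;
        }
        off += l;               /* unknown or unwanted type: skip it */
    }
    return NULL;
}
```

With this framing, a daemon that only understands TLV_BFD_STATUS can
still parse a message that also carries TLV_MPLS_LABEL records; it
steps over them using the length field.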

> 1/ Extend the Zebra protocol. Vincent Jardin already pointed out to me
> that this is not a good option, as the Zebra protocol and the Zebra
> daemon are heavily used for VPN, and adding more traffic will have a
> bad effect on performance. But as it will be used only in a particular
> case, perhaps it is not an issue.
> 2/ Move the OSPF and ISIS databases from user space to Shared Memory
> space. Such an architecture lets other processes / threads access the
> database in read-only mode, but what will the impact be in terms of
> performance, especially with a large database? In addition, it does
> not give the possibility to send commands to another process the way
> the OSPF_API does.
> 3/ Implement a dedicated bus/protocol similar to the Zebra one, using
> sockets. Part of the code could be reused (from Zebra and OSPF_API),
> but, like the Zebra protocol, it copies data in memory intensively (at
> least 4 copies to transfer a message to one process). Again, with a
> large database, there could be performance issues.
> 4/ Implement a dedicated bus using Shared Memory and Semaphore/Mutex
> to control read/write access to the bus. This option reduces the
> number of times we copy data in memory (copy once, read multiple) but
> introduces more complexity, as we need to synchronise threads and
> processes, which could be hard to debug. The objective is to add a
> dedicated thread per daemon to manage the bus, which will not disturb
> other threads in case of a lock. If it is powerful and provides good
> performance, it could be a candidate to replace the socket-based Zebra
> communication to improve performance.

There are 2 independent questions here:
- should this be a separate communication channel or should it be
  integrated with zebra communications?
- what transport medium should this use, shm or socket?

Your options match up mostly (though not exactly):
1) = "integrated, socket"
2) = "separate,   shm"
3) = "separate,   socket"
4) = "integrated, shm"

I don't have a well-founded opinion on what to do (yet), though I'd like
to make the following arguments:
- shm is not necessarily *noticeably* faster than sockets.  Sure it
  saves some copying and kernel calls, but if the overhead goes from 2%
  to 1.5% you haven't won much.
- shm should still use a well-isolated API/wrappers.  In fact I'd argue
  the API should be the same between sockets or shm.  Accessing shm
  directly without such wrappers is a recipe for crashes.
- shm doesn't imply locking.  Particularly, RCU might help.
- socket protocols should probably use some "standard" external encoding
  library, simply to be more usable from other programming languages.
- I don't see much gain from forcing all communication through a single
  point, but I do think we should use some uniform encoding & mechanism.
  If you use shm-based messaging, we should probably use that
  everywhere.  Same if you use protobuf over sockets, it should be
  protobuf over sockets everywhere.
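
The "same API for sockets or shm" argument can be illustrated with a
small transport-neutral channel. This is a sketch only: struct ipc_chan
and every function name here are hypothetical, and the memory backend
is a single-slot, in-process mailbox standing in for a real shared
memory segment (it has no cross-process synchronisation at all).

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical transport-neutral channel: callers only use send/recv. */
struct ipc_chan {
    ssize_t (*send)(struct ipc_chan *c, const void *msg, size_t len);
    ssize_t (*recv)(struct ipc_chan *c, void *msg, size_t cap);
    int fd_tx, fd_rx;           /* socket backend; -1 when unused */
    unsigned char slot[256];    /* memory backend: one message slot */
    size_t slot_len;            /* 0 means the slot is empty */
};

static ssize_t sock_send(struct ipc_chan *c, const void *msg, size_t len)
{
    return send(c->fd_tx, msg, len, 0);
}

static ssize_t sock_recv(struct ipc_chan *c, void *msg, size_t cap)
{
    return recv(c->fd_rx, msg, cap, 0);
}

/* Socket backend: a datagram socketpair keeps message boundaries. */
static int ipc_open_socket(struct ipc_chan *c)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    memset(c, 0, sizeof *c);
    c->fd_tx = sv[0];
    c->fd_rx = sv[1];
    c->send = sock_send;
    c->recv = sock_recv;
    return 0;
}

static ssize_t mem_send(struct ipc_chan *c, const void *msg, size_t len)
{
    if (len == 0 || len > sizeof c->slot || c->slot_len != 0)
        return -1;              /* empty, too large, or slot busy */
    memcpy(c->slot, msg, len);
    c->slot_len = len;
    return (ssize_t)len;
}

static ssize_t mem_recv(struct ipc_chan *c, void *msg, size_t cap)
{
    size_t len = c->slot_len;

    if (len == 0 || len > cap)
        return -1;              /* nothing queued, or buffer too small */
    memcpy(msg, c->slot, len);
    c->slot_len = 0;
    return (ssize_t)len;
}

/* Memory backend: stand-in for shm, valid within one process only. */
static void ipc_open_memory(struct ipc_chan *c)
{
    memset(c, 0, sizeof *c);
    c->fd_tx = c->fd_rx = -1;
    c->send = mem_send;
    c->recv = mem_recv;
}

static void ipc_close(struct ipc_chan *c)
{
    if (c->fd_tx >= 0)
        close(c->fd_tx);
    if (c->fd_rx >= 0)
        close(c->fd_rx);
}

/* Caller code is identical for both backends. */
static ssize_t ipc_roundtrip(struct ipc_chan *c, const void *msg,
                             size_t len, void *out, size_t cap)
{
    if (c->send(c, msg, len) != (ssize_t)len)
        return -1;
    return c->recv(c, out, cap);
}
```

ipc_roundtrip never touches the slot or the file descriptors directly,
so swapping the backend does not change any caller, which is the point
of keeping the wrappers well isolated.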

NB: I'm not against SHM, but I do think SHM is more difficult to get
right, and it's not an automatic performance win.  I did some thinking
about a shared memory RCU-based replacement for ZAPI, but never had the
time to try that.  It probably *does* help moving Quagga towards
supporting multiple threads in the individual daemons.
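
For illustration, the read side of an RCU-style scheme can be sketched
with C11 atomics: a single writer builds a new immutable snapshot and
publishes it with one pointer store, and readers take no lock. This is
an in-process sketch with hypothetical names (struct snapshot,
snap_read, snap_update). It deliberately leaves out the grace-period
tracking that real RCU needs before old data is freed, and a
cross-process shm version would have to publish offsets rather than
raw pointers.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical published state; never modified after publication. */
struct snapshot {
    unsigned version;
    int metric;
};

/* Static storage, so this starts out as a null pointer. */
static _Atomic(struct snapshot *) current_snap;

/* Reader: one acquire load, no lock taken. */
static const struct snapshot *snap_read(void)
{
    return atomic_load_explicit(&current_snap, memory_order_acquire);
}

/* Single writer: build a fresh snapshot, then publish it with a
 * release store. The previous snapshot is handed back in *retired;
 * the caller may free it only once no reader can still hold it
 * (the grace period, which this sketch does not implement).
 * Returns 0 on success, -1 if allocation fails. */
static int snap_update(int metric, struct snapshot **retired)
{
    struct snapshot *old =
        atomic_load_explicit(&current_snap, memory_order_relaxed);
    struct snapshot *next = malloc(sizeof *next);

    if (next == NULL)
        return -1;
    next->version = old ? old->version + 1 : 1;
    next->metric = metric;
    atomic_store_explicit(&current_snap, next, memory_order_release);
    *retired = old;
    return 0;
}
```

A reader that fetched the old snapshot before an update keeps seeing
consistent (if stale) data, because the writer never edits a published
snapshot in place.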

  [quote moved]
> But such an exchange could be useful for other purposes like hot
> restart, monitoring ... OSPF already provides such a facility through
> the OSPF_API, but it is dedicated to OSPFd only and we need to
> generalize it to the other Quagga daemons. From this API, we would
> take the capability to send commands to a given process and get back
> some information, synchronously (answer to the command) or
> asynchronously (LSA/LSP update).
> We studied several options for the implementation and would like to
> get some advice from the community before we really start coding. Up
> to now, we have identified 4 options:
> Options 1 and 2 do not have our favour, but we are open to discussion.
> We hesitate between options 3 and 4 and would greatly appreciate some
> advice to help us make a decision.

To be honest, I think this will need to be "evaluated" instead of
"decided".  Pick one, prototype it with the least effort possible and
show it.  You will have gained some insights from implementing it, and
we'll know how it performs...

... ultimately, this may be something that needs doing by trial & error,
I'm afraid.

