[quagga-dev 260] Re: [quagga-users 491] Re: Hardware Spec for a route server

Paul Jakma paul at clubi.ie
Mon Sep 29 12:09:49 BST 2003

On Mon, 29 Sep 2003, Gilad Arnold wrote:

> (a lot of profiling output I can't read)
> What's that output? 


> Any legend / manpage I can refer to?

hmm.. the best thing to consult is the gprof info page -> pinfo gprof.

> What and how is it measuring a process' performance, wrt user time,
> kernel time and overall time?

it's solely user time, afaik. so it doesn't account for kernel or wall 
clock time.

> Doesn't seem too strange to me: since we're not deleting any
> addresses / connected routes, rib_update is not involved, hence
> add/del operations should be rather equivalent, don't you think?

guess so :)

> Yes, but you should also consider the context switch for read
> operation, the kernel overhead (buffer copying, socket management,
> etc),

yes, but it should be fairly efficient - unix sockets are /very/
efficient (e.g. they carry the X Window System protocol).

> and so on;  and all that, just because we don't use a "killall"
> message instead...

yeah, that would be useful.

> Well, bear in mind that using a bulk deletion is much faster in
> that sense: you don't do a prefix_match for each route (something
> like O(log n) each, O(n log n) total) but rather scan the whole
> tree once (i.e. O(n) total); so there's some benefit by that
> manner as well.

yes. that would save a good bit of work in itself.
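the asymptotic argument above can be made concrete with a toy step-count
model (this is purely illustrative - the function names and the
ceil(log2) lookup cost are my own rough assumptions, not Quagga code):

```python
import math

def per_route_deletes(n):
    """Rough step count: n separate deletes, each paying an
    O(log n) prefix_match-style lookup into the table."""
    return n * max(1, math.ceil(math.log2(n)))

def bulk_scan(n):
    """Rough step count: one pass over the whole tree, each node
    visited exactly once."""
    return n

for n in (1_000, 10_000, 100_000):
    print(n, per_route_deletes(n), bulk_scan(n))
```

for 10K routes the model gives 140,000 vs 10,000 steps - the same
order-of-magnitude gap the quoted paragraph describes.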

> Yes, should be very easy to do in zebra, less easy in protocol
> daemons (requires a specialized event for these cases, e.g. peer
> drop in bgpd, neighbor timeout in ospfd, etc).

those are the most stressful events. in normal operation, bgpd and 
zebra don't use /too/ much cpu. zebra can hover around 5 to 10% due 
to the steadyish update/withdrawal traffic found in the DFZ, bgpd 
around 1 to 4%. peer events are what cause problems, so that's where 
bulk deletion could /really/ be of use.

> The 'or' is there because section 2 is far too general, neither
> feasible nor required, IMO... (mainly I put it in order to better
> support my point ;-> but really, I don't think it's worth designing
> and implementing, not at the moment)


> As I said, the profiling conclusions make sense to me (that is, in
> and out are the same), so you may have a point here: I guess it
> could be easily tested with a simple script that adds some 10-20K
> static routes via iproute2, then deletes them, with proper
> timestamps. What do you say?

i'll give it a go.
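something along these lines, perhaps - a rough sketch of the test the
quoted paragraph suggests. the prefix scheme (10.x.y.0/24), the device
name "dummy0" and the function names are all my invention; running
main() needs root and a suitable interface:

```python
import subprocess
import time

def route_cmds(op, n, dev="dummy0"):
    """Build 'ip route add'/'ip route del' argv lists for n
    distinct /24 test prefixes under 10.0.0.0/8."""
    return [["ip", "route", op, f"10.{i >> 8}.{i & 255}.0/24",
             "dev", dev]
            for i in range(n)]

def timed(label, cmds):
    """Run each argv and report wall-clock elapsed time."""
    t0 = time.monotonic()
    for argv in cmds:
        subprocess.run(argv, check=True)
    dt = time.monotonic() - t0
    print(f"{label}: {len(cmds)} commands in {dt:.2f}s")
    return dt

def main(n=10_000):
    # "some 10-20K static routes ... then deletes them"
    timed("add", route_cmds("add", n))
    timed("del", route_cmds("del", n))
```

note that forking one ip(8) process per route will dominate the
timings; if your iproute2 has a batch mode, feeding it one file of
commands per phase would measure the kernel-side cost more fairly.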

> Okay, so that's not a problem (bgpd doesn't drop peers anymore?
> very nice!)

I didn't say that :)

My bgpd patch is still WIP - but i'd like some feedback on the 
workqueue patch, and some testing of the zebra rib_process -> work 
queue patch. Some very rough testing suggests it works (at least it 
survived a day of running with a bgpd client that had a live feed), 
but i'd like to make sure the work-queue side of things is solid - 
no one else has tested it yet.

but that patch provides the framework by which we can eliminate 
bgpd's peer-drop problem, yes.

another nasty bgpd problem (i /suspect/): type 'show ip bgp' so that
you have the beginnings of your large bgp table displayed to you in a 
pager. now leave it that way. I /suspect/ this blocks bgpd. 

> >  add prefix y via x, sequence 1.
> >  delete prefix y, seq. 2 
> >  add prefix y via z, seq 3.
> >  <ack arrives for 1>
> >  reconcile 1
> >  <ack 3>
> >  reconcile 3.
> >  .
> >  .
> >  .
> >  scan pending-acks, find seq 2 unacked and due for retransmit.
> >  delete prefix y, seq. 4.
> > 
> > prefix y is gone.

> Good example, although this isn't the real problem, IMO: the
> vanishing of prefix y could be prevented rather easily if we used
> explicit pointers to RIB elements as the callback info -- when the
> handler receives an ack/nack, all it needs to do is raise the 'fib'
> flag of the corresponding nexthop elements. 

i don't follow.. 

> Nonetheless, it is
> probably required that any further modifications to that route
> node's routes should be delayed until that ack arrives,


> and how this is to be implemented is a good question -- locally, as
> a per-node waiting list (probably most efficient, in terms of
> utilizing kernel delays), or globally, as a RIB-wide waiting list
> (simpler, yet inefficient)? In other words, there needs to be some
> locking and queueing for non-acked routes.

well, the work queue supports actions. the 'work function' (the
function supplied by whoever set up the queue, to process each item
in the queue) can return REQUEUE to tell the work queue layer to
simply requeue this item at the back of the queue. so the work
function would first look at the node: if an ack is pending ->
return REQUEUE, and the work queue code takes care of requeueing.
alternatively, it can return ERROR, and the work queue code will run
the error handler which was specified for the queue.
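the actual work-queue patch is C, but the requeue semantics described
above can be modelled in a few lines of Python (everything here - the
status constants, run_queue, the 'acks_needed' field - is an invented
toy, not the patch's API):

```python
from collections import deque

# possible work-function results, mirroring OK / REQUEUE / ERROR
OK, REQUEUE, ERROR = range(3)

def run_queue(items, workfunc, error_handler=None):
    """Drain a queue: REQUEUE puts the item at the back, ERROR
    invokes the queue's error handler, OK drops the item."""
    q = deque(items)
    while q:
        item = q.popleft()
        status = workfunc(item)
        if status == REQUEUE:
            q.append(item)        # ack still pending: try again later
        elif status == ERROR and error_handler:
            error_handler(item)

# toy route nodes: 'acks_needed' counts passes until the ack arrives
nodes = [{"prefix": "10.0.0.0/24", "acks_needed": 2, "done": False},
         {"prefix": "10.0.1.0/24", "acks_needed": 0, "done": False}]

def process_node(node):
    if node["acks_needed"] > 0:   # ack pending -> back of the queue
        node["acks_needed"] -= 1
        return REQUEUE
    node["done"] = True
    return OK

run_queue(nodes, process_node)
```

the node with a pending ack simply cycles to the back of the queue
until its ack arrives, while other nodes keep being processed - which
is the per-node waiting behaviour the quoted paragraph asks about.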

> No, just process the next route update... ;->  (maybe I misunderstood 
> the question?)

that was the question :)

> As mentioned, Monday morning after a long weekend.


> Gilad

Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
	warning: do not ever send email to spam at dishone.st
Nothing is rich but the inexhaustible wealth of nature.
She shows us only surfaces, but she is a million fathoms deep.
		-- Ralph Waldo Emerson
