[quagga-dev 11579] Re: [PATCH] BGP: add aspath_aggregate_mpath that preserves path length

Paul Jakma paul at jakma.org
Tue Oct 14 14:21:56 BST 2014

On Sat, 11 Oct 2014, Boian Bonev wrote:

> I started fiddling around with BGP ECMP a long time ago because of a quite 
> non-trivial real life problem - balancing two non-diverse upstream ISPs 
> without having to manually set path preference by individual AS (for all 
> live ASes). By diverse I mean offering different best paths.
> In many cases it is common that all upstream tier2 ISPs will have direct 
> connection to the same set of tier1s thus advertising the same set of 
> bestpaths. This may easily go over 50% of all routes. If BGP is set up to 
> select between peers by itself (without any forced preference) then most 
> of the outgoing bandwidth will go to the longest established session, 
> most probably saturating this link while the others are mostly idle. If 
> that session flaps, then it will swap with next session, but the overall 
> result will be the same.

Ah, interesting. Ok, so really ECMP at a forwarding level driving this.

There has been some support added to Quagga for ECMP in BGP. I haven't 
played with it yet though.

I take it you're also trying to re-advertise these paths to other plain 
BGP speakers, given your patch?

> Without ECMP the most trivial way is to set different preference to two 
> upstreams by dividing originating AS e.g. to odd/even. Then you will 
> discover that odd ASes generate nearly 2 times more traffic than even 
> ones which is quite odd by itself. Trying to keep the scales near flat 
> will lead to a day to day increasing preference lists that will become 
> unmanageable soon.

There are likely power laws at work, meaning it will always be hard to 
find a balance that way. Indeed, it could even be impossible (one AS might 
be responsible for a very disproportionate amount of traffic).
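
To make that concrete, here is a rough Python sketch. Everything in it is 
an assumption for illustration: the function names are made up, and the 
traffic figures are a synthetic Zipf-like distribution (AS number k gets 
traffic proportional to 1/k), not measurements from any real network.

```python
def zipf_traffic(n_ases, exponent=1.0):
    """Hypothetical traffic model: AS number k carries 1 / k**exponent units."""
    return {asn: 1.0 / asn ** exponent for asn in range(1, n_ases + 1)}


def split_by_parity(traffic_by_asn):
    """Sum traffic for odd and even origin AS numbers (the odd/even policy)."""
    odd = sum(t for asn, t in traffic_by_asn.items() if asn % 2 == 1)
    even = sum(t for asn, t in traffic_by_asn.items() if asn % 2 == 0)
    return odd, even


def largest_share(traffic_by_asn):
    """Fraction of total traffic carried by the single busiest AS."""
    total = sum(traffic_by_asn.values())
    return max(traffic_by_asn.values()) / total


if __name__ == "__main__":
    odd, even = split_by_parity(zipf_traffic(1000))
    print("odd/even traffic ratio: %.2f" % (odd / even))
    # If one AS carries more than half of everything, no per-AS split
    # between two upstreams can ever be even.
    print("busiest AS share: %.2f" % largest_share({1: 60, 2: 25, 3: 15}))
```

With a heavy-tailed distribution, whichever side gets the few biggest ASes 
ends up heavier, and when largest_share() is above 0.5 no per-AS 
preference list can balance two links at all.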

> It looks interesting to me in the route-server case where (EC)MP may get
> propagated from an IXP to more than one router in an AS so each one can
> make different decision based on its policy/proximity/etc. But then what
> about the route-map syntax that can match and manipulate multiple paths...

That will need to be extended, yes.

> I believe that the interop with non-ECMP speakers will be important at 
> least until IPv6 gets wider adoption, so most of the routers will have 
> to be changed/upgraded anyways and will hopefully have some Add-Path 
> support as a side effect :)


> I have hit the scaling issues and can say that for current Q they are 
> present in non route-server setups as well - just put 6 full table 
> upstreams, 4 sessions with 200k routes and 50 downlinks most of which 
> get only default route and the only way to make it tick is to enable one 
> session every 10 minutes. Afterwards if an upstream flaps, all sessions 
> will start oscillating in established-timeout cycle because processing 
> will take more than 5 minutes and more sessions will flap causing others 
> to flap too... In a route server scenario it is much worse.

We'll have to try to fix that, so.

Hacks are possible: run multiple bgpds, each listening on a separate IP, 
and split the load over them. Then have a non-public-facing bgpd 
integrate all the routes and do best-path selection (if you have a 
config/control system that can take care of the config overhead). That 
also needs something like Add-Path to avoid MED problems.
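
As a rough sketch of the "splitting the load" part only, the following 
Python assigns peers to bgpd instances greedily, biggest peers first, each 
to the currently lightest instance. The peer names, the route counts, and 
the assign_peers() helper are all hypothetical, and the 500k figure for a 
full table is an assumption. It does not touch the integrating bgpd or the 
Add-Path/MED issue.

```python
import heapq


def assign_peers(peers, n_instances):
    """Greedy assignment of peers (name -> route count) to bgpd instances.

    Peers are taken largest first and each goes to the instance with the
    smallest route total so far. Returns {instance index: [peer names]}.
    """
    heap = [(0, i) for i in range(n_instances)]  # (routes so far, instance)
    heapq.heapify(heap)
    assignment = {i: [] for i in range(n_instances)}
    by_size = sorted(peers.items(), key=lambda kv: kv[1], reverse=True)
    for name, routes in by_size:
        load, idx = heapq.heappop(heap)
        assignment[idx].append(name)
        heapq.heappush(heap, (load + routes, idx))
    return assignment


if __name__ == "__main__":
    # Hypothetical peer set loosely modelled on the setup described above:
    # 6 full-table upstreams (assumed 500k routes each) and 4 sessions
    # with 200k routes. Default-route-only downlinks are left out.
    peers = {"upstream%d" % i: 500000 for i in range(1, 7)}
    peers.update({"peer%d" % i: 200000 for i in range(1, 5)})
    for idx, names in assign_peers(peers, 3).items():
        print(idx, sum(peers[n] for n in names), names)
```

Route count is only a crude stand-in for processing load; a real split 
would also have to respect which peers need to see each other's routes.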

> I have also looked into the Euro-IX branch (and have successfully rebased 
> some patches) which mitigates some of these scalability issues, but the 
> differences from the main Q tree are so severe and unsplittable that I 
> see no easy way for a clean merge process without significant 
> refactoring.

Yes. They went away and did lots of work, but basically forked.

There is a question as to whether it'd be better to thread bgpd as EuroIX 
has done, or to just run slave bgpds as separate processes off a main 
bgpd. One obstacle to the latter would be dealing with config & control. 
I'm minded to the latter.

Paul Jakma	paul at jakma.org	@pjakma	Key ID: 64A2FF6A
A visit to a fresh place will bring strange work.
