[quagga-dev 11579] Re: [PATCH] BGP: add aspath_aggregate_mpath that preserves path length
paul at jakma.org
Tue Oct 14 14:21:56 BST 2014
On Sat, 11 Oct 2014, Boian Bonev wrote:
> I started fiddling around with BGP ECMP a long time ago because of a quite
> non-trivial real life problem - balancing two non-diverse upstream ISPs
> without having to manually set path preference by individual AS (for all
> live ASes). By diverse I mean offering different best paths.
> In many cases it is common that all upstream tier2 ISPs will have direct
> connection to the same set of tier1s thus advertising the same set of
> bestpaths. This may easily go over 50% of all routes. If BGP is set up to
> select between peers on its own (without any forced preference) then most
> of the outgoing bandwidth will go to the longest established session,
> most probably saturating this link while the others are mostly idle. If
> that session flaps, then it will swap with next session, but the overall
> result will be the same.
Ah, interesting. OK, so it's really ECMP at the forwarding level driving this.
There has been some support added to Quagga for ECMP in BGP. I haven't
played with it yet though.
I take it you're also trying to re-advertise these paths to other plain
BGP speakers, given your patch?
> Without ECMP the most trivial way is to set a different preference for the
> two upstreams by dividing originating ASes, e.g. into odd/even. Then you will
> discover that odd ASes generate nearly 2 times more traffic than even
> ones, which is quite odd by itself. Trying to keep the scales near flat
> will lead to preference lists that grow day by day and soon become
> unmanageable.
There are likely power laws at work, meaning it will always be hard to
find a balance that way. Indeed, it could even be impossible (1 AS might
be responsible for a very disproportionate amount of traffic).
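
To make the "could even be impossible" point concrete, here is a minimal
sketch in Python. The traffic figures are invented for illustration and the
AS numbers come from the documentation range; neither is taken from any real
measurement. The helper names are hypothetical. It shows an odd/even split of
origin ASes, and a lower bound on the imbalance that any per-AS split between
two links can reach.

```python
def split_by_parity(traffic_by_asn):
    """Sum traffic per upstream when odd origin ASNs prefer one
    upstream and even origin ASNs prefer the other."""
    totals = {"odd": 0, "even": 0}
    for asn, traffic in traffic_by_asn.items():
        totals["odd" if asn % 2 else "even"] += traffic
    return totals


def min_possible_gap(traffic_by_asn):
    """Lower bound on |link A - link B| for ANY assignment of whole
    ASes to two links: the heaviest AS cannot be divided."""
    total = sum(traffic_by_asn.values())
    heaviest = max(traffic_by_asn.values())
    return max(0, 2 * heaviest - total)


# Hypothetical traffic shares (percent) keyed by origin ASN.
traffic = {64496: 60, 64497: 15, 64498: 15, 64499: 10}

print(split_by_parity(traffic))
print(min_possible_gap(traffic))
```

With these made-up numbers one AS carries 60% of the traffic, so no
preference list keyed on origin AS can do better than a 60/40 split, however
long it grows.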
> It looks interesting to me in the route-server case where (EC)MP may get
> propagated from an IXP to more than one router in an AS so each one can
> make a different decision based on its policy/proximity/etc. But then what
> about the route-map syntax that can match and manipulate multiple paths...
That will need to be extended, yes.
> I believe that the interop with non-ECMP speakers will be important at
> least until IPv6 gets wider adoption, so most of the routers will have
> to be changed/upgraded anyways and will hopefully have some Add-Path
> support as a side effect :)
> I have hit the scaling issues and can say that for current Q they are
> present in non route-server setups as well - just put 6 full table
> upstreams, 4 sessions with 200k routes and 50 downlinks most of which
> get only default route and the only way to make it tick is to enable one
> session every 10 minutes. Afterwards if an upstream flaps, all sessions
> will start oscillating in established-timeout cycle because processing
> will take more than 5 minutes and more sessions will flap causing others
> to flap too... In a route server scenario it is much worse.
We'll have to try to fix that, so.
Hacks are possible: run multiple bgpds, each listening on a separate IP,
and split the load over them. Then have a non-public-facing bgpd integrate
all the routes and do best-path selection (if you have a config/control
system that can take care of the config overhead). That also needs
something like Add-Path to avoid MED problems.
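
As a rough illustration of the load-splitting part only, the sketch below
assigns peers to a fixed number of front-end bgpd instances so that the
number of routes each instance has to process is roughly even. It is a
greedy heuristic (largest peer first, onto the least-loaded instance), not
anything Quagga provides. The function name, the peer names, and the figure
of 500k routes for a full table are all assumptions made for the example.

```python
import heapq


def assign_peers(route_counts, n_instances):
    """Greedily place peers on bgpd instances, largest peer first,
    always choosing the instance with the fewest routes so far."""
    heap = [(0, i) for i in range(n_instances)]  # (routes so far, instance)
    heapq.heapify(heap)
    assignment = {i: [] for i in range(n_instances)}
    loads = [0] * n_instances
    by_size = sorted(route_counts.items(), key=lambda kv: kv[1], reverse=True)
    for peer, routes in by_size:
        load, idx = heapq.heappop(heap)
        assignment[idx].append(peer)
        loads[idx] = load + routes
        heapq.heappush(heap, (loads[idx], idx))
    return assignment, loads


# Hypothetical peers: 6 full-table upstreams (assumed 500k routes each)
# and 4 sessions with 200k routes each.
peers = {f"upstream{i}": 500_000 for i in range(1, 7)}
peers.update({f"peer{i}": 200_000 for i in range(1, 5)})

assignment, loads = assign_peers(peers, 3)
print(loads)
```

This says nothing about the harder parts mentioned above, i.e. generating
the per-instance config and getting all paths to the integrating bgpd.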
> I have also looked into the Euro-IX branch (and have successfully rebased
> some patches) which mitigates some of these scalability issues, but the
> differences from the main Q tree are so severe and unsplittable that I
> see no easy way for a clean merge process without significant
Yes. They went away and did lots of work, but basically forked.
There is a question as to whether it'd be better to thread bgpd as EuroIX
has done, or to just run slave bgpds as separate processes off a main
bgpd. One obstacle to the latter would be dealing with config & control.
I'm minded to the latter.
Paul Jakma paul at jakma.org @pjakma Key ID: 64A2FF6A
A visit to a fresh place will bring strange work.