[quagga-dev 10848] Conflict-resolution during BGP peer establishment

Vipin Kumar vipin at cumulusnetworks.com
Thu Oct 24 20:02:20 BST 2013


Hello,

During an effort to improve BGP bring up time, we came across the code that
deals with 'conflict resolution' as BGP from both ends of a peering starts
to open connections.

I tried my best and this is how I interpreted this code:

 1. Conflict resolution was not implemented completely as per the RFC.
      Because the code in quagga does a transfer from the realpeer(the peer
where BGP does connect) right away to the accepted peer, even if the
real-peer may not have gone through OpenSent and beyond. To be considered in
the conflict_resolution routine.

2. Then, because of the behavior caused by (1), looks like there was some
observed issue and a workaround fix as noted in the comment preceded by
'XXX: This is an awful problem'.

We are hitting this condition and it seems to delay the bring-up.

Given that, we were wondering if we should just fix (1), by letting the FSM
for the realpeer proceed until it goes to OpenConfirm and be considered
properly in the conflict resolution routine.

Before proceeding, we would like to know if there was an effort to fix (1)
as per the RFC ?
And if such efforts were blocked because of a serious code/design limitation
in quagga BGP. Code comment does hint a bit about how it may require a
significant change in the FSM.

If  you happen to have any history or insight about this, please share. So
we can avoid making a change that was tried and had hit some show-stopper in
the past.

Regards,
Vipin


bgpd/bgp_packet.c:

static int
bgp_open_receive (struct peer *peer, bgp_size_t size)
{

Š
Š

 /* Lookup peer from Open packet. */
  if (CHECK_FLAG (peer->sflags, PEER_STATUS_ACCEPT_PEER))
    {
      int as = 0;

      realpeer = peer_lookup_with_open (&peer->su, remote_as, &remote_id,
&as);

Š
Š

 /* When collision is detected and this peer is closed.  Retrun
     immidiately. */
  ret = bgp_collision_detect (peer, remote_id);
  if (ret < 0)
    return ret;

  /* Hack part. */
  if (CHECK_FLAG (peer->sflags, PEER_STATUS_ACCEPT_PEER))
    {
  if (realpeer->status == Established
      && CHECK_FLAG (realpeer->sflags, PEER_STATUS_NSF_MODE))
  {
    realpeer->last_reset = PEER_DOWN_NSF_CLOSE_SESSION;
    SET_FLAG (realpeer->sflags, PEER_STATUS_NSF_WAIT);
  }
  else if (ret == 0 && realpeer->status != Active
           && realpeer->status != OpenSent
     && realpeer->status != OpenConfirm
     && realpeer->status != Connect)
  {

    /* XXX: This is an awful problem..
     *
     * According to the RFC we should just let this connection (of the
     * accepted 'peer') continue on to Established if the other
     * connection (the 'realpeer' one) is in state Connect, and deal
     * with the more larval FSM as/when it gets far enough to receive
     * an Open. We don't do that though, we instead close the (more
     * developed) accepted connection.
     *
     * This means there's a race, which if hit, can loop:
     *
     *       FSM for A                        FSM for B
     *  realpeer     accept-peer       realpeer     accept-peer
     *
     *  Connect                        Connect
     *               Active
     *               OpenSent          OpenSent
     *               <arrive here,
     *               Notify, delete>
     *                                 Idle         Active
     *   OpenSent                                   OpenSent
     *                                              <arrive here,
     *                                              Notify, delete>
     *   Idle
     *   <wait>                        <wait>
     *   Connect                       Connect
     *
           *
     * If both sides are Quagga, they're almost certain to wait for
     * the same amount of time of course (which doesn't preclude other
     * implementations also waiting for same time). The race is
     * exacerbated by high-latency (in bgpd and/or the network).
     *
     * The reason we do this is because our FSM is tied to our peer
     * structure, which carries our configuration information, etc.
     * I.e. we can't let the accepted-peer FSM continue on as it is,
     * cause it's not associated with any actual peer configuration -
     * it's just a dummy.
     *
     * It's possible we could hack-fix this by just bgp_stop'ing the
     * realpeer and continueing on with the 'transfer FSM' below.
     * Ideally, we need to seperate FSMs from struct peer.
     *
     * Setting one side to passive avoids the race, as a workaround.
     */
    if (BGP_DEBUG (events, EVENTS))
      zlog_debug ("%s peer status is %s close connection",
      realpeer->host, LOOKUP (bgp_status_msg,
      realpeer->status));
    bgp_notify_send (peer, BGP_NOTIFY_CEASE,
         BGP_NOTIFY_CEASE_CONNECT_REJECT);

    return -1;
  }


      if (BGP_DEBUG (events, EVENTS))
  zlog_debug ("%s [Event] Transfer accept BGP peer to real (state %s)",
       peer->host,
       LOOKUP (bgp_status_msg, realpeer->status));

      bgp_stop (realpeer);

      /* Transfer file descriptor. */
      realpeer->fd = peer->fd;
      peer->fd = -1;

      /* Transfer input buffer. */
      stream_free (realpeer->ibuf);
      realpeer->ibuf = peer->ibuf;
      realpeer->packet_size = peer->packet_size;
      peer->ibuf = NULL;

      /* Transfer status. */
      realpeer->status = peer->status;
      bgp_stop (peer);

      /* peer pointer change. Open packet send to neighbor. */
      peer = realpeer;
      bgp_open_send (peer);
      if (peer->fd < 0)
  {
    zlog_err ("bgp_open_receive peer's fd is negative value %d",
        peer->fd);
    return -1;
  }
      BGP_READ_ON (peer->t_read, bgp_read, peer->fd);
    }



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.quagga.net/pipermail/quagga-dev/attachments/20131024/7eac336e/attachment-0001.html>


More information about the Quagga-dev mailing list