[quagga-dev 8508] Re: Node lock in bgp_table.c prevents proper node deletion

Chris Caputo ccaputo at alt.net
Tue Feb 8 20:03:27 GMT 2011


Hi Barry,

One test that I remember is to do a "no router bgp <asn>" followed by 
"router bgp <asn>" and a BGP config, and then repeating that.  Ie., tear 
down and setup, without restarting bgpd, should work.  Also, the "debug 
bgp" inspired exit report should show no leaks.

Thanks,
Chris

On Tue, 8 Feb 2011, Barry Friedman wrote:
> Hi Nick and Chris,
> 
> Thanks for your feedback, I'll send a follow-up email with a more
> detailed analysis. It seems like we could use a unit test for this
> module so maybe this is a good excuse to write and add one. I did do a
> fair bit of manual testing with BGP adding/updating and withdrawing
> routes from the table and checking the built-in memory accounting for
> errors and did not see any problems but I did not test with valgrind.
> It would be good to know how to create the error that motivated the
> addition of the lock. If I write a unit test I can exercise the code
> pretty well and then run it with valgrind and a debugging mem
> allocator like tcmalloc. That would be better than just manual testing
> with BGP updates.
> 
> A couple of other points:
> - The bgp table locks a node when it returns a pointer to the node to
> the caller. It is up to the caller to initiate unlocking of the node.
> In this case we have a node that is not returned to the caller so the
> addition of the node lock prevents proper operation of the table
> because it cannot be unlocked. This causes the problem I described and
> I verified that the lock removal restores proper operation.
> - If this lock is really needed, then it should also be needed in
> lib/table.c:route_node_get() which is widely used by other daemons. We
> should also test and check this module using valgrind etc.
> 
> Thanks,
> Barry
> 
> On Tue, Feb 8, 2011 at 6:30 AM, Nick Hilliard <nick at inex.ie> wrote:
> > On 08/02/2011 03:25, Chris Caputo wrote:
> >>
> >> Maybe some other things have been fixed since July '09 to alleviate the
> >> need for this bgp_lock_node() in bgp_node_get(), but I am pretty confident
> >> it was needed back then and nervous about its removal.
> >
> > I'd like to second that sentiment.  Removing structure locks without a
> > thorough description of exactly why the lock is unnecessary is likely to be
> > a bad idea.
> >
> > Nick


More information about the Quagga-dev mailing list