[quagga-dev 7555] Re: BGPd Crash

Chris Caputo ccaputo at alt.net
Wed Dec 16 20:22:29 GMT 2009


On Linux with gcc I add these flags to the end of my normal "configure" line 
for debug:

  --disable-shared --disable-pie --enable-gcc-rdynamic               \
  --with-cflags="-ggdb -fno-omit-frame-pointer -g -std=gnu99 -Wall   \
  -Wsign-compare -Wpointer-arith -Wbad-function-cast -Wwrite-strings \
  -Wmissing-prototypes -Wmissing-declarations -Wchar-subscripts      \
  -Wcast-qual"

Performance-wise, assuming a modern multi-gigahertz box, I think you'll be 
okay, but if you want to be cautious on this first try, include an "-O2" in 
that cflags line.

If running bgpd with the "-d" flag, you should find a core dump in "/" (root 
dir) if "ulimit -c unlimited" was run before daemon start.  Without the "-d" 
flag, I expect it will be in the directory from which you started the 
process.
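
As a quick sanity check before the next restart, something like the 
following can be run from the same shell (or rc script) that starts the 
daemon.  It is only a sketch: it assumes Linux, and the variable name 
core_limit is just for illustration.

```shell
#!/bin/sh
# Report the settings that decide whether and where a core file is written.
# Assumes Linux; the /proc entries may not exist on other systems.

# Limit inherited by any daemon started from this shell.
# "0" means no core file will be written; "unlimited" is what we want.
core_limit=$(ulimit -c)
echo "core size limit: $core_limit"

# Kernel settings that control the core file name.
for f in /proc/sys/kernel/core_pattern /proc/sys/kernel/core_uses_pid; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f")"
    else
        echo "$f: not readable"
    fi
done
```

If the limit prints as 0, no core will be written no matter which 
directory the daemon ends up in.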

Running "echo 1 >> /proc/sys/kernel/core_uses_pid" before daemon start 
prevents core file name collisions.  "file" on a core file should reveal 
the command line, providing a way to double-check which core it is.  Example:

  # file /core.9833
  /core.9833: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, 
  from '/usr/local/sbin/bgpd -d -A 127.0.0.1'
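
Once a core file exists, a backtrace can be pulled from it with gdb in 
batch mode.  The helper below is a hypothetical sketch (the function name 
core_bt is made up here); it assumes gdb is installed and that the 
executable passed in is the same build that produced the core.

```shell
#!/bin/sh
# Hypothetical helper: print a full backtrace from a core file.
# usage: core_bt <executable> <core-file>
core_bt() {
    exe=$1
    core=$2
    # Refuse to run gdb if either file is missing or unreadable.
    if [ ! -r "$exe" ] || [ ! -r "$core" ]; then
        echo "usage: core_bt <executable> <core-file>" >&2
        return 2
    fi
    # -batch exits after running the commands given with -ex;
    # "bt full" prints every frame along with its local variables.
    gdb -batch -ex "bt full" "$exe" "$core"
}
```

For the example above that would be 
"core_bt /usr/local/sbin/bgpd /core.9833".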

If you sent the existing executable, I haven't yet received it.  It might 
be too big, such that you'll need to post it somewhere.

Thanks!
Chris

On Wed, 16 Dec 2009, Richard J Palmer wrote:
> Thanks for this
> 
> I can't see a core file unfortunately (I'm assuming it would be in the 
> same directory as bgpd itself?)
> 
> Happy to re-compile with debug, but to ensure I get what you need, can you 
> confirm what flags you want set up at compile time? Also, would this have 
> any implications on a reasonably busy quagga box?
> 
> Happy to email the executable and anything you need from when I built 
> the executable (I still have the build folder). I will email this 
> directly to you in a second.
> 
> 
> 
> Chris Caputo wrote:
> > Hey Richard,
> >
> > I am sorry we haven't figured this one out yet, especially since you have
> > been hitting it periodically since August.
> >
> > Do you have a core file dump for bgpd for this by any chance?
> >
> > If yes, I am hoping to see a better backtrace possibly from gdb.
> >
> > If no, can you re-compile with debugging enabled and enable core dumps for
> > the next time you restart?  As an example, I include the following in my
> > quagga rc file to do this:
> >
> >         # enable core dumps
> >         ulimit -c unlimited
> >         echo 1 >> /proc/sys/kernel/core_uses_pid
> >         [...]
> >         # bgpd daemon start...
> >
> > I've noticed bgp_read+0xa72 has consistently been in the backtrace, but I'd
> > like to know what the call path from there to bgp_unlock_node is.  If you'd
> > like to email me your current bgpd executable, I can try to see what I can
> > figure out from the symbol table.
> >
> > Also, any chance you have the disk space to enable full BGP debug logging?
> > I understand it can take weeks to hit this bug, but the verbose debug log
> > just prior to this happening could prove useful.
> >
> > As an aside, I am not sure if this is the result of my big July patch
> > screwing something up, or whether the addition of the above assert(), in
> > that same July patch, has just highlighted an uncorrected problem.
> > Hopefully the latter, of course.  :-)
> >
> > Thanks,
> > Chris
> > _______________________________________________
> > Quagga-dev mailing list
> > Quagga-dev at lists.quagga.net
> > http://lists.quagga.net/mailman/listinfo/quagga-dev
> >   
> 
> 


