[plug] crashing server
James Devenish
devenish at guild.uwa.edu.au
Tue Sep 14 20:28:09 WST 2004
In message <16710.57347.180888.888436 at pride.nsw.cmis.CSIRO.AU>
on Tue, Sep 14, 2004 at 10:11:47PM +1000, Rob Dunne wrote:
> the machine has to be rebooted.
Good grief. One possible difficulty is that capturing kernel core dumps
after rebooting PC-type machines has apparently been unsupported in the
past (according to previous discussions on this list). I don't know
whether Linux supports a crash-time debugger either, and I'm not sure
what kernel developers do. By the way: which version of Linux (i.e.
which kernel) are you using? How do you know the machine has to be
rebooted? Does it spit a panic message to the console, lock up
completely, or spontaneously reboot?
If you are using 64-bit hardware, perhaps the programme is triggering an
obscure kernel bug that only manifests on 64-bit hardware.
> > > free((void *)b);
A segfault on the line above strongly suggests that b is invalid. If b
is also used in the mutex code, as you mentioned earlier, and the kernel
fails to validate b, perhaps that is what brings the machine down (i.e.
the crash needs both bugs together). Try fixing b and see whether the
crash stops. If the kernel is corrupting b, you may have to resort to
tracking the value of b through the code (either with a debugger or
with in-code methods).
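
As a rough idea of what I mean by "in-code methods", here is a minimal
sketch. The macro names TRACE_PTR and TRACED_FREE and the function
trace_demo are made up for illustration, and the malloc(64) merely
stands in for however b is really allocated in your programme. It
prints the value of b with file and line wherever you drop the macro,
so you can see where the value stops matching what malloc returned.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print a pointer's name and value with the call site. */
#define TRACE_PTR(p) \
    fprintf(stderr, "%s:%d: %s = %p\n", __FILE__, __LINE__, #p, (void *)(p))

/* Hypothetical helper: trace, free, then null the pointer, so that a
 * second TRACED_FREE on the same variable becomes a harmless free(NULL). */
#define TRACED_FREE(p) \
    do { TRACE_PTR(p); free((void *)(p)); (p) = NULL; } while (0)

/* Assumed stand-in for the real code path.
 * Returns 1 if b is NULL after the traced free, 0 otherwise. */
int trace_demo(void)
{
    char *b = malloc(64);   /* placeholder for the real allocation */
    if (b == NULL)
        return 0;

    TRACE_PTR(b);           /* value straight after allocation */

    /* ... the mutex code, or anything else that touches b, goes here ... */

    TRACED_FREE(b);         /* value just before free; should be unchanged */
    return b == NULL;
}
```

If the two values printed for b differ, something between the two trace
points has overwritten it, and you can add more TRACE_PTR(b) lines to
narrow down where.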