[plug] re: server freezing

Craig Ringer craig at postnewspapers.com.au
Wed Jul 9 19:36:31 WST 2003


> [root at gfpmsql root]# /sbin/lspci 
> 00:00.0 Host bridge: ServerWorks: Unknown device 0012 (rev 13)

Try updating your pci.ids database from pciids.sourceforge.net . It 
doesn't really change much, but it makes lspci output much easier to 
work with and identify.
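
For the curious, the lookup lspci does against that database can be sketched roughly as below. This is only an illustration, assuming the usual pci.ids layout (vendor lines at column 0, device lines indented by one tab, subsystem lines by two tabs, ID and name separated by two spaces). The function names and the sample entries are made up for the example; they are not real pci.ids data.

```python
def parse_pci_ids(text):
    """Build vendor and device name tables from pci.ids-style text."""
    vendors = {}
    devices = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("C "):
            break  # device-class section follows the vendor list
        if line.startswith("\t\t"):
            continue  # subsystem entry, not needed here
        if line.startswith("\t"):
            if current is not None:
                dev_id, _, name = line.strip().partition("  ")
                devices[(current, dev_id.lower())] = name.strip()
        else:
            ven_id, _, name = line.partition("  ")
            current = ven_id.lower()
            vendors[current] = name.strip()
    return vendors, devices


def describe(vendors, devices, vendor_id, device_id):
    """Name a device, falling back to 'Unknown device' as lspci does."""
    vendor = vendors.get(vendor_id, "Unknown vendor " + vendor_id)
    device = devices.get((vendor_id, device_id), "Unknown device " + device_id)
    return vendor + ": " + device


# Made-up excerpt in pci.ids layout (hypothetical IDs and names).
SAMPLE_IDS = (
    "# example only\n"
    "abcd  Example Vendor\n"
    "\t0012  Example Host Bridge\n"
    "\t\tabcd 0001  Example Subsystem\n"
    "C 00  Unclassified device\n"
)

vendors, devices = parse_pci_ids(SAMPLE_IDS)
print(describe(vendors, devices, "abcd", "0012"))
print(describe(vendors, devices, "abcd", "0099"))
```

An ID missing from the table falls through to the "Unknown device" text, which is why an out-of-date database gives output like the line quoted above.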

> [root at gfpmsql root]# cat /proc/interrupts
>
>            CPU0       CPU1       CPU2       CPU3
>   0:   16506895   16507517   16507378   16507241    IO-APIC-edge  timer
>   1:          1          1          1          1    IO-APIC-edge  keyboard
>   2:          0          0          0          0          XT-PIC  cascade
>   8:          1          0          0          0    IO-APIC-edge  rtc
>  11:          0          0          0          0   IO-APIC-level  usb-ohci
>  12:         12          8          8         13    IO-APIC-edge  PS/2 Mouse
>  14:          1          0          0          1    IO-APIC-edge  ide0
>  18:       8502       8519       8504       8532   IO-APIC-level  eth0
>  20:          4          4          4          4   IO-APIC-level  aic7xxx
>  22:       5211       5230       5251       5209   IO-APIC-level  ips
>  29:     332445     332628     332689     332663   IO-APIC-level  eth1
> NMI:          0          0          0          0
> LOC:   66029161   66029191   66029191   66029191
> ERR:          0
> MIS:          0

All looks reasonable. I can tell right off that you don't use your 
onboard SCSI or ATA interfaces :-) More importantly, nothing is trying 
to share an interrupt or anything silly like that.
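
If you want to make the same check mechanically, something like the following would do. It is a rough sketch, assuming the 2.4-style layout shown above (one count column per CPU, a single-word controller type, then device names, with devices sharing a line separated by commas). The function names and the idle threshold are my own choices for illustration, and newer kernels lay this file out differently.

```python
def parse_interrupts(text):
    """Map each /proc/interrupts label to (total count, [device names])."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        label = label.strip()
        fields = rest.split()
        counts = []
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        devices = []
        # Numbered IRQs carry a controller type followed by device names.
        if label.isdigit() and len(fields) > 1:
            devices = [d.strip() for d in " ".join(fields[1:]).split(",")]
        table[label] = (sum(counts), devices)
    return table


def shared_irqs(table):
    """IRQ lines with more than one device attached."""
    return {irq: devs for irq, (_, devs) in table.items() if len(devs) > 1}


def idle_devices(table, threshold):
    """Devices whose IRQ has fired at most `threshold` times in total."""
    return [d for _, (total, devs) in table.items()
            for d in devs if total <= threshold]


# A few rows copied from the output quoted above.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 14:          1          0          0          1    IO-APIC-edge  ide0
 18:       8502       8519       8504       8532   IO-APIC-level  eth0
 20:          4          4          4          4   IO-APIC-level  aic7xxx
NMI:          0          0          0          0
ERR:          0
"""

table = parse_interrupts(SAMPLE)
print(shared_irqs(table))       # {} : nothing shares a line here
print(idle_devices(table, 20))  # ['ide0', 'aic7xxx']
```

The near-zero counts for ide0 and aic7xxx are what gave away the unused onboard controllers.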

> [root at gfpmsql root]# uname -a
>
> Linux gfpmsql 2.4.18-14smp #1 SMP Wed Sep 4 12:34:47 EDT 2002 i686 i686 i386 GNU/Linux

Try updating to 2.4.20, especially as you have a fairly recent chipset 
(newer than your kernel and distro at least). Sometimes chipset-specific 
tweaks and fixes get in. It shouldn't be needed, but sometimes it is.
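
If you have several boxes to check, comparing the running release against 2.4.20 is easy to script. A minimal sketch, assuming the release string starts with three dot-separated numbers as in the uname output above; the helper names are hypothetical.

```python
import re


def kernel_version(release):
    """Extract (major, minor, patch) from a release such as '2.4.18-14smp'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: " + release)
    return tuple(int(part) for part in m.groups())


def older_than(release, minimum):
    """True if `release` is an earlier kernel version than `minimum`."""
    return kernel_version(release) < kernel_version(minimum)


print(older_than("2.4.18-14smp", "2.4.20"))  # True
```

On a live machine the release string could come from `platform.release()` in the standard library. Note that this compares only the upstream version number, so it says nothing about fixes a vendor may have backported into its own kernel package.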

> While gathering this information the server froze, no errors nor
> messages.

That's just damn strange. As I said before, you should probably look at 
talking to its IPMI interface to ask it if it thinks there's anything 
wrong. Unfortunately, I think this requires kernel patches to 2.4.x.

The only other thing I can suggest now is to try running on a 
uniprocessor kernel, just to see. It's remotely possible that you're 
encountering a problem with a driver that has locking problems or 
something similar. Whether that's practical depends, really, on whether 
the server can take the workload on only one processor with 
HyperThreading disabled.

Craig Ringer



