[plug] Dying Samba Server

Tomasz Grzegurzko tomasz89 at gmail.com
Thu Nov 2 16:04:41 WST 2006


On 11/2/06, Daniel Pearson (Flashware Solutions) <daniel at flashware.net> wrote:
> Daniel Pearson (Flashware Solutions) wrote:
> > Tomasz Grzegurzko wrote:
> >> On 11/2/06, Daniel Pearson (Flashware Solutions)
> >> <daniel at flashware.net> wrote:
> >>>
> >>>  Ok, so I've got a box running Ubuntu Server with kernel
> >>> 2.6.15-23-amd64 (that I'm using as a Samba DC/FS), and in recent
> >>> weeks it seems to have just completely halted at least once a week.
> >>>
> >>>  I've had a look through /var/log/messages kern.log and syslog and
> >>> can't
> >>> seem to find any 'error' messages in there.. where else should I be
> >>> looking?
> >>>
> >>>  TIA; Dan
> >>>
> >> Is there anything on the console when it "goes"? Kernel panics, HDD
> >> read errors, anything like that? I've found such `hard' lockups are
> >> usually the result of hardware failures but narrowing them down may
> >> require a peek at the console to verify if anything happened before
> >> that freeze.
> > Those were my thoughts, too. There's no monitor attached to it, and
> > I'm never anywhere near it when it does; it sits at a remote site.
> >
> > Is there a way to cat * in /var/log and search for 'error' and/or 'hda' ?
> >
> Ok, a grep of messages.0 for 'hda' shows..
>
>
> Oct 23 11:53:51 flashware-svr01 kernel: [1009401.362845] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009421.432845] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.424727] hda: DMA
> timeout error
> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.434985] hda: dma
> timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.499423] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.637266] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009451.708247] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.700129] hda: DMA
> timeout error
> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.710460] hda: dma
> timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.776676] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.911689] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 14:56:58 flashware-svr01 kernel: [    0.000000] Bootdata ok
> (command line is root=/dev/hda1 ro quiet splash)
> Oct 23 14:56:58 flashware-svr01 kernel: [    0.000000] Kernel command
> line: root=/dev/hda1 ro quiet splash
> Oct 23 14:56:58 flashware-svr01 kernel: [   13.774740]     ide0: BM-DMA
> at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
> Oct 23 14:56:58 flashware-svr01 kernel: [   14.226099] hda: ST3200827A,
> ATA DISK drive
> Oct 23 14:56:58 flashware-svr01 kernel: [   16.605512] hda: max request
> size: 1024KiB
> Oct 23 14:56:58 flashware-svr01 kernel: [   16.651328] hda: 390721968
> sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, UDMA(100)
> Oct 23 14:56:58 flashware-svr01 kernel: [   16.675675] hda: cache
> flushes supported
> Oct 23 14:56:58 flashware-svr01 kernel: [   16.675732]  hda: hda1 hda2 <
> hda5 >
> Oct 23 14:56:58 flashware-svr01 kernel: [   19.853766] EXT3-fs: hda1:
> orphan cleanup on readonly fs
> Oct 23 14:56:58 flashware-svr01 kernel: [   19.889744] EXT3-fs: hda1: 2
> orphan inodes deleted
> Oct 23 14:56:58 flashware-svr01 kernel: [   26.848552] Adding 2988048k
> swap on /dev/hda5.  Priority:-1 extents:1 across:2988048k
> Oct 23 14:56:58 flashware-svr01 kernel: [   26.993837] EXT3 FS on hda1,
> internal journal
> Oct 23 20:35:23 flashware-svr01 kernel: [20429.780772] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 20:35:34 flashware-svr01 kernel: [20439.772416] hda: DMA timeout
> error
> Oct 23 20:35:34 flashware-svr01 kernel: [20439.778841] hda: dma timeout
> error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 20:35:34 flashware-svr01 kernel: [20439.824050] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 20:35:34 flashware-svr01 kernel: [20439.949375] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 20:35:54 flashware-svr01 kernel: [20460.005494] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 20:36:04 flashware-svr01 kernel: [20469.997137] hda: DMA timeout
> error
> Oct 23 20:36:04 flashware-svr01 kernel: [20470.006199] hda: dma timeout
> error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 20:36:04 flashware-svr01 kernel: [20470.069646] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 20:36:04 flashware-svr01 kernel: [20470.203815] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 20:36:24 flashware-svr01 kernel: [20490.270181] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 20:36:34 flashware-svr01 kernel: [20500.261823] hda: DMA timeout
> error
> Oct 23 20:36:34 flashware-svr01 kernel: [20500.272127] hda: dma timeout
> error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 20:36:34 flashware-svr01 kernel: [20500.337236] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 20:36:34 flashware-svr01 kernel: [20500.478237] hda: status
> timeout: status=0xd0 { Busy }
> Oct 23 20:36:54 flashware-svr01 kernel: [20520.544860] hda:
> dma_timer_expiry: dma status == 0x21
> Oct 23 20:37:05 flashware-svr01 kernel: [20530.536504] hda: DMA timeout
> error
> Oct 23 20:37:05 flashware-svr01 kernel: [20530.546891] hda: dma timeout
> error: status=0x58 { DriveReady SeekComplete DataRequest }
> Oct 23 20:37:05 flashware-svr01 kernel: [20530.609751] hda: status
> error: status=0x50 { DriveReady SeekComplete }
> Oct 23 20:37:05 flashware-svr01 kernel: [20530.742668] hda: status
> timeout: status=0xd0 { Busy }
> Oct 26 19:51:25 flashware-svr01 kernel: [276774.505573] hda: status
> timeout: status=0x80 { Busy }


That looks like either a bad HDD or bad memory. I say that because
disk I/O works like this: HDD -> memory, then memory -> CPU. So if the
RAM is bad, it can *look* like HDD errors. That said, these log
messages are quite telltale: more than likely it is an HDD problem.
You could try disabling DMA (# hdparm -d 0 /dev/hda) and see if that
helps.
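
To judge whether disabling DMA actually helps, one option is to count
the hda timeout/error lines per day before and after the change. The
following is only a rough sketch: the helper name hda_errors_per_day
is made up for illustration, and it assumes syslog-style lines like
the ones quoted above, where the first two fields are month and day.

```shell
#!/bin/sh
# Count hda timeout/error lines per day in the given syslog-style files.
# NOTE: hda_errors_per_day is a hypothetical helper name, not a standard tool.
hda_errors_per_day() {
    # -h: suppress file names so the date stays the first field
    # -i: match "DMA timeout" and "dma timeout" alike
    # -E: extended regex, so (timeout|error) works as an alternation
    grep -h -i -E 'hda:.*(timeout|error)' "$@" |
        # Key each matching line by "Month Day" and tally per key
        awk '{ count[$1 " " $2]++ } END { for (d in count) print d, count[d] }' |
        sort
}
```

Called as "hda_errors_per_day /var/log/messages /var/log/messages.0",
it prints one line per day with the number of matching messages. If
the counts drop to zero after "hdparm -d 0 /dev/hda", that points at
the DMA path (drive, cable or controller) rather than at the RAM.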

See how you go.
Tomasz


