[plug] Dying Samba Server

Daniel Pearson (Flashware Solutions) daniel at flashware.net
Thu Nov 2 16:09:00 WST 2006


Tomasz Grzegurzko wrote:
> On 11/2/06, Daniel Pearson (Flashware Solutions) 
> <daniel at flashware.net> wrote:
>> Daniel Pearson (Flashware Solutions) wrote:
>> > Tomasz Grzegurzko wrote:
>> >> On 11/2/06, Daniel Pearson (Flashware Solutions)
>> >> <daniel at flashware.net> wrote:
>> >>>
>> >>>  Ok, so I've got a box running Ubuntu Server with kernel
>> >>> 2.6.15-23-amd64 (that I'm using as a Samba DC/FS), and in recent
>> >>> weeks it seems to have just completely halted at least once a week.
>> >>>
>> >>>  I've had a look through /var/log/messages, kern.log and syslog and
>> >>> can't seem to find any 'error' messages in there. Where else should
>> >>> I be looking?
>> >>>
>> >>>  TIA; Dan
>> >>>
>> >> Is there anything on the console when it "goes"? Kernel panics, HDD
>> >> read errors, anything like that? I've found such `hard' lockups are
>> >> usually the result of hardware failures but narrowing them down may
>> >> require a peek at the console to verify if anything happened before
>> >> that freeze.
>> > Those were my thoughts, too. There's no monitor attached to it, and
>> > I'm never anywhere near it when it does; it sits remotely.
>> >
>> > Is there a way to cat * in /var/log and search for 'error' and/or
>> > 'hda'?
>> >
>> Ok, a grep of messages.0 for 'hda' shows..
>>
>>
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009401.362845] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009421.432845] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.424727] hda: DMA timeout error
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.434985] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.499423] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009431.637266] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009451.708247] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.700129] hda: DMA timeout error
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.710460] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.776676] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 11:53:51 flashware-svr01 kernel: [1009461.911689] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 14:56:58 flashware-svr01 kernel: [    0.000000] Bootdata ok (command line is root=/dev/hda1 ro quiet splash)
>> Oct 23 14:56:58 flashware-svr01 kernel: [    0.000000] Kernel command line: root=/dev/hda1 ro quiet splash
>> Oct 23 14:56:58 flashware-svr01 kernel: [   13.774740]     ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
>> Oct 23 14:56:58 flashware-svr01 kernel: [   14.226099] hda: ST3200827A, ATA DISK drive
>> Oct 23 14:56:58 flashware-svr01 kernel: [   16.605512] hda: max request size: 1024KiB
>> Oct 23 14:56:58 flashware-svr01 kernel: [   16.651328] hda: 390721968 sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, UDMA(100)
>> Oct 23 14:56:58 flashware-svr01 kernel: [   16.675675] hda: cache flushes supported
>> Oct 23 14:56:58 flashware-svr01 kernel: [   16.675732]  hda: hda1 hda2 < hda5 >
>> Oct 23 14:56:58 flashware-svr01 kernel: [   19.853766] EXT3-fs: hda1: orphan cleanup on readonly fs
>> Oct 23 14:56:58 flashware-svr01 kernel: [   19.889744] EXT3-fs: hda1: 2 orphan inodes deleted
>> Oct 23 14:56:58 flashware-svr01 kernel: [   26.848552] Adding 2988048k swap on /dev/hda5.  Priority:-1 extents:1 across:2988048k
>> Oct 23 14:56:58 flashware-svr01 kernel: [   26.993837] EXT3 FS on hda1, internal journal
>> Oct 23 20:35:23 flashware-svr01 kernel: [20429.780772] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 20:35:34 flashware-svr01 kernel: [20439.772416] hda: DMA timeout error
>> Oct 23 20:35:34 flashware-svr01 kernel: [20439.778841] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 20:35:34 flashware-svr01 kernel: [20439.824050] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 20:35:34 flashware-svr01 kernel: [20439.949375] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 20:35:54 flashware-svr01 kernel: [20460.005494] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 20:36:04 flashware-svr01 kernel: [20469.997137] hda: DMA timeout error
>> Oct 23 20:36:04 flashware-svr01 kernel: [20470.006199] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 20:36:04 flashware-svr01 kernel: [20470.069646] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 20:36:04 flashware-svr01 kernel: [20470.203815] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 20:36:24 flashware-svr01 kernel: [20490.270181] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 20:36:34 flashware-svr01 kernel: [20500.261823] hda: DMA timeout error
>> Oct 23 20:36:34 flashware-svr01 kernel: [20500.272127] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 20:36:34 flashware-svr01 kernel: [20500.337236] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 20:36:34 flashware-svr01 kernel: [20500.478237] hda: status timeout: status=0xd0 { Busy }
>> Oct 23 20:36:54 flashware-svr01 kernel: [20520.544860] hda: dma_timer_expiry: dma status == 0x21
>> Oct 23 20:37:05 flashware-svr01 kernel: [20530.536504] hda: DMA timeout error
>> Oct 23 20:37:05 flashware-svr01 kernel: [20530.546891] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
>> Oct 23 20:37:05 flashware-svr01 kernel: [20530.609751] hda: status error: status=0x50 { DriveReady SeekComplete }
>> Oct 23 20:37:05 flashware-svr01 kernel: [20530.742668] hda: status timeout: status=0xd0 { Busy }
>> Oct 26 19:51:25 flashware-svr01 kernel: [276774.505573] hda: status timeout: status=0x80 { Busy }
>
>
> That is either a bad HDD or bad memory. The reason I say that is
> because I/O operations work like this: HDD->memory, memory->CPU etc.
> So if the RAM is bad, it can *look* like HDD errors. These messages
> are quite telltale, though, so more than likely it is an HDD problem.
> You could try disabling DMA (# hdparm -d 0 /dev/hda) and see if that
> helps.
>
> See how you go.
> Tomasz
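
On the earlier question about searching everything in /var/log for
'error' and/or 'hda': a rough sketch, assuming GNU grep and that the
logs of interest are uncompressed text. The function name search_logs
and its default arguments are made up for illustration; rotated logs
that have been gzip-compressed would need zgrep instead.

```shell
#!/bin/sh
# search_logs is a hypothetical helper, not part of any package.
# Usage: search_logs [LOGDIR] [DRIVE]
search_logs() {
    logdir=${1:-/var/log}   # assumed default log directory
    drive=${2:-hda}         # assumed default drive name
    # -r: recurse into subdirectories
    # -i: ignore case, so "Error" and "error" both match
    # -I: skip binary files (wtmp, lastlog and friends)
    # -E: extended regex; match lines containing "error" OR the drive name
    grep -riIE "error|${drive}" "$logdir" 2>/dev/null
}
```

Run as root, `search_logs /var/log hda` should print every matching line
prefixed with the name of the file it came from. Expect a fair bit of
noise, since boot-time probe lines mention the drive as well.
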
root@flashware-svr01:/var/log# hdparm -d 0 /dev/hda

/dev/hda:
 setting using_dma to 0 (off)
 using_dma    =  0 (off)


Done. I'll keep you updated as more info comes to hand!
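
To judge whether turning DMA off makes any difference, one option is to
tally the hda error lines per day and compare before and after. This is
only a sketch: the function name count_hda_errors_by_day is invented
here, and the pattern assumes syslog-style kernel lines like the ones
quoted above.

```shell
#!/bin/sh
# count_hda_errors_by_day is a hypothetical helper for illustration.
# Usage: count_hda_errors_by_day LOGFILE [DRIVE]
count_hda_errors_by_day() {
    logfile=$1
    drive=${2:-hda}   # assumed default drive name
    # Keep only kernel lines for the drive that mention an error, a
    # timeout or a DMA timer expiry; boot-time probe lines are skipped.
    grep -hE "kernel:.* ${drive}: .*(error|timeout|dma_timer_expiry)" "$logfile" |
        # Fields 1 and 2 of a syslog line are the month and the day.
        awk '{ count[$1 " " $2]++ } END { for (d in count) print d, count[d] }' |
        # Lexical sort for stable output (not true calendar order).
        sort
}
```

For example, `count_hda_errors_by_day /var/log/messages.0` prints one
"month day count" line per day that logged such errors.
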

Cheers; Dan


