[plug] Samba, delayed write failed, and can't become connected user

sothisistheinternet sothisistheinternet at gmail.com
Tue May 5 19:38:55 WST 2009


Hi again everyone,

I've finished my hardware diagnostics, which included testing all hard
drives using the manufacturers' diagnostic utilities, checking (and
repairing as needed) file systems, and replacing the NIC, cables,
router, and switch. I've also reinstalled FC10 and moved to the
updates-testing version of Samba from the FC10 repositories. The
problem remains. However, careful observation shows that, while it can
still occur apparently at random during file transfers, a few
particular symptoms are making me wonder what is causing the Samba
server to 'trip'.

Here are the symptoms:

- When moving multiple large files (3 or more), or moving large files
to different directories using several cut-paste operations, in Windows
XP on mapped drives pointing to Samba shares, the Samba server will
almost ALWAYS crash, resulting in Windows 'delayed write failed' errors.
This is the most reproducible of the errors.
   - When such a failure occurs, ALL systems connected to ALL samba
shares are cut off.
   - When samba is restored (see below for more on this process) the
files have apparently finished copying, regardless of how far they
were into the copy process. They were not deleted from their original
directories though.

- When using Windows XP to cut and paste large files (greater than
100 MB) from one Samba share drive to another (or at least to another
partition on the same drive), there is a better than average chance of
a Samba failure with 'delayed write failed' errors.
   - When such a failure occurs, ALL systems connected to ALL samba
shares are cut off.
   - When samba is restored (see below for more on this process) the
files have apparently finished copying, regardless of how far they
were into the copy process. They were not deleted from their original
directories though.

- When using file sharing software where the incoming and outgoing
share folders are located on the Samba server (in different
partitions/drives for different outgoing shares) and are mapped drives
in Windows XP, the Samba server will trip, resulting in 'delayed write
failed' errors. This does not always happen quickly: sometimes more
than 24 hours can pass.

- There are two ways to restore the connections:
   - Often, just pinging one of the client PCs from the FC10 Samba
server is enough to restore all client PCs' access to the Samba shares.
   - sudo service smb restart always restores Samba service to the
client PCs.
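
One way to narrow this down is to log exactly when the server stops
accepting SMB connections, as seen from a client. The following is a
minimal Python sketch and not something from the setup described here.
It assumes smbd listens on TCP port 445 (port 139 may apply instead),
and the function names smb_port_open and watch are hypothetical. If the
port shows as DOWN while the smbd processes are still running on the
server, that would point at the network layer rather than at Samba
itself.

```python
import socket
import time
from datetime import datetime


def smb_port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def watch(host, port=445, interval=5.0, checks=None):
    """Print a timestamped line each time reachability changes.

    checks=None polls until interrupted; an integer limits the number
    of polls. Returns the last observed state.
    """
    last = None
    done = 0
    while checks is None or done < checks:
        state = smb_port_open(host, port)
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {'up' if state else 'DOWN'}")
            last = state
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)
    return last
```

Calling watch("192.168.1.10") on a client (the address is a placeholder
for the Samba server) while repeating the large cut-paste operations
would give timestamps to compare against the Samba logs on the server.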

I'm wondering whether this is not a Samba issue at all, but a disk or
file system management issue. All file systems are ext3 (mkfs.ext3 -m 1
-j /dev/sdc1). I'm particularly suspicious because the Samba service
does not always need to be restarted to restore it (see the ping
restore symptom above) and, more importantly, because low levels of
hard drive access and Windows-directed transfers between Samba shares
on different drives do not appear to be capable of causing the issue.

So, where should I report the bug? Samba.org? Fedora?

TIA,

Ari

On Thu, Apr 23, 2009 at 8:01 AM, sothisistheinternet
<sothisistheinternet at gmail.com> wrote:
> On Thu, Apr 23, 2009 at 7:59 AM, sothisistheinternet
> <sothisistheinternet at gmail.com> wrote:
>> On Wed, Apr 22, 2009 at 7:36 PM, sothisistheinternet
>> <sothisistheinternet at gmail.com> wrote:
>>> On Tue, Apr 21, 2009 at 7:17 PM, sothisistheinternet
>>> <sothisistheinternet at gmail.com> wrote:
>>>> Thanks Daniel,
>>>>
>>>> I've done a bunch of troubleshooting and have been able to rule the
>>>> switch out, but perhaps not the onboard NIC:
>>>
>>> Bought a PCI NIC today. As of now, there's a new NIC and new CAT5
>>> cable and the samba server is connected directly to a router ethernet
>>> port, not the switch, as is the PC that's been having the most issues.
>>>
>>> Ari
>>>
>>
>> Happened again. Bug report time. I'm ruing this OS update :(
>
> PS - ping OUT to the windows client from the FC10 samba server results
> in everything working again.
>
>>
>> Ari
>>
>>
>>>>
>>>> I swapped in a different switch and cables - no effect
>>>> I used the manufacturer's diagnostics and checked the filesystems on
>>>> the drives - nothing wrong
>>>> Restarting smb fixes the issue - until then, no computers can connect
>>>> Problem began just after installing FC10 after having been using FC5 for ages
>>>>
>>>> Any other ideas would be very much appreciated. It's driving me nuts
>>>> and causing problems with my downloads.
>>>>
>>>> Ari
>>>>
>>>> On Tue, Apr 21, 2009 at 6:34 PM, Daniel Foote <freefoote at gmail.com> wrote:
>>>>> Hello.
>>>>>
>>>>>> Anyone? I made the FC10 system the master browser and thought that had
>>>>>> corrected the problem, thinking the problem was being caused by
>>>>>> elections. It was good for about 36 hours, then during the night two
>>>>>> more episodes of 'delayed write failed' during PTP file transfers,
>>>>>> each requiring a sudo service smb restart to get samba to let other
>>>>>> computers on the network access it again.
>>>>>
>>>>> The last time I saw something like this, it was caused by a gigabit
>>>>> network card (windows client machine) on a 100mbit switch, connected
>>>>> to a Linux server (samba) with a 100mbit network card. Something
>>>>> related to the gigabit card, the cabling, or the switch caused it to
>>>>> 'micro-dropout' (I estimate 10ms or less) - just long enough to
>>>>> disconnect the Windows client from the samba server. In this case, it
>>>>> caused this Windows machine to emit the 'delayed write failed'
>>>>> message, and also released locks that the machine had on certain
>>>>> shared files, causing other issues.
>>>>>
>>>>> Not sure if that will help in your case... but something to consider.
>>>>>
>>>>> Daniel Foote.
>>>>> _______________________________________________
>>>>> PLUG discussion list: plug at plug.org.au
>>>>> http://www.plug.org.au/mailman/listinfo/plug
>>>>> Committee e-mail: committee at plug.linux.org.au
>>>>>
>>>>
>>>
>>
>


