[plug] evil performance on large directories.

Bernd Felsche bernie at innovative.iinet.net.au
Tue Apr 5 20:41:53 WST 2005


Shayne O'Neill <shayne at guild.murdoch.edu.au> writes:

>The server I run at the murdoch guild has a couple of monster directories.
>The shared student drive where our lovely student reps have promptly
>dumped 10 million cubic tonnes of debris all into the root, and the www
>directory (ugh. webdesigners... We now have a cms that puts an end to
>that, but the debris is all there still)

>Regardless, any operations that require a listing seem to take ages. ls
>can pause for up to two minutes. connecting via appletalk takes ages, and
>the macs which habitually ask for full directory listings over and over
>again can really get stuck on this. But I don't think it's an appletalk
>problem

>The drive runs ext3 with a journal, 7200rpm, snappy etc.

Reiserfs does much better with large directories.

>The box is a debian woody with 'bits of sarge and backports mashed in'
>running a 2.6.8-2-k7 kernel.

>Any idea what might be the cause of this? It really causes a lot of
>problems.

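ext3 doesn't like lots of files in a directory. It has a "classical"
idea of how filesystems should be managed: unless hashed directory
indexes (dir_index) are enabled, a directory is a linear list of
entries, so work on it grows with the number of files.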

Reiserfs has always been a database pretending to be a filesystem.
With version 4, that's even more evident because "arbitrary"
meta-information can be attached to each file. The meta-information
is maintained by the filesystem, so for instance, one could have a
news spool filesystem with different "views" on the same article;
and meta-information could relate to article headers; akin to news
overview.

That makes management of large numbers of files of a particular type
especially interesting as most of the management of the news spool
would be handled transparently by the filesystem. To post for
example, one could simply copy the article onto the filesystem and
it would "magically" appear in the right places.  News overview
information would be visible locally by simply looking at a
particular type of meta-information.

But I digress...

ext3 takes so long because ls has to read the entire directory, which
may be horribly fragmented and suffer from indirection, and then sort
the list before it prints anything. "ls -f" skips the sort and should
list names as they are encountered.
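
To make the difference concrete, here is a rough Python sketch of the
two behaviours. The function names list_unsorted and list_sorted are
made up for illustration; os.scandir and os.listdir are standard
library calls. It only mimics what ls does and is not how ls itself is
implemented.

```python
import os


def list_unsorted(path):
    """Yield names in whatever order the filesystem returns them.

    Roughly what "ls -f" does: no sorting, so output can start
    before the whole directory has been read. Unlike "ls -f",
    os.scandir does not report "." and "..".
    """
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def list_sorted(path):
    """Return all names sorted, roughly what plain "ls" does.

    Every entry must be read into memory before the first name
    can be printed, which is what hurts on huge directories.
    """
    return sorted(os.listdir(path))
```

Looping over list_unsorted("/some/big/dir") (a placeholder path) starts
producing names straight away, whereas list_sorted has to finish
reading the directory first.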
-- 
/"\ Bernd Felsche - Innovative Reckoning, Perth, Western Australia
\ /  ASCII ribbon campaign | I'm a .signature virus!
 X   against HTML mail     | Copy me into your ~/.signature
/ \  and postings          | to help me spread!



