[plug] evil performance on large directories. [RANT]

Cameron Patrick cameron at patrick.wattle.id.au
Tue Apr 5 21:27:33 WST 2005


Bernd Felsche wrote:

> ext3 takes so long because ls has to read the entire directory which
> may be horribly fragmented and suffer from indirection; and then sort
> the list. "ls -f" should list names as they are encountered.

Hmm.  Without knowing anything about the internals of either
filesystem, judging purely on the performance that I've seen on
moderately large directories (tens of thousands of files), I believe
that this is the case with reiserfs too.  In fact I'd appreciate it if
someone who has some clue about how filesystems _actually_ work could
corroborate what I've written below or LART me for it if I'm wrong...

Just getting a directory listing can be fairly fast: it only involves
reading the list of names stored in the directory itself, which is
typically on the order of hundreds of kilobytes.  However, most
graphical file managers -- such as, presumably, the Macintoshes which
Shayne has -- are interested in more information than you can get
from the directory alone (usually just each file's name and inode
number on a Unix filesystem).  To get interesting information like
the size of the file, its permissions and so on, you need to use the
stat(2) system call.  This is slow because on a typical Unix
filesystem (including, I suspect, reiserfs, since it isn't too flash
at this either) the information needed for stat is stored in the
inode, on a _different_ part of the disc from the directory - often a
different area of the drive for every file.  This means that rather
than the nice sequential reads that your hard drive likes, it'll be
seeking all over the place.
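
To make that concrete, here's a rough sketch in C of the two
workloads.  The first pass is roughly what producing a bare listing
of names costs; the second is roughly what "ls -l" or a file manager
does.  Point it at a big directory (and send the output to /dev/null)
to see the difference:

    #include <stdio.h>
    #include <dirent.h>
    #include <sys/stat.h>

    int main(int argc, char **argv)
    {
        const char *dir = argc > 1 ? argv[1] : ".";
        char path[4096];
        struct dirent *de;
        struct stat st;
        DIR *d;

        /* Pass 1: names only -- a few largish sequential reads of
         * the directory itself. */
        if (!(d = opendir(dir))) { perror(dir); return 1; }
        while ((de = readdir(d)) != NULL)
            puts(de->d_name);
        closedir(d);

        /* Pass 2: stat() every entry -- one inode lookup per file,
         * which on a cold cache can mean a disc seek per file. */
        if (!(d = opendir(dir))) { perror(dir); return 1; }
        while ((de = readdir(d)) != NULL) {
            snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
            if (stat(path, &st) == 0)
                printf("%s %ld\n", de->d_name, (long)st.st_size);
        }
        closedir(d);
        return 0;
    }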

In one directory I have with ~40k files, `ls -l >/dev/null` takes
about a minute and a half (cold cache).  Just getting a list of the
files in the directory takes less than 5 seconds.  This was on my
laptop, with reiserfs.  YMMV.

Reiserfs does have an advantage over ext3 when accessing any
_particular_ file whose name you already know, as it uses a more
cunning data structure than just "a whole bunch of file names, one
after another" to store directory information.  I would imagine, but
have not tested, that a reiserfs lookup would be roughly O(lg n)
rather than O(n) if the whole directory is cached; i.e. for a
directory with tens of thousands of files this may be the difference
between examining a dozen or so entries rather than all of them in
the worst case.  However, unless you routinely need to do something
pathological like opening _every_ file in a directory with tens or
hundreds of thousands of files, I don't think that you'd really
notice a big difference.
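
As a toy illustration (this is _not_ how either filesystem actually
stores directories, just a model of the comparison counts), here's a
little C program that looks up one name among 40,000 by a linear scan
and then by binary search over a sorted array, a crude stand-in for a
balanced tree:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define N 40000

    static long compares;

    static int cmp(const void *a, const void *b)
    {
        compares++;
        return strcmp(*(char *const *)a, *(char *const *)b);
    }

    int main(void)
    {
        static char names[N][16];
        static char *sorted[N];
        const char *target = "file39999";
        long i;

        for (i = 0; i < N; i++) {
            snprintf(names[i], sizeof names[i], "file%05ld", i);
            sorted[i] = names[i];
        }
        qsort(sorted, N, sizeof *sorted, cmp);

        /* Linear scan, as in an unindexed directory: O(n). */
        for (i = 0; i < N && strcmp(names[i], target) != 0; i++)
            ;
        printf("linear scan:   %ld compares\n", i + 1);

        /* Binary search over the sorted names: O(lg n). */
        compares = 0;
        bsearch(&target, sorted, N, sizeof *sorted, cmp);
        printf("binary search: %ld compares\n", compares);
        return 0;
    }

That should print 40000 compares for the scan and only fifteen or so
for the binary search.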

Ext3 has an option that enables a different algorithm (but still a
cunning one!) for storing directory entries, if you want to give your
otherwise hum-drum filesystem a spoiler and some go-faster stripes.
I'm pretty sure you need at least a 2.6 kernel for this to work,
though it may also have been backported to recent 2.4 kernels.  The
incantation to enable it is:

    # tune2fs -O dir_index /dev/hda3

where /dev/hda3 should be replaced with the appropriate device.  It'll
only apply to directories which you create _after_ adding the
dir_index option; I believe running e2fsck -D on the (unmounted)
filesystem will re-index existing directories, but check the man page
before taking my word for it.

Here are some benchmarks which suggest that reiserfs and ext3 with
dir_index may actually be _slower_ than plain ext3 in some cases:

    http://www.thesmbexchange.com/eng/qmail_fs_benchmark.html

However, as usual, don't assume that what that chap found will apply
to your own situation until you've actually tested it for yourself.
Also, he recommended xfs, which in my experience is a good way to lose
data and thus not worth using for _any_ performance gain; but that's a
separate rant.

Wheeee, that turned into quite a rant.  I hope that some of it was
relevant to someone :-)

Regards,

Cameron.




