[plug] speed: find vs ls

Thomas Cuthbert tcuthbert90 at gmail.com
Fri Jul 29 02:15:42 AWST 2022


As a guess, I'd say the excessive metadata syscalls are due to your -type
predicate and maybe the format string (find has a number of other format
directives that reference stat info). It sounds like you have lots of
directories too; limiting the number of directories will reduce
the rate of dentry and metadata reads. Squid does something similar to
group objects together with its L1/L2 cache_dir hierarchy.
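
For what it's worth, here is a rough sketch of listing the directories
without find at all, so nothing has to be examined except the one file name
you care about. list_crc_dirs is a made-up helper name, and I'm assuming the
crc files only ever sit one level below the top directory (unlike
-maxdepth 2, this will not match a bkb.rhash.crc32 in the top directory
itself). Untested against your data set, so treat it as an illustration only.

```shell
# Hypothetical helper: print every directory directly under "$1" that
# contains a bkb.rhash.crc32 file. The shell expands the glob by reading the
# top-level directory once and then looking up the single named file in each
# entry, rather than visiting every file in every directory.
list_crc_dirs() {
    for f in "$1"/*/bkb.rhash.crc32; do
        # If nothing matched, the pattern is left as a literal string; skip it.
        [ -f "$f" ] || continue
        # Strip the trailing /bkb.rhash.crc32 to get the directory name.
        printf '%s\n' "${f%/*}"
    done
}
```

You would call it as something like: list_crc_dirs . | while read -r j ; do ... ; done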

Also, do you need to hash the whole file? Seeing as you already have the
metadata in cache, you could probably get a quick performance win by
comparing the metadata to a previous value, or by hashing only the
metadata.
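
A minimal sketch of the "compare metadata to a previous value" idea, assuming
GNU find (for -printf) and assuming that size plus mtime is a good enough
change signal for your backups. The helper names and the .meta.sig file name
are invented for the example; they are not part of your script or of rhash.

```shell
# Hypothetical helper: one "size mtime name" line per regular file in "$1",
# sorted so the output is stable between runs.
meta_signature() {
    find "$1" -maxdepth 1 -type f ! -name .meta.sig \
        -printf '%s %T@ %f\n' | LC_ALL=C sort
}

# Returns 0 if the metadata in "$1" matches the signature recorded on the
# previous run. Otherwise records the new signature and returns 1, which is
# the cue to do the full crc check for that directory.
dir_unchanged() {
    sig_file="$1/.meta.sig"
    new_sig=$(meta_signature "$1")
    if [ -f "$sig_file" ] && [ "$new_sig" = "$(cat "$sig_file")" ]; then
        return 0
    fi
    printf '%s\n' "$new_sig" > "$sig_file"
    return 1
}
```

In the loop that would look roughly like: dir_unchanged "$j" || run the full
check on "$j". Note this only detects changes that alter size or mtime, so it
will not catch silent corruption the way a full crc pass does.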

On Thu, 28 July 2022, 5:22 pm Brad Campbell, <brad at fnarfbargle.com> wrote:

> G'day all,
>
> An observation while I'm still playing with my sizeable set of backup
> directories.
> I've been adding a bit that creates a file of crc32s of the updated files,
> and then toying around with a script to crawl the drive and check them all.
>
> I started using find to give me a list of dirs that contain the files. It
> was spending a *lot* of time just creating the list. In fact it spent more
> time looking for the files than the subsequent iteration and check of each
> one.
> I must qualify that with the fact that I'm about 10 days into creating the
> crcs and most directories already have ~800 days' worth of backups.
>
> The script run with :
> for j in `find . -maxdepth 2 -type f -name bkb.rhash.crc32 -printf "%h\n"` ; do
>
> Checked 170 directories with 0 errors in 0:00:34:58
>
> stracing find, it's dropping into each directory and performing a stat on
> every file. Some dirs have a *lot* of files.
>
> I thought about trying a bit of globbing with ls instead, and blow me down
> if it wasn't "a bit faster".
>
> The script run with :
> for j in `ls ??????-????/bkb.rhash.crc32 2>/dev/null` ; do j=$(dirname $j)
>
> Checked 170 directories with 0 errors in 0:00:09:49
>
> I know premature optimisation is the root of all evil, but this one might
> have been a case of "using the right tool".
>
> Regards,
> Brad
> --
> An expert is a person who has found out by his own painful
> experience all the mistakes that one can make in a very
> narrow field. - Niels Bohr