[plug] Find similarly named files in directories

Timothy White weirdit at gmail.com
Sat Jan 13 12:32:00 WST 2007


On 1/13/07, Bernard Blackham <bernard at blackham.com.au> wrote:
> >> find| sed 's/\ /:::/g' |sed -r 's/.*\/(.*)/\0 \1/'|sort -i -k 2|uniq
> >> -i --all-repeated=separate -f 1| sed 's/[^ ]*$//' | sed 's/:::/\ /g'
> >
> > Was a challenge for golf originally, and I'm bored. 138 chars is the benchmark.
>
> Gah, I'll bite. 59 chars of perl, 74 chars altogether.
>
> find|perl -ne'm#.*(/.*)#;push@{$a{$1}},$_}foreach(%a){$#$_>0&&print"@$_\n"'

Nicely done

>
> And the only thing it'll break on is files with new-lines in them (yes,
> it's possible! The only things you can be guaranteed not to find in a
> filename are / and the NULL byte).

If you have a filename with a newline, you deserve to have our scripts break :P
Seeing as it's compact perl, how are you using the / to prevent spaces
from breaking it? I was using the last / to find the filename of the
file. Or don't spaces affect it, because it's perl and not using
fields? Hmmm, I think it's probably the latter from what I can read of
that perl.
Btw, I like the way it prints it out, with all the "extra" occurrences
of a file being indented.
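
For anyone reading along, the grouping idea can be sketched without the golf. This is only an illustration under assumptions, not Bernard's code: the function name group_by_basename is made up, and it uses awk instead of perl. It works on the whole line, so spaces are never treated as field separators, and it strips everything up to the last / to get the name.

```shell
# Hypothetical helper: read one path per line on stdin and print each
# group of paths that share the same final component.
group_by_basename() {
    awk '
    {
        # The whole line ($0) is used, so spaces in a path do not matter.
        name = $0
        sub(/.*\//, "", name)   # greedy match removes everything up to the last "/"
        count[name]++
        # Indent every occurrence after the first by one space.
        paths[name] = paths[name] (count[name] > 1 ? "\n " : "") $0
    }
    END {
        # Print only names seen more than once, with a blank line after each group.
        for (name in count)
            if (count[name] > 1)
                print paths[name] "\n"
    }'
}
```

It would be used as `find | group_by_basename`. The order in which groups are printed is not defined, and like the originals it still breaks on filenames containing newlines.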

A quick test shows...

tim at linjeni:/data$ find|perl -ne'm#.*(/.*)#;push@{$a{$1}},$_}foreach(%a){$#$_>0&&print"@$_\n"'|sort|uniq|wc -l
62692

$ find| sed 's/\ /:::/g' |sed -r 's/.*\/(.*)/\0 \1/'|sort -k 2|uniq --all-repeated=separate -f 1| sed 's/[^ ]*$//' | sed 's/:::/\ /g'| sort|uniq|wc -l
62686

Decently close!
Timing them gives the following (using an input file for both scripts,
so that the find command isn't slowing either one down):
real 0.949      user 0.844      sys 0.092       pcpu 98.64
real 13.936     user 13.069     sys 0.220       pcpu 95.35

I think we all know which one is the perl. Amazing!!
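
The input-file approach could look roughly like this. It is a sketch only: snapshot_listing and the /tmp path are hypothetical names, and the pipelines in the usage comments are abbreviated.

```shell
# Hypothetical helper: write the file listing for a directory to a file once,
# so each pipeline reads identical input instead of re-running find.
snapshot_listing() {
    # $1 = directory to scan, $2 = output file (both are illustrative)
    find "$1" > "$2"
}

# Usage sketch (assumes a shell with a `time` keyword, such as bash):
#   snapshot_listing /data /tmp/files.txt
#   time perl -ne '...' < /tmp/files.txt > /dev/null
#   time sh -c "sed 's/ /:::/g' /tmp/files.txt | ... > /dev/null"
```

Reading from a saved listing keeps disk traversal out of the measurement, so the numbers reflect only the text processing.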

And the differences in the files? This char >>�<< (not sure if it'll
email or not) is some strange escape code. My script seemed to ignore
all the files with that in the path.
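
To see which lines differ, rather than only comparing counts, the two outputs can be compared directly. A rough sketch, assuming each script's output was saved to a file; diff_results is a hypothetical name, and LC_ALL=C is an assumption made here so that odd bytes in paths are sorted and compared bytewise.

```shell
# Hypothetical helper: print lines that appear in only one of two result files.
# Lines unique to the first file are printed as-is; lines unique to the
# second file are prefixed with a tab (standard comm column layout).
diff_results() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm needs sorted input; sort -u also drops duplicate lines.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, `diff_results perl.out sed.out` (file names made up) would list the paths that only one of the two scripts reported.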

Well done Bernard! Anyone want to try and beat 74 chars? :p

Tim
-- 
Linux Counter user #273956
Don't email joeblogs at scouts.org.au

