[plug] ridiculous UNIX one-liner

Nima Talebi nima at it.net.au
Thu Jan 30 21:43:50 WST 2003


----- Original Message ----- 
From: "Craig Ringer" <craig at postnewspapers.com.au>
To: "Perth Linux User Group" <plug at plug.linux.org.au>
Sent: Thursday, January 30, 2003 12:25 AM
Subject: [plug] ridiculous UNIX one-liner


> This is why I love unix:
> 
> Task - identify sets of identical files in a collection and print out 
> groups of identical files.
> 
> Command (one line *grin*):
> Note that the first cut -d '     ' must contain a TAB, enter it using 
> CTRL-V then hit the tab key. (is there a better way to do this?).
cut -d $'\t'   (bash only; a plain "\t" reaches cut as a backslash plus a "t", not as a TAB)
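
For shells without the $'...' quoting, a minimal sketch of a portable alternative, assuming only a POSIX sh with printf. The variable name tab is illustrative, not from the original post. Since TAB is already cut's default delimiter, the -d option can also simply be dropped.

```shell
# Capture a literal TAB; command substitution strips only trailing newlines.
tab=$(printf '\t')

# Explicit delimiter: selects the second TAB-separated field, "b c".
printf 'a\tb c\n' | cut -d "$tab" -f 2

# Same field with no -d at all, since TAB is cut's default delimiter.
printf 'a\tb c\n' | cut -f 2
```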

> 
> for SUM in `find -type f -exec md5sum "{}" \; | tee /tmp/proglog | sort 
> | uniq -c -w 32 | sort -n | egrep -v '^[ ]+1[^0-9]' | cut -d '        ' -f 2 | 
> cut -d ' ' -f 1` ; do grep $SUM /tmp/proglog ; echo ; done | cut -d ' ' 
> -f 2 | tee /tmp/nonunique_file_groups
> 
> Neatened up for easy reading but (probably)
> 
> for SUM in `find -type f -exec md5sum "{}" \; \
> | tee /tmp/proglog \
> | sort \
> | uniq -c -w 32 \
> | sort -n \
> | egrep -v '^[ ]+1[^0-9]' \
> | cut -d '        ' -f 2 \
> | cut -d ' ' -f 1`
> do grep $SUM /tmp/proglog
> echo
> done | cut -d ' ' -f 2 | tee /tmp/nonunique_file_groups
> 
> Cool eh?
> 
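
The same task can probably be done without the temporary log and the for loop. What follows is only a sketch, assuming GNU coreutils: uniq's -w and --all-repeated=separate options are GNU extensions, and the function name find_dupes is made up for illustration. md5sum puts two characters between the 32-character hash and the file name, so the name is cut out by column rather than by space-separated field. File names containing newlines or backslashes are not handled.

```shell
# find_dupes is a hypothetical helper name, not part of the original post.
# Prints groups of files with identical MD5 sums, one blank line between groups.
find_dupes() {
    # Hash every regular file under the given directory (default: current).
    find "${1:-.}" -type f -exec md5sum {} + \
        | LC_ALL=C sort \
        | uniq -w 32 --all-repeated=separate \
        | cut -c 35-
    # sort:  brings identical hashes next to each other.
    # uniq:  compares only the 32-character hash and keeps every line
    #        whose hash occurs more than once.
    # cut:   drops the hash and the two separator characters.
}
```

Usage, keeping the output file from the original command:

    find_dupes . | tee /tmp/nonunique_file_groups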

Nima


