[plug] ridiculous UNIX one-liner

Bernard Blackham bernard at blackham.com.au
Thu Jan 30 09:51:39 WST 2003


On Thu, Jan 30, 2003 at 12:25:22AM +0800, Craig Ringer wrote:
> Task - identify sets of identical files in a collection and print out 
> groups of identical files.

GNU tools and cygwin... why limit it to *nix? ;)

> Command (one line *grin*):
> Note that the first cut -d '     ' must contain a TAB, enter it using 
> CTRL-V then hit the tab key. (is there a better way to do this?).

TAB is the default delimiter for cut(1) anyway, so in theory you
shouldn't need the -d there at all.
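
A quick sketch to check this, assuming any POSIX shell and cut(1). The
helper names second_field and second_field_explicit are made up for
illustration. The second form also shows one way to get a literal TAB
without CTRL-V, should one ever be needed:

```shell
# Hypothetical helper: pull field 2 out of TAB-separated input.
# cut(1) splits on TAB unless -d says otherwise, so no -d is needed.
second_field() {
    cut -f 2
}

# If a literal TAB is ever needed, printf can produce one portably
# (bash also accepts $'\t'), which avoids typing it with CTRL-V.
TAB=$(printf '\t')
second_field_explicit() {
    cut -d "$TAB" -f 2
}

printf 'count\tsum\n' | second_field            # prints "sum"
printf 'count\tsum\n' | second_field_explicit   # prints "sum"
```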

> for SUM in `find -type f -exec md5sum "{}" \; | tee /tmp/proglog | sort 
> | uniq -c -w 32 | sort -n | egrep -v '^[ ]+1' | cut -d '        ' -f 2 | 
> cut -d ' ' -f 1` ; do grep $SUM /tmp/proglog ; echo ; done | cut -d ' ' 
> -f 2 | tee /tmp/nonunique_file_groups

There are sysadmins and there are sysadmins... :)
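
For what it's worth, GNU uniq can probably do the grouping on its own.
A rough sketch, assuming GNU findutils and coreutils (-w and
--all-repeated are GNU extensions), with dupes being a made-up function
name. It swaps the for loop and temp file for uniq's own grouping, so
it is a different approach from the quoted pipeline, not a fix of it:

```shell
# Hypothetical helper: print groups of identical files under a
# directory (default: the current one), one file name per line.
dupes() {
    # checksum every regular file, batching names per md5sum run
    find "${1:-.}" -type f -exec md5sum {} + |
        # sorting puts identical sums on adjacent lines
        sort |
        # compare only the 32-character sum; keep every repeated
        # line, with a blank line between groups
        uniq -w 32 --all-repeated=separate |
        # drop the sum and its two separator characters
        cut -c 35-
}

dupes .
```

Groups come out separated by a blank line. File names containing
backslashes or newlines are escaped by md5sum and would need extra
care.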

Bernard.

-- 
 Bernard Blackham 
 bernard at blackham dot com dot au


