Eclectic

About programming and maybe more…


Finding duplicate files using md5sum

Posted by tread on November 16, 2007

Been a while since I posted something.

Anyway, as usual I’m driven by need. I’ve got zillions of dupes on my hard disk, thanks to my wife’s unique file management techniques – I hope she doesn’t read this blog entry! :)

I didn’t like FSlint and similar tools; I wanted something simpler. So here is something I think will work fine, though of course I’ll try to poke holes in it before I actually test it out!

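# Hash every regular file, skipping the two output files themselves
find . -type f ! -name md5all.txt ! -name dupes.txt -exec md5sum {} + > md5all.txt
# Keep only the checksums that occur more than once
awk '{print $1}' md5all.txt | sort | uniq -d > dupes.txt
# Print each group of identical files under a separator line
while read -r sum; do echo "--------------------"; grep "^$sum" md5all.txt; done < dupes.txt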


Should be fine, I think.
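The same idea can probably be squeezed into a single pipeline with no temporary files. What follows is only a sketch, and it assumes GNU coreutils: the -w and --all-repeated options of uniq are GNU extensions, and the function name find_dupes is made up for illustration. It relies on md5sum printing a 32-character checksum at the start of each line, so that after sorting, identical files sit next to each other and uniq only needs to compare the first 32 characters.

```shell
#!/bin/sh
# find_dupes DIR: print groups of files under DIR that share an MD5 checksum.
# Hypothetical helper name; assumes GNU find, md5sum, sort and uniq.
find_dupes() {
    dir=${1:-.}
    # Hash every regular file, batching file names into few md5sum calls.
    find "$dir" -type f -exec md5sum {} + |
        # Sorting puts lines with the same checksum next to each other.
        sort |
        # Compare only the 32-character checksum; print every repeated line,
        # with a blank line between groups.
        uniq -w32 --all-repeated=separate
}
```

Running find_dupes ~/Documents would list each group of identical files, one group per block. Since nothing is written into the directory being scanned, there are no output files to accidentally hash along the way.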

Posted in Linux, Programming