Use Duplicate Files tool to find SIMILAR files

Jon posted some info that should be helpful:

[Is DOpus right for me?:: user-generated collection xml]