Feature Request: duplicate file find - improved selection

I think the suggestions below are not yet available; maybe they are something for a future update.

Add a few selection criteria when searching for duplicates.

For example: keep the longest filenames
and/or the longest paths.

Long filenames usually describe the contents of the files; users have often already renamed such files, adding some
information to the name. When reorganizing things, the original (shorter) filenames may be copied back
(e.g. by a restore).

Example - duplicate search (delete mode) finds the following files:

[_] short-filename.pdf d:\folder
[X] Long filename with info about contents.pdf D:\folder\subfolder\specialfolder

The 'specialfolder' was created during an earlier reorganization.

Obviously, I would like the 2nd one to be kept, and/or files that are stored in long folder paths
to be kept in preference to those in a 'root folder'.

In conclusion, add:
Longest or shortest paths
Longest or shortest filename
Newest or oldest date
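
To make the request concrete, here is a minimal Python sketch of how such criteria could pick the file to keep in one group of duplicates. It is only an illustration of the idea, not how Directory Opus works; the criterion names and the `mark_for_deletion` function are made up for this example, and the date criteria are left out because they need file metadata.

```python
from pathlib import PureWindowsPath

# Hypothetical criterion names; each maps a full path to a sort key.
# The file with the smallest key is the one that is kept.
CRITERIA = {
    "longest_path": lambda p: -len(str(PureWindowsPath(p).parent)),
    "shortest_path": lambda p: len(str(PureWindowsPath(p).parent)),
    "longest_name": lambda p: -len(PureWindowsPath(p).name),
    "shortest_name": lambda p: len(PureWindowsPath(p).name),
}

def mark_for_deletion(group, criterion):
    """Return (keep, delete) for one group of identical files."""
    ranked = sorted(group, key=CRITERIA[criterion])
    return ranked[0], ranked[1:]

# The two files from the example above, written as full paths.
group = [
    r"d:\folder\short-filename.pdf",
    r"D:\folder\subfolder\specialfolder\Long filename with info about contents.pdf",
]
keep, delete = mark_for_deletion(group, "longest_name")
```

With either "longest_name" or "longest_path", the file in 'specialfolder' is kept and the short-named one is marked for deletion.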

When deleting: add an option to move the files to a specific folder (instead of the recycle bin), keeping the tree structure.

TIA

This is something that would help my tasks a lot as well! I strongly second all of the above.

You can already make it choose which files to keep/delete based on the oldest/newest file. (Sort by that column, then click Select in the duplicate finder when in Delete mode. Or sort by that column first before starting the duplicate search.) The kept file is always the first one of each group, and that is determined by the sort order.
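
The rule described above (the first file of each group is kept, and the sort order decides which one is first) can be modelled roughly like this in Python. This is only a sketch of the behaviour as described, not Opus code; the function name, the dictionary layout and the example paths are assumptions made for illustration.

```python
from datetime import date

def check_for_deletion(group, sort_key, reverse=False):
    """Sort one duplicate group; keep the first file, check the rest.

    Returns a dict mapping each path to True if it is checked for deletion.
    """
    ordered = sorted(group, key=sort_key, reverse=reverse)
    return {f["path"]: index > 0 for index, f in enumerate(ordered)}

# Hypothetical duplicate group with modification dates.
group = [
    {"path": r"D:\old\report.pdf", "modified": date(2010, 3, 1)},
    {"path": r"D:\new\report.pdf", "modified": date(2011, 7, 23)},
]

# Sorting newest first keeps the newest file and checks the older one.
newest_kept = check_for_deletion(group, lambda f: f["modified"], reverse=True)
```

Reversing the sort order flips which file ends up first in the group, and therefore which one is kept.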

As for choosing which files to keep/delete based on the longest/shortest names/paths, that seems like a strange criterion.

If you just want to favour files below one starting location over files below another, enabling and sorting by the Location column already allows you to do that.

If you really want to choose files based on a count of how many letters are in their paths, could you explain why that is? Having a longer/shorter path/name sounds like a side-effect of some underlying criteria, rather than the primary reason for wanting to keep/delete a file. If you explain the reasons maybe there is a better way to achieve them (or maybe you'll convince us that there is a need for what you're asking for).

That is assuming you are both asking for the same thing(s). :slight_smile:

It might be tricky to explain. I do very large duplicate searches in order to merge file libraries. There are also updated versions of the files with the same names. In the extreme example I have two archives of folders that have thousands of subfolders and up to 200,000 files in them (no dups). What I want to do is compare a folder of newly arrived files, which might contain 1,000-10,000 files, with one or both of those archives. Usually the new files are all in one folder, but sometimes they are spread over hundreds of subfolders too. Then I want to delete all the duplicates in the new folder tree, and finally merge the new folder tree with the archive. (This is just one of several permutations.)
If the new folder tree is in a short path, then having the search automatically select files on this basis would save time.

The power of dopus is its ability to handle files on this type of scale.

The idea that "when deleting: add option to move to a specific folder (instead of the recycle bin), keeping tree-structure" is important so that these files can be archived elsewhere in case of error.

Your suggestion of sorting by the path column first before starting the duplicate search does not seem to work, as the search seems to override this. Doing such a sort after the search means re-selecting all the files tagged for deletion.

Actually, there are other criteria that could usefully be added to the original request, e.g. an option to select the file in the shortest path, unless its date is earlier than that of the other duplicate (then select the other).
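
As a rough sketch of that combined rule, in Python: take the file in the shorter path, but fall back to the other one if the shorter-path file is older. The `Dup` type, the `pick` function and the example paths are hypothetical, and whether "select" means keep or delete is left open here.

```python
from datetime import datetime
from typing import NamedTuple

class Dup(NamedTuple):
    path: str
    modified: datetime

def pick(a, b):
    """Select the file in the shorter path, unless it is older than the other."""
    short, other = sorted((a, b), key=lambda d: len(d.path))
    return other if short.modified < other.modified else short

# Hypothetical pair of duplicates: a new arrival and an archived copy.
new = Dup(r"C:\new\a.txt", datetime(2011, 7, 1))
archived = Dup(r"D:\archive\sub\a.txt", datetime(2011, 7, 23))
selected = pick(new, archived)  # the short-path file is older, so the other is selected
```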

So by the sound of it you don't really care which path is longer; you just want to favour keeping the files in one folder over the ones in another, and the length of the paths is one thing that is different about the two folders.

So sorting by location should do everything you want.

Note: LOCATION, not name. There is a location column which you may need to add first.

If the format or sort-order is changing when you generate the results, it's probably due to coll://Duplicate Files having a format defined for it, or due to a more generic Collections format being defined. To deal with that, go to the collection and set it up with the columns and sort order you want, then save that (via the Folder Options window, Save -> For This Folder).

I have mine set to sort by Name, then Location. (Click Name, then ctrl+click Location to sort by two columns at once.)

If you have the Duplicate Finder in "delete mode" then all you have to do is click the Select button, between the Find and Delete buttons, and it will do that for you.

Edit: just for good order's sake, my post here crossed with the previous replies, so I had not read them when I wrote it...

As an example

say I once created a screencapture file named:
F:\JPG\Dopus-setup01.jpg

This file went to my backup-drive.

Later I did some reorganizing/renaming on the source drive, the same file was changed to:
F:\JPG\Software\Directory Opus\Settings\Dopus v10-Settings-Pref-Foldertree-Appearance-230711.jpg

After backing up again (or restoring the first file, same effect), I have two identical files on my harddrive.

My suggestion is that Duplicate File find (MD5) will optionally show both files like:

[X] F:\JPG\Dopus-setup01.jpg
[_] F:\JPG\Software\Directory Opus\Settings\Dopus v10-Settings-Pref-Foldertree-Appearance-230711.jpg

i.e. Directory Opus would automatically mark the 1st file to be deleted, as the 2nd file clearly shows what it is
about and has been stored in the right folder.

BTW I did some tests on sorting by location, but I think it is not consistent on this. Dopus often marks
the files stored in the longest path and with the longest filename to be deleted.

The other suggestion is to have an option to MOVE deleted files to a user defined (temporary) folder,
whilst keeping the tree structure of original location, rather than to actually delete them.
The idea behind this is, when users, after a while, find out that they need to restore some accidentally
deleted files and folders, they can still do so.
A kind of temporary Recycle Bin, so to say.
I discovered this to be quite useful (some dupfinders have this option)

Given the above example the first file then would be deleted to
x:\DeletedDups\F\JPG\Dopus-setup01.jpg or
x:\DeletedDups\JPG\Dopus-setup01.jpg
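
The two target layouts above could be computed along these lines. This is a Python sketch only, assuming plain drive-letter paths (not UNC paths); the `holding_path` function and its `keep_drive` flag are names made up for this illustration.

```python
from pathlib import PureWindowsPath

def holding_path(src, root, keep_drive=True):
    """Map a source file to its place under the holding folder.

    Assumes src is a drive-letter path such as F:\\JPG\\file.jpg.
    """
    p = PureWindowsPath(src)
    parts = list(p.parts[1:])  # path components after the "F:\" anchor
    if keep_drive:
        parts.insert(0, p.drive.rstrip(":"))  # "F:" becomes folder "F"
    return PureWindowsPath(root, *parts)

with_drive = holding_path(r"F:\JPG\Dopus-setup01.jpg", r"x:\DeletedDups")
without_drive = holding_path(r"F:\JPG\Dopus-setup01.jpg", r"x:\DeletedDups", keep_drive=False)
```

Keeping the drive letter as a folder avoids name clashes when duplicates from several drives end up in the same holding folder.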

Thanks

[quote="leo"]So by the sound of it you don't really care which path is longer; you just want to favour keeping the files in one folder over the ones in another, and the length of the paths is one thing that is different about the two folders.
So sorting by location should do everything you want. Note: LOCATION, not name. There is a location column which you may need to add first.
If the format or sort-order is changing when you generate the results, it's probably due to coll://Duplicate Files having a format defined for it, or due to a more generic Collections format being defined. To deal with that, go to the collection and set it up with the columns and sort order you want, then save that (via the Folder Options window, Save -> For This Folder).
I have mine set to sort by Name, then Location. (Click Name, then ctrl+click Location to sort by two columns at once.)[/quote]
Thanks, leo. This looks like a promising strategy. I will try it. Your detailed instructions help a lot.

..so far so good on the above. When these techniques are all combined, it's not so bad compared with before...

New (on-topic) question. When I edit a file within a Duplicate Files list (i.e. change the name of one to match the other, before going on to delete the other), the edited file oddly 'disappears' from the list. Where did it go, and why? It happens with either the checked file or the unchecked file. Has it simply been removed from the collection? Does Dopus assume that since I edited the filename I would want to keep it, and so took it off the collection list? If so, it's odd, but cool... Odd because that's what the checkboxes are for. What if I want to edit again?

Renaming a file in the duplicates appears to move it out of its original group and into an "unspecified" group.

I guess that makes sense if the duplicate search included filenames in the criteria (since renaming the file invalidates it (and the rest of the results, one could argue)), but I'm not sure it makes sense otherwise (since, in other cases, the name doesn't affect how the file should be grouped).


Oh, yeah. There they are, huddled down at the bottom of the screen. As you said, it doesn't make much sense when I use only a checksum search and not a name search. But no harm either, of course. At least I can find them when I need to; they have not actually gone "poof". Thanks.

[quote="mrwul"][X] F:\JPG\Dopus-setup01.jpg
[_] F:\JPG\Software\Directory Opus\Settings\Dopus v10-Settings-Pref-Foldertree-Appearance-230711.jpg[/quote]

In that situation, sorting by location should do the job. Anything directly below F:\JPG will have a location that sorts before anything in a folder below F:\JPG. (If you want the opposite, reverse the sort-order and click Select.)

[quote]BTW I did some test on sorting on location, but I think it is not consistent on this. Dopus is often marking
the files stored in the longest path and with longest filename to be deleted.[/quote]

Sorting by location won't make it select the files with the longest path. It will select them based on the alphabetic order of the paths.
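
A small Python sketch of the difference, using plain case-insensitive string comparison as a stand-in for the Location sort (Opus's actual collation may differ). The third path is hypothetical and only there to show that alphabetic order and length order can disagree.

```python
paths = [
    r"F:\JPG\Software\Directory Opus\Settings",
    r"F:\JPG",
    r"F:\Archive\2011\Backups\Old\Misc",  # hypothetical extra location
]

# Alphabetic order of the locations, ignoring case.
by_alpha = sorted(paths, key=str.lower)

# Order by number of characters in the location.
by_length = sorted(paths, key=len)
```

Alphabetically, F:\JPG still comes before anything below F:\JPG, but the long F:\Archive path comes first overall; by length, F:\JPG comes first.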

[quote]The other suggestion is to have an option to MOVE deleted files to a user defined (temporary) folder,
whilst keeping the tree structure of original location, rather than to actually delete them.[/quote]

Use Edit -> Select Other -> Checkboxes to Selection and the items tagged for deletion will now be selected. You can then move them somewhere else as you would any other selection of files. (You could make a button/hotkey to move them to x:\DeletedDups if you want.)

[quote]

[quote]BTW I did some test on sorting on location, but I think it is not consistent on this. Dopus is often marking
the files stored in the longest path and with longest filename to be deleted.[/quote]

Sorting by location won't make it select the files with the longest path. It will select them based on the alphabetic order of the paths.[/quote]

I'm afraid this won't work; I would then have to do a lot of (un)tagging manually. My idea was that Directory Opus would do that job. :wink:

[quote]

[quote]The other suggestion is to have an option to MOVE deleted files to a user defined (temporary) folder,
whilst keeping the tree structure of original location, rather than to actually delete them.[/quote]

Use Edit -> Select Other -> Checkboxes to Selection and the items tagged for deletion will now be selected. You can then move them somewhere else as you would any other selection of files. (You could make a button/hotkey to move them to x:\DeletedDups if you want.)[/quote]

eh...
I've got Select All and Select under edit..
no Select Other :confused:

==

Why won't sorting by location work?

[quote]eh...
I've got Select All and Select under edit..
no Select Other :confused:[/quote]

The thing I mentioned is there in the default toolbars. You must be using a custom one, but you can copy it over from the default one if you need to.