Two Questions about Collections (Dup File collections)

First, the easy question. I read through the help file and could find no reference, so I doubt it's possible, but can one save the results of a "Find Duplicate Files..." run? (It was an all-night run, as I had it search through literally half a terabyte of data, and there were tens of thousands of duplicates [as I expected], but I don't want to blindly push the delete button. I need time, i.e., days or weeks, to go through the list/collection.)

Second, the search directories don't seem to be respected. Is this a known bug?

Scenario: I have three drives full of speech data (I'm in Voice Recognition, and the data is used for training and testing purposes). The base directories are identical, and this is what I had in the "Find in:" box:

E:\Media\speechdata
D:\Media\speechdata
G:\Media\speechdata

The thing is, in the final "coll://Duplicate Files" collection, there are numerous entries for files from other directories under the Media directory, for instance:

G:\Media\tmp\UK-test\D98202323.wav
E:\Media\junk\test\data\phenoms.wav
etc.

Bottom line: instead of searching only .\Media\speechdata and below, DOpus searched the parent directory (.\Media) and below. This has to be a bug, yes? (I'm currently using DOpus 9.0.0.6.)
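For anyone wanting to check a result list for this kind of stray entry, here is a minimal Python sketch. It assumes you already have the result paths as plain strings (getting them out of the collection is not covered here), and `outside_roots` is a hypothetical helper name, not anything built into DOpus.

```python
from pathlib import PureWindowsPath

def outside_roots(result_paths, search_roots):
    """Return the result paths that do not lie under any of the search roots."""
    roots = [PureWindowsPath(r) for r in search_roots]
    strays = []
    for raw in result_paths:
        path = PureWindowsPath(raw)
        # A path is "inside" a root if it is the root itself or the root is one of its ancestors.
        if not any(root == path or root in path.parents for root in roots):
            strays.append(raw)
    return strays

if __name__ == "__main__":
    roots = [r"E:\Media\speechdata", r"D:\Media\speechdata", r"G:\Media\speechdata"]
    results = [
        r"G:\Media\tmp\UK-test\D98202323.wav",
        r"E:\Media\junk\test\data\phenoms.wav",
    ]
    for stray in outside_roots(results, roots):
        print(stray)
```

Run against the two example entries above, both would be flagged, since neither sits under a speechdata directory.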

Zach,

Can probably answer your 'Easy' question - Collections are retained by DOpus between sessions, so unless you run Find Duplicate files again you should find it is still there even after shutting down.

Also if you navigate to the DOpus program folder you should have a sub folder called collections - you could create a backup file of the Duplicate Files.col file.

Third alternative is to actually create a new Collection (call it whatever you want) and then re-run the Duplicate Files routine but change the 'Show Results In' field to your newly created Collection.

Hope that helps (sorry not sure about your other issue).

Cheers

Phat

Re. your second issue - had a look at it, but can't replicate it. DOpus only searches for duplicates in the Folders I specify; it doesn't search the parent directories.

What OS are you using? My testing was on XP only. It is possible that with Vista, DOpus may be following Hard links or Junctions (but I'm no expert).

Did also try creating shortcuts to the parent directory in the Folders being searched, but DOpus ignored these.

Can I just confirm you have 'Clear Previous Results' ticked?

Sorry mate, not much help.

[quote="phatman"]Also if you navigate to the DOpus program folder you should have a sub folder called collections - you could create a backup file of the Duplicate Files.col file.

<--snip-->

Hope that helps (sorry not sure about your other issue).[/quote]
Well, I found it under...

C:\Users\Zach\AppData\Roaming\GPSoftware\Directory Opus\Collections

(This is under Vista (x64), so I suppose the location may change depending on what OS DOpus is being used on.)
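In case it's useful to someone else, a timestamped copy of the .col file can be scripted. This is only a rough Python sketch: `backup_collection` is a made-up helper name, the file name "Duplicate Files.col" is taken from phatman's post, and the backup folder in the usage comment is purely hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path

def backup_collection(col_file, backup_dir):
    """Copy a collection file into backup_dir under a timestamped name."""
    src = Path(col_file)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also preserves the file's timestamps
    return dest

# Example call (the collections folder is the one I found on my machine; the
# backup folder is hypothetical, so adjust both to your own setup):
# backup_collection(
#     r"C:\Users\Zach\AppData\Roaming\GPSoftware\Directory Opus\Collections\Duplicate Files.col",
#     r"D:\Backups\collections",
# )
```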

Anyway, yes, that helps; in fact, that's the best answer I could have hoped for. Thanks!! :slight_smile:

About the directory search problem, I've been thinking about it, and what might have happened is that after I manually added the three directories to the group box, I might have browsed to the parent directory which would indirectly add the directory to the search list.

Like I stated, though, it's an all-night endeavor to test, so I haven't had time to explore this possibility. So to Jon (and the other devs), unless I report back, disregard this "bug report" and just consider it operator error. :sunglasses:

You don't need to mess with the .col files directly, just make a copy of the collection itself (for example, go to coll://, select the collection, and then press ctrl-c and then ctrl-v).

You can also rename a collection, so you could rename the Duplicate Files collection to something else, and then a new one would be automatically created next time you did a search.

That's almost certainly it. Opus will search all the directories (if any) added to the list that is directly below the path field/drop-down, as well as whatever is in the path field/drop-down.
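To illustrate why one extra parent folder would be enough to produce those results, here is a small Python model of the rule just described (folder list plus path field). It is only a sketch of that description, not Opus's actual code, and `effective_roots` is a hypothetical name.

```python
from pathlib import PureWindowsPath

def effective_roots(folder_list, path_field):
    """Combine the folder list with the path field, dropping any folder
    that is already covered by a parent folder in the same set."""
    candidates = [PureWindowsPath(p) for p in [*folder_list, path_field] if p]
    kept = []
    for cand in candidates:
        covered = any(other in cand.parents for other in candidates)
        if not covered and cand not in kept:
            kept.append(cand)
    return [str(p) for p in kept]

if __name__ == "__main__":
    # If the path field had drifted to the parent folder on G:, everything under
    # G:\Media (including tmp and junk folders) would be in scope.
    print(effective_roots(
        [r"D:\Media\speechdata", r"E:\Media\speechdata", r"G:\Media\speechdata"],
        r"G:\Media",
    ))
```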

Okay, well, unfortunately, it actually is a bug of some sort.

I reran the search/dup-finder query again last night, this time making 100% certain that in the combobox were only...

D:\Media\speechdata
E:\Media\speechdata

And in the edit field was...

G:\Media\speechdata

...when I clicked the "Find" button.

I pretty much left the computer alone after that and went to bed, but--and this is important--I might have clicked around my directories in the other pane for a bit (I operate DOpus in "Commander" style, dual-pane, but no tree) looking at some codecs I just installed.

I say that because this morning, the directories were pretty much respected, BUT, 1.) E:\Media\speechdata was in the edit field instead of G:\Media\speechdata, and 2.) in the results were no duplicates from G: (for the preceding reason, I guess) and there were several duplicates from a completely different drive!!:

H:\bin\multimedia\Windows Media Components\Encoder\Settings

Now, I was probably viewing the H:\bin...\Encoder directory before I left the computer, so what I think is happening is that the Duplicate Finder task is dynamically updating its directory list even after the search commences.

In the event that maybe this is a feature and not a bug, it just occurred to me that this time, I did have that "Lock Folder" icon (above the "Find in:" edit field) in its unlocked state.

Could that be the reason why files from a completely different drive (that I was probably browsing) showed up in the search results?

Edit: So you won't think me daft, the reason I "unlocked" it was because I thought that meant "display results in a separate lister" or something as opposed to locking the lister down and displaying the result inline. :slight_smile:

It should only matter which folder you're in when you start the search. After that I don't think it should matter.

Did you have the Clear previous results checkbox selected? If not then that would explain things.

Yes, I had the "Clear Previous Results" option checked.

Since my previous post, I reran the query on the same three drives, but this time went down another directory (.\Media\speechdata\2004\Q3) so the search only took 20 minutes instead of hours. (I'm doing a full-blown MD5 comparison, mind you, on thousands of files.)
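For reference, an MD5-based duplicate check restricted to a fixed set of root folders can be sketched independently of DOpus, which makes it possible to cross-check a smaller run. The Python below is a minimal sketch under my own assumptions (group by size first, then hash); it is not how DOpus implements its comparison, and `md5_of` and `find_duplicates` are hypothetical names.

```python
import hashlib
import os
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(roots):
    """Return groups of files under the given roots that share size and MD5."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                try:
                    by_size[os.path.getsize(full)].append(full)
                except OSError:
                    continue  # unreadable or vanished file; skip it
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        for path in paths:
            try:
                by_hash[(size, md5_of(path))].append(path)
            except OSError:
                continue
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Only the folders passed in `roots` are walked, so any path in the output that lies outside them would point to a problem with the inputs rather than the search.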

With the lock icon in its locked state, after I clicked "Find," I explicitly traversed to that H:\bin...\encoder directory in the other pane. The "Find in:" edit box kept updating with my directory clicks which I found disconcerting, but the final results did, this time, remain constrained to just those three .\Media\speechdata\2004\Q3 directories.

I don't know what more I can say.