Possible bug in delete duplicate files finder

I haven't found the exact number, but twice today I've tried to delete duplicate files and encountered a "can't find file" error. The second time, the number of files in the duplicates collection was approx. 14,300 (from someone's poorly kept hard disk).

It seemed to work fine with a much smaller collection (approx 250 files).

I'm thinking that there's possibly a bug with handling such a large number of files.

I noticed that while this was running, DOpus used around 2.4GB of RAM, which is not a huge problem for me: I have 1.75GB of RAM in my machine (Athlon 3200 XP), so some was paged. The main issue is that although it found all the duplicates quite quickly, I was then unable to delete them due to the "not found" error.

Can we have a fix please? Thanks.

Kulwant

Just adding a bit more info. The exact error message is:

Error Deleting file or folder

Cannot delete file:

Cannot read from the source file or disk.

Also, shift-clicking the Delete button in the duplicates finder doesn't do a straight delete, as I read elsewhere in these forums (unless I got the wrong end of the stick). And the Delete button on the toolbar only seems to delete manually selected files. There seems to be no way of saying "Select all ticked files" - or did I miss it?

Thanks.

Kulwant

If the Opus process was using 2.4gig of RAM then I'd say there may be a problem since the current version of Opus is a 32-bit process which means it can generally only access up to 2gig. It could be that some of that memory is in use by the OS -- seems very likely as otherwise it wouldn't be able to get that extra 0.4gig -- so it may not have hit the limit, but if the limit was reached then it would explain the odd behaviour.

The real question is probably why so much memory is being used in the first place. Were you using thumbnail or tiles modes? With a lot of files they can result in a lot of memory being used. Just displaying a simple list of 15,000 files shouldn't use that much memory (I do that fairly regularly in fact, when browsing my music folder in Flat View mode), but I don't know if using the duplicate finder increases the memory overheads; I've never run it in a situation that would generate that many results.

If you want to select the files which are checked just push Shift-Space.

OK, after a little more detective work, I think I've found the problem. It is not the number of files that causes the problem but certain character combinations in the filename. It is simply that the likelihood of coming across the problem characters goes up the more files there are.

I suspected it was the ampersand (i.e. "&"), but it wasn't. For example,
H&P Invoice.rtf
was handled OK. But

c[1].5&tz=60&r=blockedReferrer&title=Welcome%20to%20AOL%20UK&cd=32&ah=738&aw=1024&sh=768&sw=1024&pd=undefined

wasn't (there were only two files in the duplicates collection) and triggered the error mentioned in my original message.

As someone else said, it would be good if DOpus put up a "Skip/Skip-all/Abort" requester instead of giving up totally like Windows Explorer does.

DOpus' ability to gracefully recover and proceed from unexpected scenarios was one of the reasons I bought DOpus in the first place.

At least on Vista, I can create a file with that name and then find and delete it using the Duplicate Finder without any issues.

Is it definitely just the filename? If you create two empty files in a new folder and name them like that (or similar), do you get the problem when trying to delete them via the Duplicate Finder?

[quote]As someone else said, it would be good if DOpus put up a "Skip/Skip-all/Abort" requester instead of giving up totally like Windows Explorer does.

DOpus' ability to gracefully recover and proceed from unexpected scenarios was one of the reasons I bought DOpus in the first place.[/quote]
Opus will let you skip errors when deleting, but only when not using the Recycle Bin. If Opus is configured to use the Recycle Bin then it calls the same API which Explorer calls and you get the same (crappy) error handling. Unfortunately there is no other documented way of deleting files to the Recycle Bin.
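To illustrate the "skip and carry on" behaviour being asked for, here is a minimal Python sketch. It is not how Opus does it; the function name delete_all is made up for this example, and it deletes files permanently rather than sending them to the Recycle Bin.

```python
import os

def delete_all(paths):
    """Try to delete every path; collect failures instead of stopping at the first one.

    Hypothetical helper for illustration only. Files are removed permanently,
    not moved to the Recycle Bin.
    """
    failed = []
    for path in paths:
        try:
            os.remove(path)
        except OSError as err:
            # Record the problem and move on, like a "Skip all" choice would.
            failed.append((path, err))
    return failed
```

The returned list can be shown to the user at the end, so one undeletable file doesn't stop the rest of the batch.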

I'm trying to get around this for now by trying to tell DOpus to show all duplicates but exclude any filenames that contain a square bracket e.g. "[".

I tried this using the filter with

Name Match *

Subclause No Match

                Name   Match   *[*           (Use wildcards ticked)

but that doesn't seem to work and files with "[" are still listed. What am I doing wrong?

OK, cracked the filter thing. I changed the subclause to No Match, Name Match /091 (which looks for a left square bracket, i.e. "[", using its decimal ASCII value).

So it appears the "Cannot read from the source file" error is caused by the presence of square brackets in the filename.
All the duplicate files left in the folder I was searching in have square brackets in the filename and DOpus refuses to delete them.

Will this need a fix or is there a workaround?

Sorry missed your reply - was too busy testing and posting! (on a related note, I've got the "notify me when a reply is posted" ticked but I just realised the reply notifications were being caught in my ISP's SPAM filter!).

Now this is really strange - I just tried what you suggested by creating a file called test[1].txt and the duplicates finder in DOpus managed to find and delete them!

Back to the drawing board....

OK. I've got two filenames which refuse to delete:

aHR0cDovL3d3dy5hcmlhaG9zdGluZy5jby51ay9hc3NldHMvSU1HL2FscGhhL0Z1bmNoYWwvRnVuY2hhbC1Nb250ZUNhcmxvLUV4dGVyaW9yLmpwZw==-sm[1].jpg

aHR0cDovL3d3dy5ob3RlbGJlZHMuY29tL2ltYWdlcy9ob3RlbHMvTUFERUlSQS9FU1RBTEFHRU1QT05UQURPU09ML0VTVEFMQUdFTVBPTlRBRE9TT0xIQUIuSlBH-sm[1].jpg

Both are buried in a folder whose full path is 152 characters long. This means that the fully qualified filename (i.e. including the full path) is 278 characters long. The only thing I can think of is: does DOpus have a 255-character path/filename limit?

Have you tried deleting the same files in Explorer?

Last time I looked Explorer had the same limitation. There is a way around it, which perhaps Opus should use (though when deleting to the recycle bin the limit may still come into play; not sure there), but I don't think many programs use the workaround so extremely long paths will cause problems with a lot of things, including Explorer if I remember correctly.
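I'm assuming the workaround meant here is the Windows extended-length path prefix: putting \\?\ in front of an absolute path lets the wide-character Win32 file functions accept paths well beyond the usual MAX_PATH limit of 260 characters. A rough Python sketch of building such a path (the helper name to_extended_length is made up for this example):

```python
import ntpath

def to_extended_length(path):
    """Return an absolute Windows path with the extended-length prefix added.

    Hypothetical helper for illustration. The path must already be absolute,
    because the prefix turns off the normal relative-path handling.
    """
    if path.startswith("\\\\?\\"):
        return path  # already prefixed
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    path = ntpath.normpath(path)  # the prefix also disables "." and ".." handling
    if path.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + path[2:]
    return "\\\\?\\" + path
```

Whether a given program (or the Recycle Bin API) accepts the prefixed form is a separate question, as noted above.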

I don't know about long filenames, but I have run into cache files (and coincidentally, that's what those look like) that used invalid characters that, as a result, could not be deleted in Explorer.

The solution is just to open up a DOS prompt and delete it via its 8.3 filename. (You can see what its 8.3 name is just by doing a "dir /x" for the directory of interest.)

Yes, Explorer had problems handling these too. The context menu was rather small when right clicking on the files.

I got round it by going up the tree and deleting the grandparent folder.

So is there a way to hunt for long/problem filenames using DOpus?
These files cause problems when backing up to DVD also.

If the path to the file does not exceed 255 chars, and it's just the path length PLUS filename char length that blows it out, then I thought you could rename the file (to drop the total path + filename length below the limit) in order to get around stuff like this?

No, for some reason Windows Explorer doesn't let you.

But it would be good to have some way of searching for really long path+filenames like this BEFORE starting a backup operation onto DVD. It's no fun learning after it's started burning the disc, as that is in effect a wasted disc.

I know I could create an image file first, but a search followed by a manual resolve (rename/delete whatever) would be better/quicker than having to create an ISO for each backup.
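Outside of DOpus, a pre-backup check like this can be sketched in a few lines of Python. This is only an illustration: the function name find_long_paths and the default limit of 255 are assumptions taken from the numbers mentioned in this thread, and on Windows the walk may silently skip folders it cannot open.

```python
import os

def find_long_paths(root, limit=255):
    """Yield (length, path) for each file or folder under root whose full path exceeds limit.

    Hypothetical helper for illustration; pick the limit to suit your burning software.
    """
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            full = os.path.join(folder, name)
            if len(full) > limit:
                yield len(full), full

# Example use (the folder name is a placeholder):
# for length, path in sorted(find_long_paths(r"C:\Backup"), reverse=True):
#     print(length, path)
```

Listing the longest paths first makes it easy to rename or delete the worst offenders before starting the burn.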