Trying to use collections to work with Big Data

I guess I'm at the stage where file storage is becoming a problem. The other week it was my system drive; now it's my profile drive. Because of the size of the drives it takes some time to "flat file them without folders". When I clicked on something and asked to open its location, I didn't think that would mess up going back to the previous results, but of course it did. And even if I had locked the format, I don't think it would have retained the data. In hindsight it may have been better to open it in a tab, but I just didn't. After doing this a couple of times I decided it would be nice to be able to save these results, which take some time to create, until I was done solving my space problem, i.e. as a collection.

I was able to create a collection easily enough, but it wasn't so simple to put the results into it. I attempted to select all, right-click, and use "send to collection"; however, some 13 hours later I'm still waiting for that context menu to appear. I don't think this is the solution. Is there another way?

How is the collection being generated?

Ones made via Tools > Find Files are saved already by default.

Just by going to the drive and doing flat file without folders.

Use Find instead, if you want the results in a collection.

I started off trying to use Flat File with no folders to get all my files on P:\ so I could see what was using up my space. That took about 30-45 minutes to get a result, but it was a result that couldn't be retained, so because of user error I'd find myself having to start over.

That inspired me to try to save the results to a collection using "send to". That ran for half a day and I still didn't have a context menu, and now I'm using Find to try to accomplish the same thing, but the processing time is so much higher that it seems faster to go back to using Flat File.

Something in Find is incredibly slow. I did a simple search for files only on P:\, and about 75 minutes later it had only gotten 1/6 of the way through the data.

The timer is on the right side, right in the middle.

In comparison, Everything returned all files on P:\ (some 608,000 of them) in mere seconds. I hadn't thought of using Everything because I've just never used it to return an entire drive, usually thinking of it as a single-file finder.

Still, I'd like to be able to use Find, but I've always felt that it seemed slow even when I was searching a much smaller set of data. Its ability to use all kinds of restrictive or inclusive criteria certainly would be a plus.

I don't know exactly how DO's Find works, but when I was working we had SQL that we ran against our databases on a mainframe. SQL is essentially a search, but depending on indexes and other physical data structures we often had SQL that performed poorly. SQL will tell you how it's accessing the data, so you can figure out whether you need a new index, or whether you should change an omit clause to an include clause, or make one of various other changes to improve performance. I don't know if there's a way to watch or understand how Find is accessing the data so its performance can be improved. I hope someone can resolve this sluggish performance, because Find is an important part of the DO package.
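Just to illustrate the "watch how the query accesses the data" idea, here's a minimal sketch using SQLite from Python. The table name, columns, and index are invented for the example, and this says nothing about how DO's Find works internally; it only shows the kind of plan inspection described above.

```python
# Minimal sketch: inspect a query plan before and after adding an index.
# All names here (files, ext, idx_files_ext) are illustrative only.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE files (path TEXT, size INTEGER, ext TEXT)")
con.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [(f"P:/dir/file{i}.dat", i * 1024, "dat") for i in range(100_000)],
)

query = "SELECT path, size FROM files WHERE ext = ?"

# Without an index, the plan reports a full table scan...
for row in con.execute("EXPLAIN QUERY PLAN " + query, ("dat",)):
    print("no index :", row[-1])   # e.g. "SCAN files"

# ...after indexing the searched column, the plan changes to an index search.
con.execute("CREATE INDEX idx_files_ext ON files (ext)")
for row in con.execute("EXPLAIN QUERY PLAN " + query, ("dat",)):
    print("with index:", row[-1])  # e.g. "SEARCH files USING INDEX idx_files_ext (ext=?)"
```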

There are specialized tools for this, like WizTree.

First, collections are XML-backed afaik and thus not very performant. Then, right-clicking causes all context menu handlers to inspect all the selected files in turn (or something like that), which is going to take a looong time.

The filesystem is a hierarchical database. If you use it against its design, it will be slow, like a cascade deletion from a poorly designed relational DB.
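For a sense of why the flat listing takes so long, here's a rough Python sketch of what a "flat" view of a whole drive actually has to do: visit every directory in the hierarchy, one at a time. (The "P:/" path is just the drive from the posts above, and this is obviously not DO's or Everything's actual code; Everything is fast because it reads the NTFS index instead of walking the tree like this.)

```python
# Rough illustration: a flat listing of a drive is a full tree walk.
import os

def walk_flat(root: str):
    """Yield (path, size) for every file under root, one directory at a time."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)  # descend into it later
                    elif entry.is_file(follow_symlinks=False):
                        yield entry.path, entry.stat(follow_symlinks=False).st_size
        except OSError:
            continue  # skip folders we can't read

total_files = 0
total_bytes = 0
for path, size in walk_flat("P:/"):
    total_files += 1
    total_bytes += size
print(f"{total_files} files, {total_bytes / 2**30:.1f} GiB")
```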
