Folder Size Caching and Breadth-First Search

G'Day,

One thing I really like about dopus (and the feature that led me to it in the first place) is the folder size calculation. Unfortunately I have to have it turned off for the majority of my folders because there are just too many files in them, and sizing them takes way too long.

One way around this would be to cache the folder sizes somewhere after the initial run, and associate each folder's size with its last-modified date. Then on subsequent runs, instead of re-calculating it for every folder, just check the folder's last modified date and only recalc if it has changed.

Another feature I'd like to see is the option to calculate the folder sizes using a breadth-first search (instead of the current depth-first search). This would allow me to quickly compare the relative sizes of folders, as they'd all get done at once instead of having to wait for one particularly large folder (i.e. one with lots of files) before reaching smaller, later folders.
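To make the breadth-first idea concrete, here is a minimal Python sketch (not how Opus works internally; the function name `sizes_breadth_first` is made up for illustration). It walks the tree one level at a time, keeps a running total per top-level folder, and reports each folder as soon as it has no unscanned subfolders left, so shallow folders finish early. Error handling (e.g. unreadable folders) is left out, and files sitting directly in the root are ignored.

```python
import os
from collections import deque


def sizes_breadth_first(root):
    """Yield (folder_name, total_bytes) for each top-level folder under root,
    in the order the folders finish when the tree is walked breadth-first."""
    totals = {}    # running byte count per top-level folder
    pending = {}   # number of not-yet-listed directories per top-level folder
    queue = deque()
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                totals[entry.name], pending[entry.name] = 0, 1
                queue.append((entry.name, entry.path))
    while queue:
        top, path = queue.popleft()  # FIFO order gives the breadth-first walk
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending[top] += 1
                    queue.append((top, entry.path))
                elif entry.is_file(follow_symlinks=False):
                    totals[top] += entry.stat(follow_symlinks=False).st_size
        pending[top] -= 1
        if pending[top] == 0:  # nothing left to list under this top-level folder
            yield top, totals[top]
```

With a depth-first walk, a deep folder listed first would hold up everything after it; here a shallow folder is reported after a single pass over the first level, whatever order the folders are listed in.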

Cheers,
Sam.

Caching folder sizes, using the last modified time to know if they've changed, sounds like a great idea to me, although it's not quite as simple as it may first seem.

When a sub-folder is changed it doesn't bump the timestamps on the parent folder. This means Opus would have to cache the sizes of only the files in each folder, and would still have to scan every directory and subdirectory to see if their timestamps had changed.

Opus could know that it can skip reading a directory if its timestamp has not changed, but it would still have to check the timestamps of each sub-directory (which it would know about from its cache).
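As a rough Python sketch of that scheme (an illustration only, not Opus's actual implementation): each folder's cache entry holds its modified time, the total size of the files directly inside it, and the names of its subfolders. A folder whose timestamp is unchanged is not listed again, but its cached subfolders are still visited. The names `scan_dir` and `cached_size` and the `stats` counter are made up for this example. One assumption to flag: the sketch trusts the folder timestamp to reflect every size change, which may not hold when an existing file is rewritten in place.

```python
import os


def scan_dir(path):
    """List one folder: return (bytes in its own files, names of its subfolders)."""
    own_bytes, subdirs = 0, []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                own_bytes += entry.stat(follow_symlinks=False).st_size
    return own_bytes, subdirs


def cached_size(path, cache, stats):
    """Total size of the tree under path, re-listing only folders whose mtime changed.

    cache maps folder path -> (mtime_ns, own_bytes, subdirs) and is updated in place.
    stats["listed"] counts how many folders actually had to be listed.
    """
    mtime = os.stat(path).st_mtime_ns
    hit = cache.get(path)
    if hit is not None and hit[0] == mtime:
        # Timestamp unchanged: reuse the cached file total and subfolder names.
        _, own_bytes, subdirs = hit
    else:
        own_bytes, subdirs = scan_dir(path)
        cache[path] = (mtime, own_bytes, subdirs)
        stats["listed"] += 1
    # Every subfolder still needs its own timestamp check, cached or not.
    return own_bytes + sum(
        cached_size(os.path.join(path, name), cache, stats) for name in subdirs
    )
```

Calling `cached_size(root, cache, stats)` twice with the same `cache` dict should leave `stats["listed"]` unchanged on the second run if nothing was added or removed in between; a real version would also need to persist the cache and drop entries for deleted folders.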

This could still speed things up quite a lot, so I think it's worth investigating. (Whether GPSoft think the same is up to them of course!)

For what it's worth, folder timestamps don't work that way on Win9x/ME so the caching would only work for Win2k/XP users. (But using Win9x in this day and age should be illegal anyway. Ahem!)

If the caching were implemented it would still be useful to keep the current options about when folder sizes are displayed/calculated, since it's often good not to have relative folder sizes dominating relative file sizes.

Yeah, I realise that the parent folders' modified timestamps don't get updated when a subfolder is modified. I assume the NTFS Last-Accessed stamp only updates on the immediate parent folder too...

However, I find in most cases where it takes a prohibitively long time to calculate a folder's size, it's due to that folder having a large number of files, not a large number of subfolders. I think this is probably the case with most people's hard drives, and it means that you're going to save a heap of processor time, even if you have to check all the subfolders' modified stamps, by ignoring the files if none of them have changed.

Haha, I agree with you about win9x. I wonder if any dopus users actually still use it...?

And maybe an option to have different relative size scaling for folders and files would be handy too (different coloured bars for files and folders?).

Any comments on the breadth-first search idea?
I think this would be a useful feature (esp. in conjunction with caching) so you can quickly tell relative sizes of folders without having to wait to find out that Program Files is 10x bigger than any others (a fact you probably knew anyway).

-Sam.

I'm not sure if there is a way to scan just for subfolders (ignoring files). I think that processing a folder whose timestamp has changed will take the same amount of time as it does now. (Except, of course, processing some of its subfolders may be skipped if their timestamps are unchanged.)

The API used for finding the contents of folders returns the size of each item so, apart from adding up some numbers, scanning a folder is the same as calculating its size.

I still think it's worth investigating your idea, though. Cutting out the calculation for large but rarely changed folders could be a good improvement if everything pans out.

Breadth-first calculation might be nice, although I'm not personally too bothered either way. That said, I agree it can be a pain waiting for Program Files to calculate sometimes. :)

I found this: foldersize.sourceforge.net/ and decided that was a good enough excuse to revive this thread.

This program is a Windows Explorer plugin that now does exactly what I was talking about. Shame you can't use Explorer columns in dopus (can you?).

Anyway, I wonder if someone wants to make this into a dopus plugin or something?

Anyone?
C'mon, it's open source :)
(source: prdownloads.sourceforge.net/foldersize)

You can. It should be listed in the Special group of columns, at the bottom.

Ahah!

Good stuff, thanks :)

Now if only the authors can get their folder size column to sort properly...