I'm trying to make a column showing file/folder sizes, but I've run into some strange behavior: I can only get a file size for a soft link when it is inside a sub-folder, not when it's in the current folder.
For example, assume I have a folder with two identical files (different names given here for clarity) like this:
...
Folder/Soft-link1.txt
Soft-link2.txt
When I'm in the parent folder I can get the size of Folder, but not the size of Soft-link2.txt. When I enter Folder, I can't get the size of Soft-link1.txt.
I can't understand what's going on, as the item object seems identical in both cases.
This line looks wrong to me, although I don't know if it's causing the problem:
var fileSize = totalSize = DOpus.FSUtil.NewFileSize(0);
You would probably want to create a separate size object for each variable, rather than have both point to the same object. Unless you're doing something special there.
Is Folder a soft-link as well, or can it be summed up as: You can't get the size of soft-links in any folder?
If needed, you can use FSUtil.Resolve(item.RealPath,"j") to resolve junction and links to their targets. From there you could construct a second item out of the obtained path, and get the size of that, if the aim is to get the size of the target.
Nothing special, I just tried to shorten the script a bit and googled this as a way to assign identical values to multiple vars. I've changed it to one per line, but the issue is not resolved.
No, it's not, it's a regular folder.
I don't think so, since the folder enumeration method is able to produce an item I can get the size of. So it's only the first branch, if (!item.is_dir) {, that can't get the size of soft-links, i.e. when the soft-links are located in the folder I'm currently in; it works fine for soft-links in sub-folders.
Thanks, I will check this out. By the way, how is it done in the default Size column? There it treats a soft link just like a regular file. (I wish there were code for the default columns; it would make tweaking them much easier.)
Is there a way to limit this resolving to soft links only (and does it matter performance-wise)?
Do you think I should use the same trick for folders and folderEnum = DOpus.FSUtil.ReadDir(item, true);? Might there be some tricky issue that could bite me in the future?
Also, for my main script I'd like to be able to do the opposite as well: to ignore the sizes of soft links, including those located inside folders (which are currently counted). What would be the best way to do this in a script?
FYI, here is the updated test script:
function OnColumnsMain(scriptColData) {
    var ColName = "SizeTest";
    var fileSize = DOpus.FSUtil.NewFileSize(0);
    var totalSize = DOpus.FSUtil.NewFileSize(0);
    var item = scriptColData.item;
    Debug("item=scriptColData.item{item.size}= " + item + "{" + item.size + "}");
    if (!item.is_dir) {
        var fileItem = DOpus.FSUtil.GetItem(DOpus.FSUtil.Resolve(item.RealPath, "j"));
        fileSize = fileItem.size;
        if (scriptColData.columns.exists(ColName)) {
            scriptColData.columns(ColName).value = fileSize;
            Debug("File: fileItem {fileSize} = " + fileItem + " {" + fileSize + "}");
        }
    }
    if (item.is_dir) {
        var folderEnum = DOpus.FSUtil.ReadDir(item, true);
        while (!folderEnum.complete) {
            var folderItem = folderEnum.next;
            Debug("Folder: folderItem=folderEnum.next {folderItem.size} =" + folderItem + "{" + folderItem.size + "}");
            if (!folderItem.is_dir) {
                totalSize.Add(folderItem.size);
                Debug("Folder: totalSize.Add(folderItem.size) = " + totalSize);
            }
        }
        if (scriptColData.columns.exists(ColName)) {
            scriptColData.columns(ColName).value = totalSize;
            Debug("Folder: item {totalSize} = " + item + " {" + totalSize + "}");
        }
    }
}
I think ReadDir may do some extra processing of junctions/links, as a side-effect of other places it is used. We should probably make the other places that create Items consistent there.
You can test if an item is a junction or link like this:
var dir = DOpus.FSUtil.ReadDir("C:\\Users\\Leo\\Desktop\\New Folder");
while (!dir.complete)
{
    var item = dir.Next();
    var isJunctionOrLink = (item.attr & 1024) != 0;
    DOpus.Output(item + ": " + (isJunctionOrLink ? "is junction/link" : "is normal"));
}
1024 is FILE_ATTRIBUTE_REPARSE_POINT from the Windows file attribute constants. That isn't documented on the Opus manual's Item page, but we'll add it. We'll also update the FileAttr helper object to provide an easy-to-use named property for testing whether the attribute is there, instead of having to use 1024.
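The check itself is plain bitwise arithmetic (1024 is 0x400). As a small sketch, wrapping it in a named helper keeps the magic number out of the rest of the script until the named property exists:

```javascript
// FILE_ATTRIBUTE_REPARSE_POINT from the Windows file attribute constants.
var FILE_ATTRIBUTE_REPARSE_POINT = 1024; // 0x400

// Returns true if the attribute bitmask has the reparse-point bit set.
// Junctions and soft links both carry this attribute.
function isReparsePoint(attr) {
    return (attr & FILE_ATTRIBUTE_REPARSE_POINT) != 0;
}

// Example bitmasks: 1040 = 0x410 (reparse point + directory bits),
// 16 = 0x10 (directory bit only).
// isReparsePoint(1040) → true; isReparsePoint(16) → false
```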
Yeah, maybe ReadDir is doing something invisible to me (since the item name I see is identical; just one has a size property of zero).
Thanks for the junction check code!
I've got it working for "direct" softlinks, i.e. files and folders that I get directly as an item from the column or enumerated by FSUtil.ReadDir. However, ReadDir (in recursive mode) also enumerates files inside softlink folders, so I end up counting those. Is there a way to exclude those as well?
Below is the part of the script, for folders, that I'd like to exclude files/folders within softlink folders from (IsLinks is a variable that is false when I don't want to include softlinks in the folder size calculation):
if (item.is_dir && (IsLinks || !isJunctionOrLink)) { // folder, unless it's a junction/link and IsLinks is disabled
    var folderEnum = DOpus.FSUtil.ReadDir(item, IsRecursive);
    while (!folderEnum.complete) {
        var folderItem = folderEnum.next;
        if (IsLinks) { // include softlinks
            if (!folderItem.is_dir) {
                folderSize.Add(folderItem.size);
            }
        } else { // exclude softlinks
            var isSoftLink = (folderItem.attr & 1024) != 0;
            if (!folderItem.is_dir && !isSoftLink) {
                folderSize.Add(folderItem.size);
            }
        }
    }
}
That sounds complicated. Is there a way to expose the function that DOpus already uses in its Size column with the Ignore junctions and softlinks when calculating folder sizes option enabled?
I also had an idea that might be easier: check for softlinks only in the folder paths between my tab and each file in the FolderEnum (and don't add a file's size to the total if any folder on the way is a symlink).
For example, consider the following path inside a regular (not a softlink) folder C:\Tab
when I DOpus.FSUtil.ReadDir("C:\Tab", true) I get three items in the FolderEnum. The first item is excluded as it's a direct softlink itself, the second is excluded as it's not a file, but for the third item (C:\Tab\SoftlinkFolder\NormalFolder\file.txt) how would I "extract" all the paths between this file and my current tab C:\Tab (in this case it would only be two paths)?
C:\Tab\SoftlinkFolder
C:\Tab\SoftlinkFolder\NormalFolder
...so that I could check each of these and exclude the file, because the first check would return positive for a softlink.
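A sketch of that path "extraction", assuming plain backslash-separated Windows paths and that the file really is underneath the base folder (no DOpus-specific API used; the Path object may offer nicer ways to split components):

```javascript
// Returns every intermediate folder path strictly between basePath and filePath.
// Assumes filePath lies underneath basePath and both use backslash separators.
function intermediateFolders(basePath, filePath) {
    var rel = filePath.slice(basePath.length + 1); // strip "C:\Tab\" prefix
    var parts = rel.split("\\");
    var results = [];
    var current = basePath;
    // Walk every folder component except the final file name itself.
    for (var i = 0; i < parts.length - 1; i++) {
        current += "\\" + parts[i];
        results.push(current);
    }
    return results;
}

// intermediateFolders("C:\\Tab", "C:\\Tab\\SoftlinkFolder\\NormalFolder\\file.txt")
// → ["C:\\Tab\\SoftlinkFolder", "C:\\Tab\\SoftlinkFolder\\NormalFolder"]
```

Each returned path could then be tested with the attr & 1024 reparse-point check, stopping at the first positive.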
Doing the recursion yourself is really easy. Just move your code that handles each file/folder into a function (or two functions, if easier: one for files and one for folders), and call that on each file/folder, starting with the selected ones (or whatever your current starting point is).
You've already got code to call ReadDir once, and it's the same code to read a child dir (just change the recursive argument from true to false). You'd still only need one ReadDir call in the entire script.
In the next update we'll make the Item.size property correct for symlinks in all cases, and provide a way for FSUtil.ReadDir to skip over links when recursing.
I've done that and it's working (with the recursive option I am able to replicate the default size column output with the option Ignore junctions ... folder sizes both on and off).
However, it's way too slow to be useful: I have a test folder with ~4k folders and ~4k files, and after ~10 minutes this function would still not produce any output. Is it even possible to match the performance of a built-in column with a script?
function OnColumnsMain(scriptColData, IsRecursive, IsLinks) {
    ...
    var item = scriptColData.item;
    ...
    if (item.is_dir) {
        readFolderSize(item, IsRecursive, IsLinks);
        if (scriptColData.columns.exists(colName)) {
            scriptColData.columns(colName).sort = folderSize;
            scriptColData.columns(colName).value = folderSize;
        }
    ...
}

function readFolderSize(folder, IsRecursive, IsLinks) { //!!CHOKES on 4k folder list
    var isJunctionOrLink = (folder.attr & 1024) != 0;
    if (!IsLinks && isJunctionOrLink) { return; } // exclude softlinks when IsLinks is false
    var folderEnum = DOpus.FSUtil.ReadDir(folder, false);
    while (!folderEnum.complete) {
        var folderItem = folderEnum.next;
        var isSoftLink = (folderItem.attr & 1024) != 0;
        if (!folderItem.is_dir) { // file
            if (IsLinks) { // include softlinks
                if (isSoftLink) { mark = Script.config.LinkMark; }
                folderSize.Add(folderItem.size);
            } else if (!isSoftLink) { // exclude softlinks
                folderSize.Add(folderItem.size);
            }
        } else { // folder: recurse unless it's a softlink and IsLinks is off
            if (IsLinks || !isSoftLink) {
                readFolderSize(folderItem, IsRecursive, IsLinks);
            }
        }
    }
    return folderSize;
}
Thanks, that would be nice; any indication of when that beta might be available?
Though given the performance issues I'm having with this script (see my comment above), would it be possible to just expose the results of the default size columns (ideally with a scriptable toggle for the option Ignore junctions and softlinks when calculating folder sizes)?
Ultimately, I just need to apply different formatting to the information that's already there and show it side-by-side at the same time (hence the need for a scriptable toggle I can apply to a single column rather than a global option).
P.S.
While working on this script and modifying another script that adds internal support for the Everything service, I had another idea: given how blazingly fast Everything is, did you consider just querying the Everything service for folder sizes instead of doing the enumeration yourself? This would only work for sizes without following softlinks, but it would still be a huge improvement (currently, for root drives and other huge folders, I need a custom layout that disables the size column).
I haven't waited to the end; it slows down to about one folder per second near #1000 and gets even slower after that. DOpus CPU use is ~17% and memory is ~160 MB (this is with debug on, so I can see every folder listed in the log in real time). I don't know why the later iterations get slower.
I will leave it running to the end to see how long it actually takes to complete.
There are no links in the folder, just a bunch of folders created via multiple copy&paste commands.
Also, there doesn't seem to be any endless loop, as each folder in the loop (as seen in the log output) is unique and follows the list of folders, e.g. from
".e - Copy - Copy (4) - Copy - Copy - Copy - Copy" to
".f - Copy (22) - Copy - Copy - Copy - Copy" etc
By the way, how would I handle cyclic folder links correctly?
Here is the script with the debug commands left in. I'm seeing the slow performance at the Debug("readFolderSize.Fold|Rec: folderItem folder&&IsLinks iterate{folderSize} = \n" + iterate + "|" + folderItem + "{" + folderSize + "}"); debug command while testing a single column with the recursive option and follow-softlinks turned on.
It should not get slower & slower the more things there are, unless you're running low on RAM or the filesystem is being slow (e.g. network drives, drives which are spun-down, or antivirus causing problems).
Can we see the full script so we can try running the same thing?
What's the memory usage and available memory on the machine like?
Total RAM reservation is just 45% of 16 GB (DOpus is ~140 MB)
Total CPU use is ~30-40% (of which Opus is ~18%; the rest is an active backup process)
The filesystem is very fast (SSD); overall disk use is ~1% (same with network and GPU, all ~1%)
The antivirus doesn't slow down the same operation via other means (e.g. the default folder size column); the folders are empty, so there is nothing to check, and I only see the antivirus checking files being backed up in Resource Monitor, not the folders being enumerated in DOpus
Here is the script simplified to a single column but just as slow
I've also tested it by entering the folder (so there isn't even any recursion going on, just a bunch of empty folders passed in to produce zero output) and it seems to be just as slow (and also progressively slower after a few hundred items).
And this is the folder I'm testing with (reduced to 2k items from 4k; it slows down after a few hundred items, with very slow speed around 1.5k): Sorting Speed (2K folders).zip (457.8 KB)