Access to involved items in ScriptColumnData

When Opus calculates a script column value (for display, filters, etc.), does it already know whether other items are waiting to have the same value calculated? I imagine it does when the value is calculated for viewing, at least.

An idea that would greatly improve the time and logic used for the calculation (for columns where it's more efficient to get the value for multiple items at once) would be to add an items property to ScriptColumnData, populated with all related items.

The current workaround (using ScriptColumnData.tab to access tab items) doesn't work very well.

  • Sometimes the items list is incomplete.
  • Each time a column has to be calculated, you can't tell which items already had their value calculated previously (in another call, e.g. when expanding a folder).
  • It's not available when the value is used in filters, rename, etc.
  • You could use some kind of cache, either on disk (which, IMO, is much worse) or in memory, but that creates new problems like not being able to show recent changes automatically, since you wouldn't know when is a good time to clear it.

The script caching data it might need again makes the most sense, I think.

You can check cache vs file timestamps and sizes to detect if the cached data is still worth using/keeping.
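As a rough illustration of that check, here is a minimal sketch in Python (chosen only for brevity; it is not Opus script code). The class name `StatValidatedCache` and the `compute` callback are made up for this example. It only helps when the column value is actually derived from the file itself.

```python
import os


class StatValidatedCache:
    """Hypothetical helper: reuse a cached value only while the file's
    modification time and size are unchanged."""

    def __init__(self):
        # path -> (mtime_ns, size, value)
        self._entries = {}

    def get(self, path, compute):
        st = os.stat(path)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == st.st_mtime_ns and entry[1] == st.st_size:
            return entry[2]  # file looks unchanged, reuse the cached value
        value = compute(path)  # stale or missing entry: recalculate
        self._entries[path] = (st.st_mtime_ns, st.st_size, value)
        return value
```

The trade-off is one `stat` per lookup, which is normally far cheaper than recalculating the value.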

So it doesn't know?

Although that's a possible option, it's really not very practical. IMHO it makes development more complicated, and that partly works against the scripting system itself.

Also, in some cases (like this one in particular), the values don't depend on anything intrinsic to the file: they come from another source, so changes to the file are irrelevant, and it's faster to fetch them in batches.

Ok, continuing the discussion.

I’ve noticed the following behavior when populating columns in multi mode (seems to happen in single mode too?). Correct me if this is a configurable option I’m missing.

Assume two columns in multi mode.
It basically works like this, and it kinda contradicts what the manual says. First, the method is called for all related items — that’s for the first column. If you simply return nothing, the method gets called again for the second column (not the first one), but this time it only involves the items currently visible on screen. So you can’t use any workaround to build a list of related items on the first pass and then, with that list in hand, assign values to each item on the second call.

So another option you could consider (this one probably has more benefits) is adding a new property to ScriptColumnData that acts like a "promise" when set to true, so the method would be called again for that particular item. If that existed, you could generate the list on the first pass, request a new call for the desired items, and on the second call (since you requested it) you’d already have the list and could do whatever you need with it. That would work whether it’s one column or several.

Would something like that make sense?

Multi-column scripts will be called once with the list of all columns Opus wants them to fill in.

If they only fill in some of the columns, they'll be called again for the remaining ones.

Why can't the script do both in the first call? (Build the list of related items and fill in the requested data.)

One might think that's the case, since the manual kind of implies it. But if you run a test, you'll see it's not. As I said, the first time the script is called, it's invoked for all items; if you don't return values for some (or at all), it gets called a second time, but only for the visible items on screen.

Because you don't have a reliable way to tell beforehand which items need the values. You might think it's all the items in tab.all, but that won't always be true.
So, e.g., the script is called for the very first item and you need to get values in a batch, but you don't know what the next item will be, or the one after that. You also can't know which items already have values (unless you use some kind of cache, and I explained earlier why that's not a good idea in some cases and how it complicates writing the script). So I'm not sure what you mean by doing it all in the first call. I hope my point is clearer now.

Pretend I said "for the remaining ones, as required" then. These are implementation details and you shouldn't design the script around them, as they won't always be true anyway (e.g. populating columns for infotips as the user moves the mouse over random files, or as new files are added to the folder, or a multi-column sort where the script column is the 2nd one and only required when the 1st columns are the same for a pair of files, etc.).

The way columns work in Opus (and File Explorer for that matter) means the full list of files that might need filling in often isn't known at the time the script is asked about the first file(s). Scripts have to handle that situation one way or another.

It's pretty rare for column details for one file to depend on those for another, but if that's the case (and it takes a long time to calculate them), then I think you'd want either an in-memory cache for some amount of time after the data is calculated or an on-disk cache to store the details, else lots of things are still going to be slow (e.g. infotips).
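For the in-memory variant, a minimal sketch of a cache whose entries expire after a fixed time might look like this. It is written in Python purely as an illustration; `TtlCache` and its methods are hypothetical names, and how long entries should live is left to the script author.

```python
import time


class TtlCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (expires_at, value)

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it so it gets recalculated
            return default
        return value
```

A short lifetime keeps infotips and re-sorts fast right after a calculation, while still letting changed data show up again once entries expire.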

It's ok then. That's what I asked in the first place.
But consider adding an option to call the script for a given item a second time, so we can enumerate them if needed during the first call. And maybe add a way to indicate why the column is being calculated (viewing, filtering, infotip, etc.)

Sure. I'd expect that most of the time you need the related items because it’s faster to get their values in a batch rather than one by one, not because the values are interconnected somehow. E.g. fetching a value from Everything is 2–3x faster if you request results once or twice instead of calling it for every item.
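To make the batching idea concrete, here is a small sketch (Python, illustrative only). `get_value`, `candidates` and `fetch_batch` are hypothetical: `candidates` stands for whatever best-guess list of related items the script has, and `fetch_batch` for a bulk query that returns a dict of item to value.

```python
def get_value(item, candidates, cache, fetch_batch):
    """Return the value for item, filling the cache for every uncached
    candidate in a single bulk request when item is not cached yet."""
    if item not in cache:
        # The requested item first, then the other candidates, without duplicates.
        wanted = [c for c in dict.fromkeys([item, *candidates]) if c not in cache]
        fetched = fetch_batch(wanted)
        for key in wanted:
            # Store None for items the source did not return, so they are
            # not requested again on every call.
            cache[key] = fetched.get(key)
    return cache[item]
```

If the candidate list turns out to be incomplete, an item missing from it simply triggers one more bulk request when its turn comes, so the result stays correct and only the speed-up shrinks.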

Revisiting this.

To avoid issues like this, what if the script could return false when Opus calls it for an item? If it does, Opus would call the script again for that item. You could use that to build a list of items on the first call, return false, and then on the second call process all related items in bulk.
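Here is a toy simulation of how that could work, in Python. Nothing here is existing Opus behaviour: `run_host` is a stand-in for the proposed calling loop, and `BatchingColumn` and `fetch_batch` are hypothetical names for the script side.

```python
def run_host(items, on_column):
    """Stand-in for the proposed host loop: call the script once per item,
    then call it again for every item where it returned False."""
    values, deferred = {}, []
    for item in items:
        result = on_column(item)
        if result is False:
            deferred.append(item)   # script asked to be called again
        else:
            values[item] = result
    for item in deferred:
        values[item] = on_column(item)
    return values


class BatchingColumn:
    """Script side: collect items on the first call, resolve them in bulk
    on the first repeated call."""

    def __init__(self, fetch_batch):
        self._fetch_batch = fetch_batch
        self._pending = []
        self._results = {}

    def __call__(self, item):
        if item in self._results:
            return self._results[item]
        if item not in self._pending:
            self._pending.append(item)
            return False            # first call: just remember the item
        fetched = self._fetch_batch(self._pending)
        for key in self._pending:
            self._results[key] = fetched.get(key)
        self._pending = []
        return self._results[item]
```

With three items, the script is called six times but the bulk request happens only once, on the first repeated call.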