Reliable labels and comments

I'm trying to understand how the metadata (color labels, icon labels, and comments, a.k.a. descriptions) is stored, and how to store it reliably.

Please correct me if I mixed something up or missed something. I also want to understand which way is "better" from a user's subjective point of view.

* * *

Depending on user's preferences:

  • Labels can be stored (1) in an alternate data stream (ADS) or (2) in the internal Directory Opus database.

  • Comments can be stored (1) in an alternate data stream or (2) in descript.ion files.

Sadly, each way has too many flaws.

The flaws

What are the disadvantages of using descript.ion files for comments?

When you copy or move a file into another folder, the comment is not copied with it.

What is the disadvantage of using DO internal database for labels?

The first flaw is that such labels are not copied with the file, just like the comments described above.

The second flaw: suppose we have Foo.txt with a green color label and a "Checked" status. If we rename it to Foo1.txt, neither the green color label nor the "Checked" status is preserved.

Even worse, if we copy a different file with the same name (Foo.txt) from another directory into this one, it will automatically be colored green and assigned the "Checked" status. In 99% of cases we don't want this behaviour.
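Both flaws follow from keying labels on the path. Here is a minimal Python sketch of that idea. It is only an illustration and not Directory Opus's actual implementation; the `labels` dictionary and the `set_label` / `get_label` helpers are hypothetical names.

```python
# Hypothetical model of a path-keyed label store (an assumption for
# illustration; this is not how Directory Opus is actually implemented).
labels = {}  # maps a full path string to a label name


def set_label(path, label):
    """Remember a label for whatever lives at this path."""
    labels[path] = label


def get_label(path):
    """Return the label stored for this path, or None if there is none."""
    return labels.get(path)


set_label(r"C:\Test\Foo.txt", "Green")

# After a rename the file has a new path, and the store knows nothing
# about it, so the lookup finds no label.
label_after_rename = get_label(r"C:\Test\Foo1.txt")

# A different file that later appears at the old path picks up the old label.
label_for_new_file = get_label(r"C:\Test\Foo.txt")
```

In this model the label belongs to the string `C:\Test\Foo.txt`, not to the file, which is why a rename "loses" it and an unrelated Foo.txt "inherits" it.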

What is the disadvantage of using alternate data streams?

Flash drives are widely used. If I want to share a document with a friend and we have no internet connection (or the document contains very sensitive information), I will copy it to a flash drive and hand it over at our next meeting.

But when I get the flash drive back with the edited document, all alternate data streams will be lost, because the file went through FAT32 (a common default file system on flash drives), which does not support them.

Possible solutions and your own workflow?

What's your own way to store reliable labels and comments?

If Preferences / File Operations / Copy Attributes / Preserve the descriptions of copied files is turned on then the descript.ion comments will be copied/moved with files, provided you do the operation using Opus.

Note that this, as with most of the options on that page, may slow down file copying, especially when copying lots of small files/folders.


Another disadvantage to consider with ADS-stored metadata is it may be lost when the file is edited by other software. It depends on the software, and how they overwrite or replace the old version of a file with its new version, but it's something to keep in mind.


In general, you won't find a perfect system, since there is no way to store a comment for an arbitrary file type in a way which will be preserved no matter what is done to the file in other software, without the cooperation of the other software.

Some file types have built-in ways to store comments (e.g. ID3 tags in MP3s and EXIF tags in JPGs) which are usually (although not always) preserved by software that edits them, and are part of the main file data and thus always go where the file goes. But many don't (e.g. plain text files). Where they don't, you will always be left with a choice between compromises.

In general, you won't find a perfect system, since there is no way to store a comment for an arbitrary file type in a way which will be preserved no matter what is done to the file in other software, without the cooperation of the other software.

From my point of view, the best way would be to use the internal DO database for both labels and comments. But there are currently 2 problems:

  • The DO internal database doesn't store comments, only labels.

  • It works in a very strange way, as described in the "What is the disadvantage of using DO internal database for labels?" section of the original post. It would be nice to see more intelligent database behaviour.

The way the database works is by design. It allows people to tag a path rather than the particular file/folder at that path right now.

(For example, you might have a Current Work folder that you want highlighted so you can find it easier, and each month you rename it to May 2018 Work or similar and create a new Current Work folder.)

That said, we aren't against the option of it working the other way, which can also be useful. It depends what you're aiming to do. But the NTFS ADS method covers most of that situation already. (Until the files are moved off the machine, of course; but that problem affects both methods.)

The way the database works is by design. It allows people to tag a path rather than the particular file/folder at that path right now.

(For example, you might have a Current Work folder that you want highlighted so you can find it easier, and each month you rename it to May 2018 Work or similar and create a new Current Work folder.)

Yes, I understand. But this comes at the cost of the ability to reliably tag specific files and folders, and from my point of view that missing feature is much more important. It's your vision of what is more important (files and folders, or paths), but perhaps a survey among users could settle it. I believe the option to reliably tag specific files and folders instead of specific paths would win.

Until the files are moved off the machine, of course; but that problem affects both methods.

ADS are lost even when the user just copies files to a flash drive and back from the flash drive to an NTFS hard drive. So, take a labeled file from your PC, move it to a flash drive and edit it on another PC. Yes, you will not see the color label on the other PC, which is not a big deal. But you will not see it even after you copy the file back to your original PC.

Why not format your flash drives as NTFS?

Why not format your flash drives as NTFS?

Yes, it could be considered a workaround. But you need to remember to format them, and there are possible problems if you exchange files between PCs and Macs. Also, sometimes you may work with someone else's flash drives, and the owner will not be happy to see the file system changed.

So, yes, it works, but it's not convenient.

Also, as Leo pointed out, flash drives are not the only problem with ADS: the metadata may be lost when the file is edited by other software.

For me, labels (both color labels and icon labels) and comments are probably the most important thing in organizing files. It's important to know they are solid as a rock.

If you copy a file onto a flash drive and give it to someone else, they're not going to have any labels that are in a database on your machine. ADS is your only real option when moving between machines; unfortunately it's not perfect.

If you copy a file onto a flash drive and give it to someone else, they're not going to have any labels that are in a database on your machine.

Yes, I understand. I'm complaining that there is no really good way even on a single machine, because:

  • ADS could be accidentally removed by other software.

  • ADS will be lost if you copy a file from NTFS to FAT32 and then back to NTFS.

    • Not sure, but it could also be a problem for web backups. Upload a file to your SkyDrive account or GMail, and then download it back to your computer. I'm not sure the metadata will be kept. Edit: It depends on the site. Some sites remove ADS, some preserve it.

  • The internal database works strangely (from my own subjective point of view). It is optimized for tagging specific paths instead of specific files/folders.

Theoretically, there is a way to store all metadata reliably (on a single machine, of course). To achieve it, all metadata (color labels, icon labels, descriptions, tags, etc.) should be stored in the internal database, and the behaviour of this database should be changed from path-oriented (as Leo described) to file/folder-oriented.

As I understand it, there are currently no plans to change the behaviour of the internal database from path-oriented to file/folder-oriented?

Those issues are inherent to the situation. A database could not solve them either. If another program copies a file to somewhere else (potentially via a web archive, months later), Opus has no way of knowing that and updating a database.

I think you're looking for a solution to an impossible problem, to be honest.

A database could not solve them either.

At least the problem with accidental deletion of metadata by 3rd-party software would be solved, as I understand it. And this is the main problem. That's why I would be glad to see a change in the database behaviour.

I think you're looking for a solution to an impossible problem, to be honest.

I also thought today about a script-based solution. The idea is to use descript.ion files for comments and to store everything inside them. For example, the description could look like this:

[CL=Green, IL=Pinned] Here is general part of description

CL stands for color label, IL stands for icon label.
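To make the idea concrete, here is a minimal Python sketch of how such a description could be split into its label fields and its free text. It is not a Directory Opus script; the function name `parse_description` and the exact bracket syntax are assumptions based on the example line above.

```python
import re

# Assumed syntax: an optional "[KEY=Value, KEY=Value]" prefix, then free text.
PREFIX = re.compile(r"^\[(?P<fields>[^\]]*)\]\s*(?P<text>.*)$")


def parse_description(description):
    """Split '[CL=Green, IL=Pinned] text' into (fields dict, remaining text)."""
    match = PREFIX.match(description)
    if not match:
        # No bracketed prefix: the whole string is an ordinary comment.
        return {}, description
    fields = {}
    for item in match.group("fields").split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore items that are not KEY=Value pairs
            fields[key.strip()] = value.strip()
    return fields, match.group("text")


fields, text = parse_description(
    "[CL=Green, IL=Pinned] Here is general part of description"
)
```

Here `fields` holds the CL and IL values and `text` holds the rest of the comment, so a script could apply the labels and still show the plain description.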

Then the script would dynamically read this data from the descript.ion files and reapply it normally. The reapplying could be triggered manually or automatically.

The main problem, as I understand it (I have never worked with DO scripts), would be merging, copying and removing excess data across different descript.ion files (automatically, of course).

Simple example. We have:

  • Folder1 with files aaa.txt, bbb.txt and descript.ion

and

  • Folder2 with files foo.txt, bar.txt and descript.ion

And then we want to copy the file aaa.txt to Folder2. So the script should take the info about aaa.txt from the first descript.ion file and add it to the second. There will be many pitfalls like this. It's a lot of work, and I'm not sure it could be made really reliable.
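The copy step in this example could look roughly like the Python sketch below. It works on the text of the two descript.ion files rather than on real files, and it assumes the common layout of one entry per line (filename, a space, then the description, with names containing spaces wrapped in double quotes). The helper names and the sample descriptions are made up for illustration.

```python
def parse_descript_ion(text):
    """Parse descript.ion text into an ordered {filename: description} dict."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith('"'):
            end = line.find('"', 1)
            if end == -1:
                continue  # malformed quoted name, skip the line
            name, rest = line[1:end], line[end + 1:]
        else:
            name, _, rest = line.partition(" ")
        entries[name] = rest.strip()
    return entries


def format_descript_ion(entries):
    """Turn the dict back into descript.ion text, quoting names with spaces."""
    lines = []
    for name, description in entries.items():
        quoted = f'"{name}"' if " " in name else name
        lines.append(f"{quoted} {description}")
    return "\n".join(lines) + ("\n" if lines else "")


def copy_entry(source_text, target_text, filename):
    """Return the target text with the source entry for filename added."""
    source = parse_descript_ion(source_text)
    target = parse_descript_ion(target_text)
    if filename in source:
        target[filename] = source[filename]  # overwrites an existing entry
    return format_descript_ion(target)


folder1 = "aaa.txt [CL=Green, IL=Pinned] First file\nbbb.txt Second file\n"
folder2 = "foo.txt Some notes\nbar.txt [CL=Red] Needs review\n"
merged = copy_entry(folder1, folder2, "aaa.txt")
```

This sketch ignores several of the pitfalls mentioned above: filename case-insensitivity on Windows, moves (which would also need the source entry removed), and entries left behind for deleted files.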

Still here, still hoping this feature will be implemented. Hell, I don't want to pay for a Mac just to use Finder (it's a lot of money in my country). I hope one day DO will implement it (I would happily pay double the price in that case).

Sorry for repeating myself: I want the color labels to be stored inside the local database and bound to files/folders instead of the path. This is how XYplorer does it. From the above discussion:

The way the database works is by design. It allows people to tag a path rather than the particular file/folder at that path right now. -- Leo.

This design doesn't work for me. And alternate data streams aren't the way to go either: quite often they are erased by the programs that I use.

I don’t understand how a database can refer to a file or folder except by its path. Are you saying that if you label a file in XYplorer and then rename the file while XY isn’t running, the next time you start it, it will still show the file as labelled?

@Jon Let me explain. Here are our files and directories:

|-- Dir1\
|   |-- Foo.txt
|
|-- Dir2\
    |-- <empty>

XYplorer

Example 1: You have assigned a label to Foo.txt. Then:

  • you can move this file (from Dir1 to Dir2)
  • you can rename it (from Foo.txt to Bar.txt)
  • you can rename its directory (from Dir1 to Directory1)

Under all these circumstances the label will be preserved. The label is assigned to the path, but as long as you move/rename/copy/delete your files/directories within XY, this path is automatically updated. Actually, the XY labels database (tag.dat) is plain text, and here is how it looks:

Labels:
Red||FC7268;Orange||F6AB46;Yellow||EFDC4A;Green||B5D74A;Blue||5DA4FE;Purple||E29CDC;Grey||B5B5B5

Data:
C:\Test\Dir1\Foo.txt|4||||||||||||||||||

4 stands for green.

Then, if you move this file to Dir2, the record will be updated:

Data:
C:\Test\Dir2\Foo.txt|4||||||||||||||||||

(Tech note: tag.dat itself is not updated until you quit XY, so you need to quit to see the changes in the database. Until then, the changes are kept in RAM. This is the default behavior, but it can easily be changed.)
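The record update can be pictured with the short Python sketch below. It only mimics the two sample lines shown above and is not XYplorer's real code; the helper names are made up, the meaning of the remaining `|`-separated fields is not assumed, and the case-insensitive path comparison is my own assumption about Windows paths.

```python
# Label names taken from the "Labels:" line above; indexes are 1-based,
# so 4 is Green.
LABELS_LINE = ("Red||FC7268;Orange||F6AB46;Yellow||EFDC4A;Green||B5D74A;"
               "Blue||5DA4FE;Purple||E29CDC;Grey||B5B5B5")
label_names = [item.split("||")[0] for item in LABELS_LINE.split(";")]


def label_of(record):
    """Return (path, label name) from a 'path|index|...' record."""
    path, index = record.split("|")[:2]
    return path, label_names[int(index) - 1]


def move_record(record, old_path, new_path):
    """Rewrite the path part of a record when the file manager moves the file."""
    path, sep, rest = record.partition("|")
    if path.lower() == old_path.lower():  # assumed case-insensitive match
        return new_path + sep + rest
    return record


record = r"C:\Test\Dir1\Foo.txt|4||||||||||||||||||"
moved = move_record(record, r"C:\Test\Dir1\Foo.txt", r"C:\Test\Dir2\Foo.txt")
```

The label is still keyed on a path; it only survives because the program that performs the move also rewrites the key.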

So, it works very similarly to the color labels in Finder. A poor man's Finder.

Directory Opus

Unlike XY, the paths in the DO database don't get such automatic updates.

Example 2: You have assigned a label to Foo.txt. Then,

  • if you move this file (from Dir1 to Dir2)
  • if you rename it (from Foo.txt to Bar.txt)
  • if you rename its directory (from Dir1 to Directory1)

...the label will be lost. (Strictly speaking, it is not lost, because it is not removed from the database, but it loses its visual association with the file it was assigned to. I call it "lost" just for short.)

Hi,

As a practice for predictable labels (labels that are pre-defined and often used), I make heavy use of label assignments done automatically with regular expressions.
Concretely, it means I put specific keywords in the filename itself (generally in uppercase, set off with an underscore, at the beginning or end of the filename).
For example, I manage my project states like that (todo, ok, ko, done, wait, sent, etc.).
An advantage is that you can change the label assigned to a file very quickly just by renaming the file, and you don't have to bother with copying label information because it is inside the filename itself.
Of course, on the other hand, if you create lots of labels to give meaning to your files, this method will be too complex to use (as you will need to pre-assign a regular expression to each of your labels).
In my config I use roughly 10 to 15 labels with this behavior.

@Leo Are there any plans to implement it?

Not in the near future, at least. If more people request it, we might reconsider it, but most people seem satisfied with the two existing methods of setting labels, and the third method you propose has issues of its own which we are not sure would be worth it, on top of the complexity of another label system.

Hi, Antoine. I would appreciate seeing exactly how you have implemented this idea. Could you post the corresponding code? :slightly_smiling_face:

Hi john,

It's not code, it's just:

  • add as many "status" labels as I like (with colors, etc.)
  • then, for each status, I add a label assignment (to one of these statuses) with a regular expression like this one: Name match ^((DONE|OK).|.(DONE|OK)(.\w+))$
    In this example, if I just add OK or _DONE at the beginning of the filename, or _OK or _DONE at the end, it will apply my "OK" label to it.
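For anyone who wants to try the keyword idea outside Opus, here is a small Python sketch. The pattern is my own guess at the intent (keyword plus underscore at the start of the name, or underscore plus keyword just before an optional extension) and is not necessarily the exact expression used in the Opus label filter; `has_ok_status` is a hypothetical helper name.

```python
import re

# Assumed pattern: "OK_..." / "DONE_..." at the start of the name,
# or "..._OK" / "..._DONE" at the end, before an optional extension.
STATUS = re.compile(r"^(?:(?:DONE|OK)_.*|.*_(?:DONE|OK)(?:\.\w+)?)$")


def has_ok_status(filename):
    """Return True if the filename carries the OK/DONE keyword."""
    return STATUS.match(filename) is not None


names = ["OK_report.txt", "report_DONE.txt", "report_OK", "report.txt", "TOOK.txt"]
matches = [name for name in names if has_ok_status(name)]
```

The match here is case-sensitive, which fits the habit of writing the keywords in uppercase. Requiring the underscore keeps names such as TOOK.txt from matching by accident.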

I'm not sure it is so useful with many labels, but it's interesting with a bit of rigour in the keywords.

I also like the idea of having the keyword at the beginning, because I can see them sorted at the top of my folder content. And if I type the word OK in "filter mode / filter bar", it will easily filter my labels for me, as it is part of the filename.