DOpus is NOT Unicode at all in its file comment support (descript.ion)
And it is disastrous to use!
(1) If you have file comments entered in a non-English language, all the characters are shown as little squares, and if you do any copy/move, they are gone.
(2) If the file name is non-English, then the file comments are dropped (even if they are in English). If you have comments only for non-English file names, then you hit the jackpot: the descript.ion file is deleted.
I tried adding some Unicode comments to files, both with and without Unicode filenames as well, then copying and moving the files around. Everything seems to work fine to me, including on a Windows 2000 virtual PC.
Are you running the Unicode version of Opus? (The Unicode Opus will work fine on Windows 2000 and Unicode characters won't work in the non-Unicode version of Opus.)
Do you have the Preferences / Display / Colors & Fonts / File Display Font (in the "Item" drop-down list) set to a font which has the Unicode characters? (Arial seems to be a good one to try.) If not that would explain why you see squares. It isn't that the character has turned into a square; the font just doesn't have the symbol to display it.
Do you have Preferences / File Operations / Copying (1): Preserve the description of copied files turned on? If not then file descriptions will be lost when files are moved or copied.
If none of the above solves the problem, which character(s) are you trying with? I picked a random Arabic one but maybe there's something special about the characters you're trying. Let me know and I'll try them on my system.
[quote="nudel"]Are you running the Unicode version of Opus? (The Unicode Opus will work fine on Windows 2000 and Unicode characters won't work in the non-Unicode version of Opus.)
[/quote]
Yes, I am using the Unicode version on Chinese Windows 2000.
The font is a native one and displays Chinese characters well.
I do. English-only comments are preserved and copied.
After some thorough thinking, I believe the problem lies in how my descript.ion files are created.
I use ACDSee all the time as a general-purpose file manager. It is English-only and ages old, and I am looking for a replacement. File comments it creates are stored as-is, so the descriptions are stored in code page 950. That is why the comments are garbled. For comments on non-English file names, the file name part is also stored in code page 950, so DOpus doesn't find a match in the directory listing and the entries get removed.
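To illustrate the mismatch described above, here is a minimal Python sketch. It assumes one tool writes a line in code page 950 and another reads it as UTF-8; the file name and comment are made up, and this is not how either ACDSee or Opus is actually implemented.

```python
# Hypothetical descript.ion line: a Chinese file name followed by a Chinese comment.
line = "照片.jpg 家庭照片"

# Bytes a code page 950 (Big5) writer would store on disk.
raw_cp950 = line.encode("cp950")

# Bytes a UTF-8 writer would store for the same text.
raw_utf8 = line.encode("utf-8")

# The two byte sequences differ, so the stored name cannot match byte for byte.
same_bytes = raw_cp950 == raw_utf8

# Reading the code page 950 bytes as strict UTF-8 fails outright...
try:
    raw_cp950.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...and a lenient reader gets replacement characters instead of the real name.
lenient = raw_cp950.decode("utf-8", errors="replace")
name_survives = lenient.startswith("照片.jpg")

print(same_bytes, strict_ok, name_survives)
```

Decoding the same bytes with the code page they were written in (`raw_cp950.decode("cp950")`) gives the original text back, which is why the file looks fine to tools that assume the local code page.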
Though Win2k and XP use Unicode internally, text files are still mostly in localised code pages.
Localised code pages use DBCS, and such files can be edited with Notepad without any problem. (NOTEPAD can't read unicode text files!)
Even if I use MS Word (version 2000) and tell it to save a text file, it saves in my localised code page without asking. Some more thoroughly designed editors (like EmEditor) do ask which code page to save in, because they know there are going to be problems.
I am NOT trying to say that using localised code pages is the way to go or is better, but they are still all over the place. Simply assuming text files are Unicode is ... something the world is not yet ready for.
I was going to prepare a descript.ion file with DOpus and see how it looks in Notepad, but then I realised I have never tried adding a comment in DOpus, because I can't find the hotkey, nor an entry in the right-click menu.....
Anyway, if DOpus is supporting this decade-old mechanism for file commenting, it should avoid the problems that I am running into. Or, DOpus can revise the file commenting method and go all the way to Unicode. (Like using a different descript.ion name to avoid compatibility problems. Mm... actually it doesn't have to be compatible at all.
Or put in some marking so that DOpus, when reading a descript.ion, can detect whether it is DBCS or Unicode and won't handle it wrongly.)
Just some thoughts....
At least in XP Notepad will edit Unicode files just fine. (See screenshot.)
On the main issue, I guess it depends how other programs using descript.ion work, really. And maybe all the other programs are not consistent anyway so it's impossible to interoperate with all of them.
Opus appears to be using UTF-8 to write Unicode into descript.ion files. That means that all ASCII characters will still be single bytes, but the other characters, which will be two or more bytes long, may look like garbage to applications that don't understand UTF-8.
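A quick way to see the byte lengths for yourself is the small Python sketch below. The sample characters and file name are arbitrary picks for illustration, not anything taken from Opus.

```python
# One ASCII letter, one Arabic letter, one Chinese character.
samples = ["A", "ع", "中"]

# Number of bytes each character occupies once encoded as UTF-8.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    print(repr(ch), n, ch.encode("utf-8").hex())

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_name = "holiday.jpg"
identical = ascii_name.encode("ascii") == ascii_name.encode("utf-8")
print(identical)
```

So an ASCII-only descript.ion looks the same whether or not the writer used UTF-8; only lines with other characters differ.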
Presumably nothing goes wrong unless/until ACDSee modifies the file? If Opus is using a standard, or has invented one where there was no standard, perhaps you should ask the ACDSee authors if they could update their product to support it? Or if there is another standard which Opus should use, let us know.
Alternatively, Opus could write ASCII-only data to descript.ion and create a second file in its own format for any Unicode data... But that seems quite messy.
I think you're misunderstanding where the problem occurs. Go to a folder with no file descriptions and add some Unicode comments using Opus, then edit the descript.ion file. You'll see it is UTF-8 (i.e. you can read most of it even in an editor that only understands ASCII) and it has a marking at the top which says it's using UTF-8 encoding, just like the marker you suggest.
I believe that it is ACDSee or some other tool which is causing your file descriptions to go missing... Or perhaps ACDSee and Opus are using different formats and confusing each other. Opus in isolation works fine, so we have to work out how to make everything work together, but some of the things you're suggesting are already done...
No, I don't.
ACDSee treats descript.ion as a single-byte text file, an ASCII file. It is Windows that interprets the DBCS.
If you are saying that DOpus uses UTF-8, then voilà, DOpus should be able to distinguish between ASCII and UTF-8, as UTF-8 stores the first 127 characters in two bytes as well. (As your attached screenshot shows.)
ACDSee creates description files without those filler zeros.
No, it is DOpus that keeps deleting my descript.ion
I checked this with a command prompt window: the descript.ion goes away after DOpus opens the folder and closes it.
It is logical that DOpus reads in "FILENAME" "COMMENTS" lines, finds that the FILENAME doesn't match the Unicode file name from FindFirst, and drops that line entirely. When the list is empty, the file is deleted.
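A minimal Python sketch of that hypothesis follows. It is only a guess at the behaviour, not Opus's actual code; `prune_descriptions` and the sample names are made up for illustration, and the misread name is simulated by decoding code page 950 bytes as Latin-1.

```python
def prune_descriptions(entries, listing):
    """Keep only the description entries whose name appears in the directory listing."""
    names = set(listing)
    return {name: text for name, text in entries.items() if name in names}

# File names as the file system reports them (Unicode).
listing = ["照片.jpg", "readme.txt"]

# The same Chinese name as it might look after code page 950 bytes are
# read with the wrong encoding (Latin-1 here, purely as a stand-in).
misread_name = "照片.jpg".encode("cp950").decode("latin-1")

entries = {misread_name: "family photo", "readme.txt": "notes"}

# Only the ASCII-named entry still matches a real file.
kept = prune_descriptions(entries, listing)
print(kept)

# With comments only on non-ASCII names, nothing is left at all.
only_non_ascii = prune_descriptions({misread_name: "family photo"}, listing)
print(only_non_ascii)
```

If a tool then removes description files that have no entries left, that would match the symptom of descript.ion disappearing.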
All my existing descript.ion files were created with ACDSee, so it won't do anything harmful to them.
UTF-8 is not 2-bytes per character. My screenshot is of a UTF-16 registry file, just to prove that Notepad can edit Unicode files. A UTF-8 file looks like a normal 8-bit ASCII file, except where there are non-ASCII characters, and potentially right at the start where there may be a special marker to say it's a UTF-8 file.
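As a rough illustration, a file's leading bytes can be checked for such a marker. The Python sketch below assumes the marker is the standard byte order mark (BOM), which may or may not be exactly what Opus writes; `sniff_encoding` is a made-up helper name.

```python
import codecs

def sniff_encoding(data: bytes) -> str:
    """Guess the encoding from a leading byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    # No marker: could be ASCII, a local code page, or UTF-8 without a BOM.
    return "unknown"

# The same hypothetical line saved three different ways.
line = "照片.jpg 家庭照片"
with_utf8_bom = line.encode("utf-8-sig")
as_utf16 = line.encode("utf-16")
as_cp950 = line.encode("cp950")

print(sniff_encoding(with_utf8_bom))
print(sniff_encoding(as_utf16))
print(sniff_encoding(as_cp950))
```

A file with no marker stays ambiguous, which is the case for files written in a local code page.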
Please could you attach an example descript.ion file made by ACDSee which triggers the problem in Opus?