Lately I had some discussions with customers and providers about the encoding of txt files, because some of them prefer Unicode, some prefer UTF-8 for the APIs, and some don't mind at all as long as the file can be viewed in Notepad.
These people are a pain in the ass for me; I didn't care much about encoding until now.
The question is: is there any way in Dopus to know which encoding a txt file was saved with?
Does it have to do with the fact that sometimes the thumbnail view does not display the contents of a txt file?
The TextThumbs plugin should be able to display thumbnails for all of those types of files, provided they have a BOM (byte-order-mark) at the start of them, which is mandatory for UTF-16 text files (but not written by some programs, unfortunately).
At least, that's how I remember it working. I would have to look at the source code to be completely sure.
Can you send me a couple of files which don't show up as thumbnails?
I'm attaching a rar file with two txt files. One is a file I produced somehow (I don't remember how) and the other is the result of the usual "create new text file" command in the right-click menu. Neither shows up in Thumbnails view.
UEdit-32 says both of them are "DOS" files, which, after reading through its documentation, means that they are ASCII files, neither Unicode nor UTF-8.
Well, after reading this program's help I'm even more confused about this stuff. I'm tempted to check UEdit-32's "use Unicode as default for new files" option, but I don't know if that is the wisest solution.
juegos.txt is a UTF-16 file but has no BOM at the start, so it looks like a binary file to the TextThumbs plugin (and the Opus Text viewer). If you load it into Notepad and re-save it then the new version will appear correctly in Opus since Notepad writes the BOM. (You can see the file grows by 2 bytes.)
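As a rough illustration of the BOM check described above, here is a minimal Python sketch. It is not the TextThumbs source; the function names `detect_bom` and `add_utf16le_bom` are made up for this example, and it assumes the file is little-endian UTF-16, which is what Notepad writes when saving as "Unicode".

```python
import codecs

# Byte-order marks this sketch knows about (UTF-32 is deliberately left out).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

def add_utf16le_bom(data: bytes) -> bytes:
    """Prepend a UTF-16 LE BOM unless the data already starts with a known BOM."""
    if detect_bom(data) is not None:
        return data
    return codecs.BOM_UTF16_LE + data

# Hypothetical stand-in for the contents of a BOM-less UTF-16 file.
raw = "juegos".encode("utf-16-le")
fixed = add_utf16le_bom(raw)
print(detect_bom(raw), detect_bom(fixed), len(fixed) - len(raw))
```

The last value printed is the size difference between the two versions, which corresponds to the 2 extra bytes you see after re-saving in Notepad.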
The empty text file shows as an empty text-thumb for me. The TextThumbs plugin cannot tell that an empty file is a text file by looking at the contents (of course), so it looks at the PerceivedType value in the registry for the extension in question.
Have a look under HKEY_CLASSES_ROOT\.txt where there is usually a PerceivedType string value set to text -- maybe this is missing from your registry?
The PerceivedType registry setting is only ever looked for when TextThumbs is deciding whether to handle an empty file.
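If you would rather check the value from a script than from regedit, something like the following Python sketch should work. It assumes the key in question is `HKEY_CLASSES_ROOT\.txt`; the helper name `perceived_type` is made up for this example, and it simply returns `None` when the key or value is missing (or when it is not run on Windows).

```python
import sys

def perceived_type(extension: str = ".txt"):
    """Return the PerceivedType string for an extension, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg
    try:
        # Open HKEY_CLASSES_ROOT\<extension> and read its PerceivedType value.
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, extension) as key:
            value, _value_type = winreg.QueryValueEx(key, "PerceivedType")
            return value
    except OSError:
        return None  # the key or the value is missing

print(perceived_type(".txt"))
```

If this prints `None` on a Windows machine, the value is probably missing and would explain the empty file not getting a thumbnail.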
At a glance, it is interesting to see how 16-bit encoding takes double the space, and how a UTF-8 file without a BOM can be seen in the thumbnail, while a UTF-8 file with a BOM can't be seen in the thumbnail but can be seen in the Opus viewer.
Unless I messed up the files (attached), this does not agree with what you said about the BOM.
If I am not wrong, then my suggestions for development would be:
Thumbnail preview for as many encodings as possible
Or at least, some consistency between the thumbnail plugin and the viewer.
What do you think?
P.S.: What a pain to make this look like a table
[quote="artema"]
Encoding               Size   Thumbnail   Viewer
UTF-8                  15     No          Yes
UTF-16                 30     Yes         Yes
UTF-8NoBOM             15     Yes         Yes
UTF-16NoBOM            30     No          Hexadecimal
UnicodeASCII-Escaped   15     Yes         Yes
[/quote]
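The doubling of file size with 16-bit encoding mentioned earlier is easy to reproduce. The Python sketch below encodes one short, made-up sample string in four ways and prints the byte counts. The sample text and the labels are assumptions for illustration only, not the contents of the attached files.

```python
import codecs

text = "juegos\r\n"  # hypothetical sample content, 8 characters

variants = {
    "UTF-8 with BOM": text.encode("utf-8-sig"),  # 3-byte BOM + 1 byte per ASCII char
    "UTF-8 no BOM": text.encode("utf-8"),
    "UTF-16 with BOM": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
    "UTF-16 no BOM": text.encode("utf-16-le"),   # 2 bytes per character here
}

sizes = {name: len(data) for name, data in variants.items()}
for name, size in sizes.items():
    print(f"{name:16} {size:3} bytes")
```

For this sample the sizes come out as 11, 8, 18 and 16 bytes respectively, so the BOM adds 3 bytes to UTF-8 and 2 bytes to UTF-16, and plain ASCII text takes twice the space in UTF-16.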
Thanks for testing the different cases! I haven't looked into this further but I guess I need to make the TextThumbs plugin understand UTF-8 BOMs. I'll make the change when I next do some work on the plugin, which should be in the next few weeks.