Column: FileInfo (number of lines, EOL line-endings, encoding of text files, more..)

tbone · August 18, 2014, 1:39am

Introduction:
A set of columns providing information about files.

Columns:
Lines - number of lines
Rating - rating (stars) of items
EOL - line endings (UNIX, PC etc.)
Encoding - file encoding (UTF8-BOM, UTF16 etc.)
FirstLine - first line of text/ascii files

Notes:
Add your own file extensions in case they are not pre-configured yet in the script config section.
Original idea by Kundal (and a button version as well): Text file line count
This version adds a script config, some extensions, handles empty files (0 bytes -> 0 lines), is case-insensitive and is 33% faster (noticeable for bigger files).
Encoding detection is stolen and reshaped from user Qiuqiu (Column text file encoding).
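
For illustration only, here is a minimal sketch of how a line count with the "0 bytes -> 0 lines" behaviour could look. It is not taken from the script itself: the function name countLines and the decision to count a final line without a trailing line break are assumptions.

```javascript
// Hypothetical helper, not the script's actual code.
// 'text' is the already decoded content of a text file.
function countLines(text) {
    // An empty (0 byte) file has 0 lines.
    if (text.length === 0) {
        return 0;
    }
    // Count CRLF, lone CR and lone LF as one line break each.
    var breaks = text.match(/\r\n|\r|\n/g);
    var count = breaks ? breaks.length : 0;
    // Assumption: a last line without a trailing line break still counts.
    var last = text.charAt(text.length - 1);
    if (last !== "\n" && last !== "\r") {
        count++;
    }
    return count;
}
```

With this convention, "a\nb" and "a\nb\n" both count as 2 lines.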

Installation:
To install the column, download the *.js.txt file below and drag it to Preferences / Toolbars / Scripts.
After that, right-click any lister's column header and add the columns from the "Scripts -> FileInfo" submenu.

Download:

Latest:
v1.3.3 / 2017.11.20 - new column Rating, fix for uppercase file extensions being ignored
Column.File_FileInfo.js.txt (21.4 KB)

v1.3.1 / 2016.12.07 - new column FirstLine
Column.File_FileInfo.js.txt (16 KB)

tbone · July 8, 2016, 8:23pm

Updated to v1.2:

converted to js
added EOL column to detect line endings (quite a basic implementation)
added Encoding column, core logic stolen from user qiuqiu (Column text file encoding)
added qiuqiu's list of file extensions and some others as well
native url (data.url) added
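
A basic line-ending check of the kind described for the EOL column could be sketched as follows. This is an assumption about the approach, not the script's code: the function name detectEol and the labels "MAC" and "MIXED" are made up for the example ("UNIX" and "PC" follow the column description above).

```javascript
// Hypothetical helper, not the script's actual code.
// 'text' is the already decoded content of a text file.
function detectEol(text) {
    var crlf = (text.match(/\r\n/g) || []).length;
    // Lone LF and lone CR are what remains after removing the CRLF pairs.
    var lf = (text.match(/\n/g) || []).length - crlf;
    var cr = (text.match(/\r/g) || []).length - crlf;

    var found = [];
    if (crlf > 0) { found.push("PC"); }    // CRLF
    if (lf > 0)   { found.push("UNIX"); }  // LF only
    if (cr > 0)   { found.push("MAC"); }   // CR only (assumed label)

    if (found.length === 0) {
        return "";                         // no line break at all
    }
    return found.length === 1 ? found[0] : "MIXED";
}
```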

playful · July 14, 2016, 11:34pm

Great idea and great stuff!

Man, you and others keep coming up with great script ideas.
I haven't kept up with all the new creations since writing the pages about scripts and the file naming convention thingy on dearOpus. Need to look at all the new goodies and update, will try to do that soon.

Thanks for all the great scripts. I use ListerDoubleClick constantly.

playful · July 15, 2016, 9:21am

I should report that a file that EditPad Pro sees as UTF-8 / No BOM shows up with a question mark.

Also: added FileInfo to that very page. Need to make time to explore all the other cool stuff.

tbone · July 15, 2016, 12:25pm

Thanks! o) That's currently by design, since properly detecting UTF-8 without a BOM always involves analysing the file content, plus a good portion of guessing and assuming the right thing. The Encoding column currently doesn't use very sophisticated methods to detect the encoding, it just checks for existing BOMs. But: if you look at the source, you'll find a "todo" entry to eventually add the "magic" that EditPad Pro and others use to also detect UTF-8 "correctly" for files without a BOM. I can't say when this will happen though. DO does support analysing any kind of file data, so.. o)
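
To make "just checks for existing BOMs" concrete, here is a minimal sketch of such a check. It is not the script's code: the function name detectBom, the label strings and the "?" fallback are assumptions for the example. The byte signatures themselves are the standard Unicode BOMs.

```javascript
// Hypothetical helper, not the script's actual code.
// 'bytes' is an array holding the first bytes of the file as numbers (0-255).
function detectBom(bytes) {
    var boms = [
        // Longer signatures first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
        { name: "UTF32-LE-BOM", sig: [0xFF, 0xFE, 0x00, 0x00] },
        { name: "UTF32-BE-BOM", sig: [0x00, 0x00, 0xFE, 0xFF] },
        { name: "UTF8-BOM",     sig: [0xEF, 0xBB, 0xBF] },
        { name: "UTF16-LE-BOM", sig: [0xFF, 0xFE] },
        { name: "UTF16-BE-BOM", sig: [0xFE, 0xFF] }
    ];
    for (var i = 0; i < boms.length; i++) {
        var sig = boms[i].sig;
        if (bytes.length < sig.length) {
            continue;
        }
        var match = true;
        for (var j = 0; j < sig.length; j++) {
            if (bytes[j] !== sig[j]) {
                match = false;
                break;
            }
        }
        if (match) {
            return boms[i].name;
        }
    }
    // No BOM found: the encoding is unknown without analysing the content.
    return "?";
}
```

A UTF-8 file saved without a BOM falls through to the last line, which is why it shows up as a question mark.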

playful · July 15, 2016, 12:42pm

Makes complete sense, thank you for explaining.

qiuqiu · July 15, 2016, 12:45pm

Without a BOM, you can only guess.

tbone · December 7, 2016, 5:35pm

Updated to v1.3.1:

added another column "FirstLine"

EDIT: Correction for column name.

tbone · November 20, 2017, 7:16pm

Updated to v1.3.3:

added "R" rating column
fix for uppercase file extension being ignored
XLog added
.

@Sophokles
There you go! o)
Thanks for letting me know! o)

Leo · November 20, 2017, 7:55pm

What's an R rating in terms of text files? I haven't heard of that before.

tbone · November 20, 2017, 9:06pm

Now that you ask, I probably should edit some more text in the initial posting. It's an attempt to show exactly what the regular Rating column shows, just as plain text/numbers (5 stars = "5"). I'm not sure yet that this works 100%, as my impression is that there is no way to get the very same rating via scripting? The column is also not restricted to text files. Misleading description, will fix! o)

Leo · November 20, 2017, 9:44pm

OtherMeta.rating should give you the rating in numeric format.

tbone · November 20, 2017, 10:19pm

Yes, I think that's what's used right now. But a past investigation showed that the rating can be in meta.doc as well for the same type of files, while meta.other is null/undefined. So the question is, what "chain" does DO run through to determine the rating for the Rating column? Is it meta.other first, then meta.doc, or the other way round? What about ratings in meta.audio etc.? You know what I mean? I can probably check them all, but I think the determined rating can still differ from what DO shows in the native Rating column. Maybe I'm wrong here. I remember it being difficult to get the same rating values. It's been some months since then.. don't mind me talking bs. o)

Leo · November 20, 2017, 10:37pm

Not sure, but if it is doing that then the main Metadata object's string value will tell you which sub-type to go for, which is probably the best bet to pick up the rating from.
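
The suggestion above could be sketched like this. It is only an illustration under assumptions: 'meta' is a plain object standing in for the Metadata object, 'primaryType' stands in for its string value, and the fallback order after the primary type is a guess, since the thread does not establish which chain DO itself uses.

```javascript
// Hypothetical helper, not the script's actual code.
// meta:        plain object with optional sub-objects (other, doc, audio, ...)
// primaryType: name of the sub-type that should be tried first
function pickRating(meta, primaryType) {
    // Try the primary sub-type first, then fall back to the others.
    // The fallback order is an assumption, not documented DO behaviour.
    var order = [primaryType, "other", "doc", "audio", "image", "video"];
    for (var i = 0; i < order.length; i++) {
        var sub = meta[order[i]];
        if (sub && typeof sub.rating === "number") {
            return sub.rating;
        }
    }
    return 0; // no rating found in any sub-type
}
```

For example, pickRating({ other: null, doc: { rating: 4 } }, "doc") picks up the rating from the doc sub-type even though the other sub-type is null.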

Rated_RR · January 7, 2020, 1:52pm

@tbone - the FirstLine column seems not to work.

tbone · January 7, 2020, 3:07pm

What file extension/encoding is affected?
It works here, although I can see some issues with BOMs and UTF-8/UTF-16.

opw62 · July 28, 2024, 2:26pm

Allow me to revive this old thread.
I have searched and searched the Internet for a tool that lists the text encoding. In vain.
I stumbled over this script. Looks wonderful! Really. However, in many cases the column remains empty (extension .srt) or shows a question mark (.txt). When I open the files, they appear to be Western 1252 or UTF-8 without a BOM.

Right now I need to open each and every file to see the encoding.

Not sure if it is possible to extend the script so that it handles .srt files (or any text-like file, such as .ini) and also reports 'no BOM' and Western 1252?

Thanks!


Leo · July 28, 2024, 2:37pm

The built-in Description column will report text types when there is a BOM.

When there isn't a BOM, there isn't any reliable way to know a text file's encoding. You can guess, and test if something uses characters that seem outside what would normally be used in a given encoding (which can only tell you it probably isn't that encoding, not what encoding it actually is), but it's guesswork and goes wrong all the time, unless you know something about the text files to weight the probabilities (like "they all came from Windows and only people who wrote in Greek").
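
As a rough illustration of that kind of guess, the sketch below checks whether a byte sequence is structurally valid UTF-8. It is an assumption-laden example, not a reliable detector: the function name guessUtf8 and its return labels are made up, the check is simplified (it does not reject overlong forms or surrogates), and a "not-utf8" result only says the bytes are probably not UTF-8, not which encoding they are.

```javascript
// Hypothetical helper for illustration only.
// 'bytes' is an array holding the complete file content as numbers (0-255).
function guessUtf8(bytes) {
    var i = 0;
    var sawMultibyte = false;
    while (i < bytes.length) {
        var b = bytes[i];
        var extra;
        if (b <= 0x7F) {            // plain ASCII byte
            i += 1;
            continue;
        }
        if (b >= 0xC2 && b <= 0xDF) {
            extra = 1;              // lead byte of a 2-byte sequence
        } else if (b >= 0xE0 && b <= 0xEF) {
            extra = 2;              // lead byte of a 3-byte sequence
        } else if (b >= 0xF0 && b <= 0xF4) {
            extra = 3;              // lead byte of a 4-byte sequence
        } else {
            return "not-utf8";      // byte cannot start a UTF-8 sequence
        }
        if (i + extra >= bytes.length) {
            return "not-utf8";      // sequence is cut off at the end
        }
        for (var k = 1; k <= extra; k++) {
            // Continuation bytes must look like 10xxxxxx.
            if ((bytes[i + k] & 0xC0) !== 0x80) {
                return "not-utf8";
            }
        }
        sawMultibyte = true;
        i += extra + 1;
    }
    // Pure ASCII is valid in UTF-8 and Windows-1252 alike, so it stays ambiguous.
    return sawMultibyte ? "utf8-likely" : "ascii";
}
```

The word "café" saved as Windows-1252 ends in the single byte 0xE9 and yields "not-utf8", while the UTF-8 form ends in 0xC3 0xA9 and yields "utf8-likely". If only the first part of a file is passed in, a multi-byte sequence cut off at the end would wrongly be reported as "not-utf8".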

opw62 · July 28, 2024, 2:53pm

Thanks. Bad luck then.

Coincidentally, I just stumbled over a tool named "EncodingChecker v2.0" (on GitHub).
The interface looks nice, but unfortunately I can't find a downloadable file, and since the site is aimed at developers, I'm afraid I can't make heads or tails of it.
Anyway, thanks again.

PolarGoose · January 2, 2025, 10:31am

@opw62,

I have created a plugin that shows the file encodings:
File MIME type column
It should work for your use case. It doesn't rely on the file extension and uses the file content to determine the encoding. It doesn't need a BOM either.