File MIME type column

The plugin shows the MIME type of the file based on its content and not its extension.
DirectoryOpus-FileMimeTypeColumn-plugin

Limitations

  • The plugin requires the file.exe utility to be installed.
  • Unfortunately, this plugin is very slow for folders with a lot of files.

Update:

Thanks!

I was looking for a way to grab this kind of info. I did not know about the file util until now.

Some feedback/suggestions that might help improve this:

  • It seems you can process multiple files if you pass them in a text file as a list, using the --files-from argument. Just ensure you don’t do that if there’s no tab object present.
  • Even faster could be caching the retrieved info (e.g., in an ADS in the same file).
  • Opus lets a multi-column script return several values at once. You’d only need to run file.exe with --mime once, then parse the output to populate both columns.
  • If Git isn’t installed, you need to set the magic file manually (because the MAGIC environment variable isn’t set). Adding a new entry in the script config to define the magic file’s location would be very handy. This way, you can use the columns even in a portable setup.
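
To make the batching and single --mime call ideas a bit more concrete, here is a rough sketch in Python (not the plugin's actual code, which is an Opus script). The helper names are made up for illustration, the sample output is hand-written in the usual `name: type; charset=...` shape, and the text encoding file.exe expects for the list file on Windows is an assumption you would need to check.

```python
import os
import tempfile
from typing import Dict, List, Optional, Tuple


def write_list_file(paths: List[str]) -> str:
    """Write one path per line to a temp file for --files-from (hypothetical helper)."""
    fd, name = tempfile.mkstemp(suffix=".txt")
    # Assumption: UTF-8 is acceptable to file.exe; verify for your build.
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write("\n".join(paths) + "\n")
    return name


def build_command(list_file: str, magic_file: Optional[str] = None,
                  exe: str = "file.exe") -> List[str]:
    """Build the argument list for a single batched file.exe call."""
    cmd = [exe, "--mime"]
    if magic_file:
        # Explicit magic file, useful when no MAGIC environment variable is set.
        cmd += ["--magic-file", magic_file]
    # --files-from is placed last as a precaution, so the options above
    # are already in effect when the list is read.
    cmd += ["--files-from", list_file]
    return cmd


def parse_mime_output(output: str) -> Dict[str, Tuple[str, str]]:
    """Map each path to (mime_type, charset) from `file --mime` output."""
    results: Dict[str, Tuple[str, str]] = {}
    for line in output.splitlines():
        # Windows file names cannot contain ':', so the first ": " ends the name.
        path, sep, description = line.partition(": ")
        if not sep:
            continue
        mime, _, params = description.partition(";")
        params = params.strip()
        charset = params[len("charset="):] if params.startswith("charset=") else ""
        results[path] = (mime.strip(), charset)
    return results


# Hand-written sample, not real file.exe output.
sample_output = (
    "C:\\docs\\a.txt:   text/plain; charset=utf-8\n"
    "C:\\docs\\b.bin:   application/octet-stream; charset=binary\n"
)
parsed = parse_mime_output(sample_output)
```

One parse gives both values per file, so the MIME type column and the encoding column can be filled from the same result.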

@errante,

Thank you very much for the detailed feedback.

  • Opus lets a multi-column script return several values at once. You’d only need to run file.exe with --mime once, then parse the output to populate both columns.

Good idea.
The problem is that Opus will still call the column method once for each column, so I have to cache the values and skip the second call when it is for the same file.
But I think it is easily doable. It will save 1 file.exe call per file.
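
Roughly what I have in mind, sketched in Python rather than in Opus script code. `LastFileMemo` and `fake_probe` are illustrative names only, and the probe here stands in for the real file.exe call.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[str, str]  # (mime_type, encoding)


class LastFileMemo:
    """Remember only the most recent file's result.

    Two consecutive column calls for the same path then share one probe.
    """

    def __init__(self, probe: Callable[[str], Result]) -> None:
        self._probe = probe
        self._path: Optional[str] = None
        self._result: Optional[Result] = None

    def get(self, path: str) -> Result:
        if path != self._path or self._result is None:
            # Different file than last time: run the expensive probe again.
            self._result = self._probe(path)
            self._path = path
        return self._result


calls: List[str] = []


def fake_probe(path: str) -> Result:
    """Stand-in for launching file.exe; records each invocation."""
    calls.append(path)
    return ("text/plain", "utf-8")


memo = LastFileMemo(fake_probe)
mime_column = memo.get("a.txt")[0]      # first column call runs the probe
encoding_column = memo.get("a.txt")[1]  # second column call reuses the result
```

Keeping only one entry avoids the invalidation question entirely, at the cost of helping only when the two column calls for a file arrive back to back.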

  • If Git isn’t installed, you need to set the magic file manually (because the MAGIC environment variable isn’t set). Adding a new entry in the script config to define the magic file’s location would be very handy. This way, you can use the columns even in a portable setup.

Yes, indeed, I will add an option to specify the magic file location.

  • It seems you can process multiple files if you pass them in a text file as a list, using the --files-from argument. Just ensure you don’t do that if there’s no tab object present.
  • Even faster could be caching the retrieved info (e.g., in an ADS in the same file).

Yes, it would be ideal to just call file.exe for all files in the folder. However, I don't know what implications it will have. For example, if the folder is really big, it might take ages for file.exe to execute.
Caching is tricky: it would make the script much more complex and bring cache invalidation problems.
It would be ideal if we could load file.exe as a library inside the process and call it directly, similar to P/Invoke in C#. That would be much faster than repeatedly launching it from the command line.
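
To show what the invalidation part would involve, here is a minimal Python sketch of one common approach: treat a cached entry as stale whenever the file's modification time or size changes. This is only an illustration with made-up names. It keeps the cache in memory rather than in an ADS, and it would miss an edit that preserves both the size and the timestamp.

```python
import os
from typing import Callable, Dict, Tuple

Stamp = Tuple[int, int]  # (modification time in ns, size in bytes)


class StatKeyedCache:
    """Cache probe results per path, invalidated by mtime or size changes."""

    def __init__(self, probe: Callable[[str], str]) -> None:
        self._probe = probe
        self._entries: Dict[str, Tuple[Stamp, str]] = {}

    def get(self, path: str) -> str:
        info = os.stat(path)
        stamp = (info.st_mtime_ns, info.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            # File looks unchanged since the last probe: reuse the result.
            return cached[1]
        # New or modified file: probe again and remember the new stamp.
        result = self._probe(path)
        self._entries[path] = (stamp, result)
        return result
```

The extra os.stat call per file is cheap compared with starting a new process, but the bookkeeping is exactly the added complexity I am worried about.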

Let me fix the low-hanging fruit first, then we can see what can be done next.

Not 100% sure about this, since I only discovered it myself a few weeks ago. If you "process" all the columns in the scriptColData.columns map (e.g. assign them a value, even if empty) during the first call, it shouldn’t get called again for the other columns. Worst case scenario, we’re only dealing with two columns—the data from file.exe is already there; you just need to populate the values where appropriate.

It will take even longer if you call it per file :)

Indeed. Not many people realize that, unfortunately.
It’s great to see others sharing their work here since this space feels a bit empty at times.
I’m not an expert coder by any means, but I can hold my own with Opus scripting. If you need any help, don’t hesitate to reach out.
Best regards.

I have updated the script.

  • Improved the performance - call file.exe only one time for each file
  • Added fileExeMagicFileFullName and ignoreBinaryEncoding configuration parameters
  • Added an instruction on how to make a portable version of file.exe

@errante,

Not 100% sure about this, since I only discovered it myself a few weeks ago. If you "process" all the columns in the scriptColData.columns map (e.g. assign them a value, even if empty) during the first call, it shouldn’t get called again for the other columns.

Yes, indeed, you are right. I have implemented it in my script.

It will take even longer if you call it per file

This is true, of course. I need to think about it.

It’s great to see others sharing their work here since this space feels a bit empty at times.

I think it is also because DOpus is a very mature product, and the most interesting things that you can do with plugins are already integrated into the main program.

Overall, @errante, thanks a lot for your help.


I have updated the script:

  • Print utf-8 bom if the file has a BOM.
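
For readers curious what a BOM check amounts to, here is a small Python sketch. It is an assumption about the general technique, not a copy of the script: it reads the first three bytes and compares them with the UTF-8 byte order mark. The function names are made up.

```python
import codecs


def has_utf8_bom(path: str) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark (EF BB BF)."""
    with open(path, "rb") as handle:
        return handle.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def encoding_label(charset: str, path: str) -> str:
    """Append ' bom' to a utf-8 charset when the file carries a BOM."""
    if charset.lower() == "utf-8" and has_utf8_bom(path):
        return "utf-8 bom"
    return charset
```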

I have updated the script:

  • Don't show encoding or MIME type for directories.

I have updated the script:

  • Add MIME type aliases - it is possible to specify a short name for common MIME types.
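
The idea is just a lookup table with a fallback, as in this Python sketch. The alias entries below are hypothetical examples, not the script's defaults.

```python
from typing import Dict

# Hypothetical aliases for illustration; choose whatever short names you like.
MIME_ALIASES: Dict[str, str] = {
    "application/pdf": "pdf",
    "application/zip": "zip",
    "text/plain": "text",
}


def display_name(mime: str, aliases: Dict[str, str] = MIME_ALIASES) -> str:
    """Return the short alias for a MIME type, or the type itself if none is set."""
    return aliases.get(mime.lower(), mime)


short = display_name("application/pdf")
unchanged = display_name("image/png")
```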

I have updated the script:
Improve display of MIME type for exe and dll files:

  • Add architecture
  • Check if it is a .NET binary
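
As background, both pieces of information can be read straight from the PE header. The Python sketch below shows one way to do that. It is not necessarily how the script gets them (it may simply rely on file.exe output), and `describe_pe` is a made-up name. It reads the Machine field of the COFF header and checks whether the CLR runtime header entry in the data directory is populated.

```python
import struct
from typing import Optional, Tuple

MACHINES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}
CLR_DIRECTORY_INDEX = 14  # data directory slot for the CLR runtime header


def describe_pe(data: bytes) -> Optional[Tuple[str, bool]]:
    """Return (architecture, is_dotnet) for a PE image, or None if it is not one."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    coff = pe_offset + 4
    optional = coff + 20  # the COFF file header is 20 bytes long
    if len(data) < optional + 2:
        return None
    (machine,) = struct.unpack_from("<H", data, coff)
    arch = MACHINES.get(machine, "unknown (0x%04x)" % machine)

    (magic,) = struct.unpack_from("<H", data, optional)
    if magic == 0x10B:      # PE32
        directories = optional + 96
    elif magic == 0x20B:    # PE32+
        directories = optional + 112
    else:
        return arch, False
    if len(data) < directories:
        return arch, False
    # NumberOfRvaAndSizes sits immediately before the data directories.
    (count,) = struct.unpack_from("<I", data, directories - 4)
    entry = directories + CLR_DIRECTORY_INDEX * 8
    if count <= CLR_DIRECTORY_INDEX or len(data) < entry + 8:
        return arch, False
    rva, _size = struct.unpack_from("<II", data, entry)
    return arch, rva != 0
```

Reading the first few kilobytes of an exe or dll is normally enough input for this, since the headers sit at the start of the file.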