Plugins (Python): Alternative Language Bindings for Plugin API

So for a while now, I've been thinking about working on a plugin to allow writing plugins in a language other than C/C++. (Most likely Python, though writing a .NET binding might be interesting, since I haven't had much of an excuse to use CLI) At one point, I even started working on a Python binding, but I ended up getting sidetracked with work and accidentally deleting what I had written thus far a few months later. Anyways, it's looking as though my schedule may clear up a bit in the next few weeks, so I've begun considering the idea again, and I wanted to run my idea past someone knowledgeable on the framework to make sure that my current understanding is correct. For now, let's just assume I'm attempting to write a Python binding to the VFS plugin interface.

Ideally, I'd prefer to avoid forcing any would-be developer into having to recompile the base plugin with a different identifier just to use it for their code. Instead, my idea was to have the plugin search a sub-folder for Python eggs, then attempt to call a pre-arranged registration function on the main module of any egg it finds. That registration function, in turn, would define the extensions handled by that module, in addition to the capabilities.
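To make the discovery idea concrete, here is a minimal sketch in Python. It assumes plain `.py` files stand in for eggs and that the pre-arranged function is called `register`; both choices, along with the shape of the returned dictionary, are hypothetical and not part of any Opus API.

```python
import importlib.util
import os
import tempfile

def load_plugins(folder):
    """Import every .py file in folder and call its register() if it has one."""
    registered = {}
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".py"):
            continue
        module_name = name[:-3]
        spec = importlib.util.spec_from_file_location(
            module_name, os.path.join(folder, name))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Hypothetical convention: a module opts in by defining register().
        register = getattr(module, "register", None)
        if callable(register):
            registered[module_name] = register()
    return registered

# Demo with two throwaway modules; only one of them defines register().
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "gist_vfs.py"), "w", encoding="utf-8") as fh:
        fh.write("def register():\n"
                 "    return {'extensions': ['.gist-account']}\n")
    with open(os.path.join(folder, "helper.py"), "w", encoding="utf-8") as fh:
        fh.write("VALUE = 1\n")
    plugins = load_plugins(folder)
```

Modules without a `register` function are imported but otherwise ignored, so helper files can sit next to the real modules.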

Now building a dynamic list of file extensions should be easy enough, if I'm reading the current documentation correctly (And please let me know if I'm misunderstanding here): I'd simply add VFSF_MULTIPLEFORMATS to the dwFlags field on my VFSPLUGININFO struct and when Dopus calls my plugin's VFS_QueryPath function, check the path against the currently registered file extensions of my Python modules. The dwCapabilities also shouldn't be an issue, since I specified the VFSF_MULTIPLEFORMATS.
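The lookup that would sit behind VFS_QueryPath can be sketched roughly like this. It is written in Python for brevity and only shows the extension matching; the `ModuleRegistry` class and the module names are hypothetical, and the real check would happen in the C/C++ plugin.

```python
import ntpath  # Windows path rules, regardless of the platform running this

class ModuleRegistry:
    """Maps file extensions to the (hypothetical) Python VFS module that claimed them."""

    def __init__(self):
        self._by_ext = {}

    def register(self, module_name, extensions):
        for ext in extensions:
            # Normalise so "tar", ".tar" and ".TAR" all end up as ".tar".
            key = "." + ext.lower().lstrip(".")
            self._by_ext[key] = module_name

    def query_path(self, path):
        """Return the name of the module handling this path, or None."""
        _, ext = ntpath.splitext(path)
        return self._by_ext.get(ext.lower())

registry = ModuleRegistry()
registry.register("gist_vfs", [".gist-account"])
registry.register("tar_vfs", ["tar", ".TGZ"])

handler = registry.query_path(r"C:\Users\me\snippets.gist-account")
```

A path whose extension nobody registered simply returns `None`, which would correspond to the plugin declining the path.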

As long as I'm correct in my current thought process, it all seems pretty doable. My only real concern is the possibility of performance issues associated with the dynamic nature by which the plugin would generate its capabilities, file extensions, namespaces, etc.

An approach like that should work, yes.

If there are any issues then we may be able to extend the VFS API to resolve them, so if you get stuck on something tell us and we may be able to help either with API changes or with suggestions on other ways to do the same thing.

The VFS API has been extended significantly in Opus 10 but we haven't updated the API docs yet. (Nobody seemed to be using them so it's been left as a low priority, but I've compiled a list of things to add/update to the docs.)

We've extended the "multi-format plugin" concept in Opus 10, which will probably be useful here. For example, if you look at the list of VFS plugins in Preferences you'll see plugins for lots of different archive formats which can be turned on and off, and configured, individually but are really all provided by a single DLL.

Capabilities shouldn't be too much of a problem as they should come down to each namespace, rather than the plugin itself. For example, the archives plugin has some namespaces which are case-sensitive (e.g. TAR archives) and others which aren't (e.g. 7z archives), and that's not a problem.

If you're dynamically searching for and querying python code at startup then that might be a performance issue (especially as Opus can initialise, release and later re-initialise plugins as they are needed). If that turns out to be a problem then it should be possible to avoid it by caching the information. e.g. Keep a config file which records the result of querying the scripts plus a list of script files/sizes/dates that were queried, and only spend time re-building those results if you notice some scripts have been added or removed. That way you only get the performance hit when it's needed; most of the time the scripts won't have changed.
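A rough sketch of that caching idea, in Python. The cache file layout, the function names and the `fake_query` stand-in for the expensive step are all assumptions made for illustration.

```python
import json
import os
import tempfile

def fingerprint(folder):
    """Record name, size and modification time for each script in folder."""
    entries = {}
    for name in sorted(os.listdir(folder)):
        if name.endswith(".py"):
            st = os.stat(os.path.join(folder, name))
            entries[name] = [st.st_size, st.st_mtime_ns]
    return entries

def load_or_rebuild(folder, cache_path, query_scripts):
    """Return (results, rebuilt); only call query_scripts when the scripts changed."""
    current = fingerprint(folder)
    try:
        with open(cache_path, "r", encoding="utf-8") as fh:
            cached = json.load(fh)
        if cached["fingerprint"] == current:
            return cached["results"], False
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing or unreadable cache: fall through and rebuild
    results = query_scripts(folder)
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump({"fingerprint": current, "results": results}, fh)
    return results, True

def fake_query(folder):
    # Stand-in for the slow part (importing and querying each script).
    return {name: {"extensions": ["." + name[:-3]]} for name in fingerprint(folder)}

with tempfile.TemporaryDirectory() as root:
    scripts = os.path.join(root, "scripts")
    os.mkdir(scripts)
    cache_path = os.path.join(root, "cache.json")
    with open(os.path.join(scripts, "gist.py"), "w", encoding="utf-8") as fh:
        fh.write("# v1\n")
    _, first_rebuilt = load_or_rebuild(scripts, cache_path, fake_query)
    _, second_rebuilt = load_or_rebuild(scripts, cache_path, fake_query)
    # Adding a script changes the fingerprint, which forces a rebuild.
    with open(os.path.join(scripts, "tar.py"), "w", encoding="utf-8") as fh:
        fh.write("# new script\n")
    results, third_rebuilt = load_or_rebuild(scripts, cache_path, fake_query)
```

The second call is the common case: nothing changed, so the stored results are returned without querying any script.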

What sort of things do you have in mind for the scripts to provide?

We've been thinking of building a "high level VFS API" based on the Opus 10 archives plugin, which would make building certain types of VFS plugins much easier. (Basically, anything that behaves like the RAR or 7-Zip libraries could be plugged into a much simpler API, where all it had to worry about was listing directories and providing data streams. The other complexities would be taken care of, e.g. ensuring single-threaded access to non-threaded code, caching directory listings, and so on.) It depends what you see the python (etc.) scripts doing, though; in some cases it may be easier to build directly to the VFS API instead of a higher-level one.
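Purely as an illustration of the "list directories and provide data streams" shape, here is a sketch of what such an interface might look like from the Python side. Nothing here is an existing Opus API; `SimpleVFS`, `list_dir` and `open_stream` are hypothetical names.

```python
import io
from abc import ABC, abstractmethod

class SimpleVFS(ABC):
    """Hypothetical high-level interface: listings and data streams only."""

    @abstractmethod
    def list_dir(self, path):
        """Return the names directly inside path."""

    @abstractmethod
    def open_stream(self, path):
        """Return a readable binary stream for the file at path."""

class InMemoryVFS(SimpleVFS):
    """Toy implementation backed by a dict of 'dir/name' -> bytes."""

    def __init__(self, files):
        self._files = dict(files)

    def list_dir(self, path):
        prefix = path.strip("/")
        prefix = prefix + "/" if prefix else ""
        names = set()
        for key in self._files:
            if key.startswith(prefix):
                # Keep only the first component below the requested folder.
                names.add(key[len(prefix):].split("/", 1)[0])
        return sorted(names)

    def open_stream(self, path):
        return io.BytesIO(self._files[path.strip("/")])

vfs = InMemoryVFS({
    "snippets/hello.py": b"print('hi')\n",
    "snippets/notes.txt": b"todo",
    "readme.md": b"# gists",
})
root_listing = vfs.list_dir("")
snippet_listing = vfs.list_dir("snippets")
readme = vfs.open_stream("readme.md").read()
```

Threading, caching and the rest would live in the layer that calls these two methods, which is the point of the higher-level design described above.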

If it'd be easier to discuss questions/ideas on IRC or similar, we can do that, too.

We'd definitely like to see more plugins written, and opening things up to Python and .Net could be interesting and make this side of things more accessible to more people.

.Net may bring some extra complexity, since Opus also hosts shell extensions which may (although they are not supposed to) depend on conflicting versions of the .Net framework. There are solutions to that, though. (e.g. Hosting .Net plugins in a separate process which proxies the API calls back and forth. It'd be some boring plumbing work but probably not technically very difficult to do, if needed.)

Edit: Wow, this got really long really quick. I'll go ahead and highlight the actual questions/confirmations to save you some time.

Yeah, I can understand the lack of priority. I've been a little disappointed with the lack of community additions on this front, so I hoped that opening up the interface to some more accessible languages might lead to more activity.

As for the configuration, I'd have to agree that making the process too dynamic could be a problem. In addition to performance issues, I'm sure threading would be an issue I'd have to work around. A couple of solutions occurred to me, but the best way to approach this seems to be the following:

In the plugin's DllMain function, when fdwReason is DLL_PROCESS_ATTACH, the plugin locates and reads a configuration file. (Or creates the default configuration, in the case of a first run.) This assumes, of course, that Dopus isn't constantly loading and unloading the plugin, which I should probably check into.

The plugin configuration dialog would then consist of the following:

[ul]
[li]Any general settings that I see the need to add during the development process. (Possibly options to add folders to the Python Path of the executing interpreter, etc)[/li]
[li]A listbox containing all the Python VFS modules currently detected in the designated folders. From there, a user would be able to enable or disable a specific module, or see information pertaining to that module. Glancing at the opus7zip plugin and the VFS plugins configuration page, I notice that it is indeed reading the single module as multiple plugins, so I may have more questions to ask once the time comes to actually get started on this.[/li]
[li]An eventual goal is to allow each python VFS module to provide its own configuration page, but that's something I'll try to address a lot later in the development. I think the win32ui module of the pywin32 package might help me with this.[/li][/ul]

This approach relies on my assumption that while the preferences dialog is processing changes, it will not make any calls to plugins that may or may not be in the process of being disabled. (Which I understand might not be the case)

I should be able to handle the caching of the existing modules with a memory mapped file somewhere in the configuration folder. I'm still debating whether or not I want to allow a developer to make changes to a module during runtime, and have them automatically detected, or if I should put a button into the configuration page to manually reload a module.

Having the developer put some sort of XML manifest file in the root folder of a VFS package, or specifying a standard filename for the module responsible for registration should also improve the performance by alleviating the need to check every python file it finds for the predefined functions/classes.
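For illustration, a manifest along those lines could be read with a few lines of Python. The element and attribute names below (`vfs-package`, `entry`, `extension`) are made up for this sketch; no schema has actually been decided.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest a package author would drop in the package root.
MANIFEST = """<vfs-package name="gist_vfs" entry="gist_vfs.plugin">
  <extension>.gist-account</extension>
  <extension>.GIST</extension>
</vfs-package>"""

def read_manifest(xml_text):
    """Return the package name, entry module and extensions without importing anything."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "entry": root.get("entry"),
        "extensions": [e.text.strip().lower() for e in root.findall("extension")],
    }

info = read_manifest(MANIFEST)
```

The plugin would then only need to import the one module named in `entry`, rather than probing every Python file in the package.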

Functionality-wise, my end goal is to allow for full implementation of Dopus VFSs in Python. Eventually, I'd like to see VFSs whose contents may not actually reflect real physical files. As an example, consider the following:

At the moment, I have a Gist account I post snippets of code to every now and then. Theoretically, I could create a file format with the extension of .gist-account containing my account details, and then write a VFS plugin that, when given a .gist-account file, will read in various configuration details, and then show the user's current gist snippets as a collection of different files. Each file would contain the contents of that particular gist, allowing for easy management in the context of the user's file browser.

Since there are a lot more existing Python bindings to different web service APIs (than there are C/C++ bindings), VFS plugins for things like Amazon Web Services should be a lot easier to approach, as well.

That being said, there are a couple of non-Dopus related issues I'll need to take into account. For starters, I'd like to avoid crowding Dopus's root program folder with a bunch of new subdirectories, and so I'll need to figure out the best way to handle the PythonPath, in addition to how it resolves dependent DLLs. To avoid some dependency conflicts, I'm considering using this modified Python distribution for the purpose of this project, though for performance's sake, I really do prefer to compile as /MD whenever possible.

An additional precaution I'm considering is making further modifications to whatever Python distribution I use to rename the output binary. (Something like dopy27.dll) This might cause some complications and require changes for some of the more complex Python C-extensions out there, but it would rule out a DLL-hell type of situation occurring if the user already has a Python distribution installed. (I know ActivePython installs the python27.dll binary to Windows' system32 folder, so I've run into the problem before when attempting to embed Python) I might also be able to increase the performance a bit by going through the Python standard modules and removing conditional statements where they check the current operating system of the user. (Since obviously, this distribution will not need to be callable from Mac OS X/Linux)

Is there any standard for where the plugins currently place any dependent libraries? If not, I can always use the AddDllDirectory function and delay-load the DLLs for Python and its dependencies.

I've done a good amount of C/C++ based Python extensions in the past, but other than some small utilities, etc, I haven't really played around with embedding Python into an application all that much. I'll probably go ahead and take a look at the source code of some of the existing applications with Python extensibility out there, and see if I can find any better solution or project structure for what I'm trying to achieve. I'm also going to look into the possibility of using a Python<->C/C++ automation framework like boost_python, etc to save myself some time with some of the simpler wrappers.

I'm going to be busy this weekend finishing off a project for a client, but unless something comes up, I should be able to start mapping out the project and writing some of the base structure of the plugin next week. I'll probably be hosting the source code for it on Github once I begin, so I'll link that once some progress is made. I know there isn't an official Dopus IRC channel, but is there any server/channels that you frequent, should I have smaller questions? I'm also pretty much always on Skype, if that works better for you.

Regarding the .NET bindings, I can't really say all that much. My experience with binding native and managed code is pretty much limited to some slightly advanced PInvokes in C#. I do tend to use projects like this as a learning exercise when I don't have a full understanding of a subject, but I'm a bit reluctant in this situation. While it'd be nice to write VFS plugins in .NET, I kind of feel like the overhead of the .NET platform outweighs the convenience. (Though to be fair, binding it to an interpreted language like Python isn't much better) The viewer plugin architecture does seem like a good place for .NET, though, so I'll have to look more into that if I can get the Python project to where I want it.

Instead of using DllMain, use the VFS_Init and VFS_Uninit functions which Opus will call. Note that they are ref-counted. See this post for more details.
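As a sketch of what ref-counting implies for a plugin, assuming each init call is eventually matched by one uninit call (the linked post has the authoritative details): do the real setup only on the first init and the real teardown only on the last uninit. This is written in Python for brevity; the class name and callbacks are hypothetical, and a real plugin would do the equivalent in C/C++ inside VFS_Init and VFS_Uninit.

```python
import threading

class RefCountedResource:
    """Runs load() on the first init() and unload() on the matching last uninit()."""

    def __init__(self, load, unload):
        self._load = load
        self._unload = unload
        self._count = 0
        self._lock = threading.Lock()

    def init(self):
        with self._lock:
            if self._count == 0:
                self._load()
            self._count += 1

    def uninit(self):
        with self._lock:
            if self._count == 0:
                return  # unbalanced call: nothing to release
            self._count -= 1
            if self._count == 0:
                self._unload()

events = []
resource = RefCountedResource(lambda: events.append("load"),
                              lambda: events.append("unload"))
resource.init()
resource.init()    # second user: no extra load
resource.uninit()  # one user still active: no unload yet
resource.uninit()  # last user gone: unload runs
```

The lock is there because the init and uninit calls may arrive on different threads; whether that is actually the case in Opus is an assumption here.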

DllMain is a bad place to do things because you're not allowed to call most APIs from within DllMain. You can create and destroy critical sections there, but not much else. (Unfortunately, Microsoft don't provide a proper list of what you can and cannot do, but if it involves a Windows API call, or calling anything else which might call an API, then it's probably not allowed, with a handful of exceptions. The MSDN docs on DllMain have a bit more info but the best guidance I know of is in a series of blog posts by Raymond Chen and Larry Osterman.)

Frequent loading and unloading can happen. If a plugin is not used for about 10 minutes then Opus may unload it. I think during startup Opus will also always load, unload and then reload plugins, due to a quirk in how we do things. So it's best to design the plugins so they can be loaded and unloaded quickly.

If it is a problem then we may be able to provide a way for a plugin to say "never unload me," but if it's doing something time consuming when it is loaded then that will still slow-down launching Opus at least, so it would always be better if loading can be made as fast as possible (e.g. by caching data).

As an example unrelated to plugins, when Opus loads an IconSet for the first time, it converts it into a format which can be loaded much quicker and saves the converted copy in a cache folder. Creating the cached version actually increases the amount of time it takes to load the IconSet for the first time, but after that one-off hit the IconSet can be loaded very quickly directly from the cache. (Of course, it also checks the date on the real icon set so that if it changes, the cached version is thrown away and re-created.)

At the same time, don't "prematurely optimise." For example, most of my plugins load and parse an XML config file whenever they are loaded and I worried that would slow down startup, when sometimes the config is never even used. I thought about making the config load the first time it was actually needed, instead, but when I timed the startup, all the plugins put together spent an insignificant amount of time loading their configs. So I left things as they were, because the extra complexity (and potential bugs) would have been for nothing.

All of that sounds fine and shouldn't be a problem with the current API.

It may require a restart of Opus for new Python VFS modules to be seen by Opus, but I'm guessing these won't be things that people add & remove throughout the day so that seems okay.

Opus can/will call your plugin even while the Preferences dialog is open. e.g. You can open Prefs and configure the 7z plugin, while still using a 7z archive in another window.

It's also possible for more than one plugin configuration window to be open at a time, FWIW. Or at least for the plugin to be asked to do that (it doesn't have to comply). (It's not easy to do via Preferences, but people can create buttons which open a particular plugin's config dialog, and then click the buttons from different windows.)

If it needs to, your plugin can choose to block those things from happening by detecting the situation and failing the API calls, but it's something you'd have to do in the plugin and Opus (by design) won't prevent it from happening.

Of course, this can lead to weird situations. If someone is in the middle of copying data out of a VFS plugin which they then disable, what should happen? That's up to you, really. :slight_smile: With the archives plugin, I generally take the approach that if an operation has started then it should be able to finish, but there are some cases where you'll get an error message instead, which I think is fine. (People turning things on and off while those things are being used shouldn't be too surprised.)

Definitely, that should speed things up quite a bit.

I don't know how Python works internally, but even if the DLL has a distinct name, it could still conflict with another Python DLL loaded into the same process if they use named objects or similar.

If there's a worry about conflicts with other shell extensions etc. using different versions of the same DLLs/frameworks, I think it would be worth thinking about hosting Python and the scripts in a separate process, and writing an Opus VFS plugin which effectively proxies calls between Opus and that process. Doing that would also mean you can control when you are loaded and unloaded, since the process could choose to stay running when Opus unloads the plugin, and may give you more control over threading issues (if any).
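A very rough sketch of the proxy idea, in Python on both sides just to keep it short. In practice the caller would be the C/C++ plugin inside Opus and the transport might be a named pipe or similar; the JSON-lines protocol, the `list` and `read` operations and the class name are all assumptions made for illustration.

```python
import json
import subprocess
import sys

# Source of the worker process that would host the Python VFS code.
WORKER_SOURCE = r'''
import json, sys
FILES = {"hello.txt": "hi there"}
for line in sys.stdin:
    req = json.loads(line)
    if req["op"] == "list":
        reply = {"ok": True, "result": sorted(FILES)}
    elif req["op"] == "read":
        reply = {"ok": True, "result": FILES[req["name"]]}
    else:
        reply = {"ok": False, "error": "unknown op"}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''

class WorkerProxy:
    """Forwards each call to the worker as one JSON line and reads one JSON line back."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def call(self, op, **params):
        self._proc.stdin.write(json.dumps(dict(params, op=op)) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        # Closing stdin ends the worker's read loop, so it exits on its own.
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()

proxy = WorkerProxy()
listing = proxy.call("list")
content = proxy.call("read", name="hello.txt")
unknown = proxy.call("delete")
proxy.close()
```

Because the worker owns its own interpreter and DLLs, it can also stay alive across plugin unloads if the proxy simply reconnects instead of spawning a new process each time.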

Making things run out-of-process is extra work, but it's a lot easier to do if it's done from the start, rather than finding out later that it's a good idea and trying to retrofit it.

We usually put them in the same folder as the plugin itself.

If there are a lot of DLLs, it'd be better to put them in a subfolder to keep things tidy.

(Also, any .DLL file will be loaded by Opus to see if it is a plugin, so it's not great to put lots of extra files in the plugin dirs. Opus has a hardcoded "blacklist" of DLL names it knows are not worth loading, so we can add the Python DLLs to that if needed, but if there are a lot of them then putting them in a subdirectory would be better all round.)

Is changing the DLL path necessary? If it's done incorrectly it can cause security/compatibility issues and is also difficult to do properly in multi-threaded apps like Opus (since the DLL path is shared by all threads), so it's best avoided (but sometimes can't be avoided, of course).

Normally, it's best to call LoadLibrary with the full path to a library, and well-written DLLs should be fine with that regardless of where the DLL path or current directory is at the time. (Not all DLLs are well-written, of course!)

Of course, if you host Python out-of-process (that is, in your own exe which the plugin talks to, rather than everything in dopus.exe itself) then many of these problems go away.

I'll be idling in #dopus on QuakeNet. (I'll have the channel minimized but hopefully the IRC client will properly highlight it if there's any activity. If not, nudge me with a private message. Or I may just be asleep or AFK. :slight_smile:)

Using C++/CLI may make sense, if .Net stuff is looked at eventually. I used it briefly in the past for a similar task and it seemed to make it quite easy to bridge C++ into .Net (not surprising, really, hehe). It can also act as a thin layer which calls into the main C# code.

C++/CLI is much better than the older Managed C++, but of course still suffers from the fact that so few people use it and thus there aren't always good places to look for advice. This sort of thing is probably where it shines, though.

Been meaning to respond to this thread for a couple of weeks now.

I appreciate you taking the time to give me all your feedback and advice. It really helps to bounce ideas off someone knowledgeable on the subject to figure out the best way to approach this project. Almost all of my projects for work over the past few years have been targeting web development, rather than the desktop, so I haven't had many excuses to use C/C++ for anything. As a result, I'm pretty sure my understanding of multi-threaded applications, Win32 API, and general C/C++ knowledge have degraded pretty heavily, so I think I'm going to need to do some reading to refresh myself. Additionally, I definitely need to do some reading on embedding Python in multi-threaded applications, so that I can make sure I don't make a huge mistake from the start.

Anyways, just wanted to say thanks again. I haven't abandoned this project or anything, though my recent migration to Windows 7 x64, and new projects at work have kind of put it on hold for a bit. As before, I'll definitely let you know if I have any other questions.

Hey Juntalis, did you ever make any progress on this project? Do you have any code for it on Github?

To be honest, I've been holding back on updating this post until I could post something substantial. I'd rather not bore you with the details of how this ended up being the case, but I've currently got the following laying around in my "Projects" folder:

[ul]
[li] An incomplete set of C-bindings for a Python API to the support API. (More or less abandoned when I couldn't come up with a clean way to package the whole thing together. I disliked the idea of releasing two complete Python distributions with it)[/li]
[li] A complete set of C-bindings providing a Lua API to the "support" API, and an (ugly) VFS implementation. (This was actually put together as an excuse to familiarize myself with the Lua extension API. Upon gaining a certain level of comfort, I sort of lost motivation, given Lua's lack of built-in functionality. Nonetheless, I can probably clean this up and release it if someone actually wants it)[/li]
[li] About 75% of the support API wrapped with managed classes. (.NET bindings) Additionally, a MSIL stub implementing the viewer plugin api. (Needed to write it in MSIL to allow for native exports in the resulting dll)[/li][/ul]

I'll probably continue to work on the two remaining active projects in my spare time, but I don't mind dumping what I've currently got on github. (Give me a bit to go through and cleanup the current folders. I'll try to get the source up sometime this week/weekend)

Ping! Juntalis, have you made any progress on this? I'm not a C++ or Win32 person so Python bindings would be great!

'Fraid this didn't really go anywhere, and given that I haven't found the motivation or time to mess with the scripting packages functionality released in Dopus 12, I doubt I'll return to it anytime soon. (Sorry - coding at work has left me with little motivation to pursue personal stuff)

Most of what I had on this was unusable half-finished sandbox code. I did throw some .NET interop bindings for the support stuff on github, but without having a base plugin for initializing and managing the required AppDomain, that doesn't really help anyone. That being said, I'll leave some of the information I can remember just in case someone else decides to make a similar attempt:

  • Given the described design of the plugin system, the best approach I could find to do this was the one Leo suggested: hosting python in a separate process. This solution sidesteps a lot of potential issues that are likely to arise hosting Python in the Directory Opus process. (mostly dealing with Python's threading support and initialization requirements) It also allows you to package the plugin with just one distribution of Python, as opposed to both 32-bit and 64-bit distributions.
  • If that doesn't dissuade you from hosting Python in the Directory Opus process, at least make sure you use a more recent version of Python 2 or 3. Older versions don't seem to properly handle the whole side by side assemblies thing.

Note: Learned a lot about Python's internals attempting this project - good project for anyone that's looking for an excuse to learn about Python embedding.

What kind of viewer or VFS plugin do you want to write in Python?

Ah. I'm still on Dopus 10, so if it needs v12 it'll have to wait.
I wanted to create some kind of panel like the metadata panel, but for ebooks: show a cover, some blurb, ISBN and anything else. Dopus is already a great file browser, but the viewer pane isn't quite what I'm after.
Calibre - the ebook catalog software - can do this but it moves everything into its own folder structure, and I want to keep mine.
Anyway, thank you both for replying, I'll have to re-think how I approach this....
