Can't get my plugin to be recognized by DOpus

Hi. I am trying to write a JPEG XL viewer plugin. I can't get it to be recognized by DOpus. I copy the DLL into the correct directory but it doesn't appear in the plugin list.

I grabbed TheZoc's plugin (GitHub - TheZoc/ZPatchViewer: A Directory Opus plugin to view the contents of ZPatch data files) and built it, and it works fine. I can't see what difference makes theirs work; my configuration seems to be the same. I tried attaching the debugger to DOpus but couldn't break anywhere in the DLL.

My repo is here: GitHub - kuro68k/JPEG_XL_dopus

Obviously it's unfinished.

I could just take TheZoc's code and modify it I guess, but it would be good to know what the issue is if it's not too much trouble to find. Looking at both of them, and various other viewer plugins, with DependencyWalker, they seem to be the same.

I am building the x64 Release and x64 Debug versions. I should delete the 32 bit builds from the project...

Thanks.

It appears in the list for me. I compiled without the actual JPEG-XL code to simplify things but nothing that should affect whether it appears in the list or not.

Some things to check:

  • After placing the DLL in the Viewers dir, it won't be loaded until you either click the Refresh icon above the plugins list, or restart Opus.

  • I recommend changing C/C++ > Code Generation > Runtime Library to the non-DLL versions ("Multi-threaded Debug" for debug and "Multi-threaded" for release), otherwise you will have to package the appropriate Microsoft C runtime installer/version with your plugin. If you were testing on a different machine to the one with the compiler, that could also explain why it didn't work.

  • Your debugging code at the top of DVP_IdentifyFileW and DVP_LoadBitmapW is passing wide strings to an ANSI function, which could make it crash:

BOOL DVP_IdentifyFileW(HWND hWnd, LPTSTR lpszName, LPVIEWERPLUGINFILEINFO lpVPFileInfo, HANDLE hAbortEvent)
{
	std::vector<uint8_t> jxl;
	if (!LoadFile(lpszName, &jxl)) {
		fprintf(stderr, "couldn't load %s\n", lpszName);
		return FALSE;
	}

The fprintf there should be fwprintf, with a wide (L"...") format string. (Or you can keep fprintf and change the %s to a %S to tell it to use the opposite string type.)

Visual Studio 2022 found that bug and warned me about it. Might be worth updating your version of VS if you're on an older one, as the compiler is more helpful with things like that (and it's free these days).

You're also mixing up string types in the function definition. DVP_IdentifyFileW should take an LPWSTR, not an LPTSTR. It's OK as long as the project is compiled for Unicode (where LPTSTR turns into LPWSTR), but it doesn't make sense to use a T type for a function that always has to take a W string; it will go wrong if it's compiled in another mode.
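For reference, here is a minimal sketch of the wide-string version of that debug message. It is written as portable C++ rather than against the plugin SDK: LoadErrorMessage and ReportLoadError are hypothetical helper names, and %ls is used because it means "wide string" in both MSVC and standard C, so the snippet behaves the same outside Visual Studio.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical helper: builds the error text for a wide (UTF-16 on Windows) file name.
std::wstring LoadErrorMessage(const wchar_t* name)
{
    assert(name != nullptr);
    wchar_t buf[512];
    // Wide format string + %ls: both sides of the call agree on the string type.
    const int n = std::swprintf(buf, 512, L"couldn't load %ls\n", name);
    if (n < 0)
        return L"couldn't load file (name too long to print)\n";
    return std::wstring(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: writes the message to stderr using a wide output function.
void ReportLoadError(const wchar_t* name)
{
    std::fputws(LoadErrorMessage(name).c_str(), stderr);
}

// In the plugin itself, the fix described above would read roughly:
//   BOOL DVP_IdentifyFileW(HWND hWnd, LPWSTR lpszName, ...)
//   ...
//   fwprintf(stderr, L"couldn't load %s\n", lpszName);
```

Keeping the formatting separate from the output call is only a convenience here, so the message can be checked without capturing stderr.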

Thank you Leo. I fixed the string issues and reproduced your build without the JPEG XL code. I think I ended up with code copied from the SDK and some other plugins so the strings got a bit mixed up. I don't normally do much C on Windows so thanks for being patient.

The plugin is then seen, although I've noticed that updating the DLL doesn't always make it refresh properly after that. You have to rename it (can't delete it because DOpus has it open), refresh in DOpus so it is removed from the list, and then copy a new version of the DLL in and refresh again. Perhaps before I wasn't doing that so I had some working versions that just didn't get loaded.

I also made your recommended changes re debugging settings. I copied all the JPEG XL DLLs into the plugin directory, so I don't think it's missing those, although I guess it could be a path issue.

So it looks like the JPEG XL code is the issue. Any calls into the JPEG XL library cause the plugin not to load. Does DOpus do test calls to DVP_LoadBitmapW() or DVP_IdentifyFileW()? If not then it's something about the compiler including that code in the DLL, even if it's not called.

I am having trouble debugging it. I can attach the debugger (VS2022) to the dopus.exe process, but I don't get any debug output and can't set any breakpoints in my code. Is there a way to get debug output?

Only other things I can think of are rebuilding the JPEG XL library from source in case there is some issue there, or writing a little test app that opens my plugin DLL just so I can debug it.

Okay, it appears to be a path issue. If I put libjxl.dll in the DOpus program directory it loads. If I put it in the Viewers directory it doesn't. There's probably some obscure option somewhere to fix it...

I'm actually quite impressed that it worked first time after that, and performance seems quite decent even with very large image files. I was expecting to have to do some heavy optimization.

You should use LoadLibrary with the full path to the secondary DLL, if possible. Then there’s no chance of the wrong copy of the DLL being loaded.

That or static-link the library into the main DLL, so there is only one DLL. (Often much easier for both you and users.)

If you do have a separate DLL, renaming it to another extension (e.g. LLD like the J2K DLLs in the viewer directory) can be good to avoid Opus trying to load the other DLL as a viewer plugin. (Although this isn’t a huge deal, and we can also add the other DLL to a list of names we skip as we know they aren’t plugins.)
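The LoadLibrary suggestion above could look something like the sketch below. SiblingPath and LoadSiblingDll are hypothetical names, and hPlugin is assumed to be the module handle the plugin saved from DllMain. Note the assumption that the library is loaded explicitly (or delay-loaded): if the plugin links against libjxl's import library in the normal way, Windows resolves the dependency before any plugin code runs, so this would be too late.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Given the full path of the plugin DLL, build the full path of a file
// that sits next to it in the same directory.
std::wstring SiblingPath(const std::wstring& modulePath, const std::wstring& fileName)
{
    assert(!fileName.empty());
    const std::wstring::size_type slash = modulePath.find_last_of(L"\\/");
    if (slash == std::wstring::npos)
        return fileName;                        // no directory part to reuse
    return modulePath.substr(0, slash + 1) + fileName;
}

#ifdef _WIN32
// Loads a DLL from the plugin's own directory instead of relying on the
// default search order. Returns nullptr on failure.
HMODULE LoadSiblingDll(HMODULE hPlugin, const wchar_t* fileName)
{
    wchar_t path[MAX_PATH];
    const DWORD len = GetModuleFileNameW(hPlugin, path, MAX_PATH);
    if (len == 0 || len >= MAX_PATH)
        return nullptr;                         // call failed or path was truncated
    return LoadLibraryW(SiblingPath(path, fileName).c_str());
}
#endif
```

With this, LoadSiblingDll(hPlugin, L"libjxl.dll") would look only in the Viewers directory, regardless of where dopus.exe lives.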

For debugging, breakpoints will work if you attach to dopus.exe. The DLL may not be loaded when you first attach, but the breakpoints will start working as soon as Opus loads your new DLL, and you can set them before attaching.

You can use dopusrt.exe /flushplugins to ask Opus to unload any viewer plugins that are not in use so they can be deleted or replaced. Note: dopusRT.exe not dopus.exe; also, you probably want to replace the plugin with something other than Opus as many actions in Opus will cause it to reload the plugins. I tend to set up a batch file that flushes and replaces the DLL.

Thanks Leo. It is working now, but is slow because Windows bitmaps need data in BGRA format and libjxl produces RGBA format. This seems to be an annoying limitation of GDI. I'm sure it can be massively optimised.

I think CreateDIBitmap copies all the data too, so perhaps there is a potential optimisation there if I can use GDI to allocate the pixel buffer and decode directly into it.

For now though at least it works. Thanks again for your assistance.

Copying the bitmap in memory and converting from RGBA to BGRA should be so fast you'd never even notice it. If something is slow, I would not assume that was the cause without measuring where the time is going (unless the conversion code is doing the work in a really strange way, at least).

It should be fast but I couldn't immediately see a way of doing it with GDI to produce the necessary HBITMAP for Opus. I don't really do much Windows stuff in C, I'm mainly embedded and Linux, so I'm just googling stuff at the moment.

I'd like to optimise thumbnails too. JPEG XL has some nice features like progressive decode that prioritises the visually interesting parts of the image, e.g. people and text, meaning you can potentially avoid having to read the entire file if you are only creating a low resolution version.

It's probably possible to decode the file as it is being read in somehow too, I haven't really looked at it. The decoder is at least multithreaded.

If you're using CreateDIBitmap then you must have the data as a byte array in memory, so it's just a matter of going through that array and switching the B and R bytes around.
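As a rough illustration of that byte swap, here is a minimal sketch. It assumes tightly packed 4-byte RGBA pixels in a std::vector; SwapRedBlue is a hypothetical name, not something from the plugin or the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swaps the R and B bytes of every 4-byte pixel in place (RGBA -> BGRA).
// G and A stay where they are.
void SwapRedBlue(std::vector<uint8_t>& pixels)
{
    assert(pixels.size() % 4 == 0);             // whole pixels only
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4)
        std::swap(pixels[i], pixels[i + 2]);
}
```

This is a single linear pass over the buffer, so it should be cheap next to the decode itself. If the extra copy made by CreateDIBitmap ever matters, CreateDIBSection is the GDI call that hands back both an HBITMAP and a pointer to its pixel memory, which would allow converting straight into the bitmap.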

That's what I've done, but it's only a simple implementation in C. When I find time I'll try to instrument the code and figure out where the bottleneck is. I'm fairly sure MSVC isn't producing optimal code because it rarely ever does. Could switch to LLVM perhaps.

In benchmarks JPEG XL is usually a bit faster than JPEG for decoding. In Opus with a test image I get 280ms for JPEG and 400ms for a losslessly transcoded JPEG XL, so there is clearly some performance to be gained there.

I did some testing and the bulk of the time is spent in the JPEG XL DLL. There isn't really much room for optimization of that code.

I have a feeling my decade old CPU might not be very efficient for JPEG XL, which claims to take advantage of vector instructions where possible.

Another issue I noticed is that generating thumbnails really bogs the machine down. I have DOpus set to use 8 threads for thumbnails (4 cores with hyperthreading), which is great for JPEG and PNG as they seem to be using single threaded decoders. For multi-threaded decoding of JPEG XL it's too much.

Is there anything I can do about that? I could potentially disable multi-threaded decoding when thumbnails are being requested, but it's a question of how to reliably detect that. Maybe just do it when the requested size is 512x512 or less.
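A sketch of that size check, with the caveat that everything here is hypothetical: the thread doesn't show how Opus passes the requested size, so reqWidth and reqHeight are placeholders, and suggestedThreads stands for whatever thread count the decoder would otherwise use.

```cpp
#include <cassert>
#include <cstdint>

// Cutoff from the post: requests this small are assumed to be thumbnails.
constexpr uint32_t kThumbnailMaxSide = 512;

// Returns 1 for thumbnail-sized requests, otherwise the suggested thread count.
uint32_t DecoderThreadsFor(uint32_t reqWidth, uint32_t reqHeight, uint32_t suggestedThreads)
{
    assert(suggestedThreads >= 1);
    const bool looksLikeThumbnail =
        reqWidth <= kThumbnailMaxSide && reqHeight <= kThumbnailMaxSide;
    return looksLikeThumbnail ? 1u : suggestedThreads;
}
```

It is only a heuristic: a small viewer window would also be treated as a thumbnail request, which is probably harmless since small decodes gain less from extra threads.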

You can set DVPFIF_NoMultithreadThumbnails on the plugin to tell Opus to only call it from one thread at a time.

(If there are other image types in the folder, they may still be read in parallel.)

If you're building the JPEG XL DLL yourself, make sure you're testing with a release build as they're much faster than debug builds for this kind of code.

I'll give that a try, thanks.

I have managed to build the DLL but not had a chance to test it out yet. Using a pre-compiled DLL for now. I don't think it's a debug build but I'll check.

Do the thumbnail threads run at normal priority?

They do, at least as far as I can remember. Although the decoder may change that if it's starting threads of its own.

Reading through the libjxl documentation it has a fairly advanced multithreading implementation. The most efficient number of threads depends primarily on the size of the image.

I did a little test with 1 thread in the plugin and 8 threads in Opus, and the performance was consistently better than having multiple threads in the decoder and 1 thread in Opus. Quite a lot better.

I guess the best option would be to set the maximum number of threads in Opus, and then have Opus pass the maximum number of threads that the plugin can use to it. Then if the number of parallel tasks is less than the maximum number of threads it can split them between calls (a simple divide and round down would do, doesn't need to be fancy).

Maybe you could consider that for a future update.
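To make the "divide and round down" idea concrete, here is a minimal sketch. It is not an existing Opus feature; ThreadsPerTask, maxThreads and parallelTasks are hypothetical names for the proposal above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Splits a global thread budget between the tasks currently running.
// Integer division rounds down; every task always gets at least one thread.
uint32_t ThreadsPerTask(uint32_t maxThreads, uint32_t parallelTasks)
{
    assert(maxThreads >= 1);
    const uint32_t tasks = std::max<uint32_t>(parallelTasks, 1u);  // treat 0 as 1
    return std::max<uint32_t>(maxThreads / tasks, 1u);
}
```

With a budget of 8, a folder full of thumbnails (8 tasks) would give each decode 1 thread, while a single image opened in the viewer would get all 8.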

Is the Opus JPEG decoder single threaded?

That's the only reason I can think of for poor performance when generating thumbnails. I put in a simple check for a requested image size below 512x512 pixels, resulting in single threaded decoding, and it's much better than having multithreaded thumbnail generation disabled entirely. It's still poor though, compared to JPEG.

JPEG XL is supposed to be competitive with JPEG for decoding performance. In my tests a sample image takes about 210ms for Opus to decode in JPEG format, and about 280ms with my JPEG XL plugin. A separate test app confirmed that JPEG XL decoding takes about 260ms, the bulk of which is libjxl doing the actual decoding (as opposed to the conversion to the right format bitmap for Opus). JPEG XL decoding was multithreaded.

If the JPEG decoder is not multithreaded then that implies that JPEG XL is taking longer to decode with 8 threads.

This test image is a losslessly converted JPEG too, so it's basically just replacing JPEG's run-length encoding scheme with JPEG XL's more advanced one. It shouldn't be this much slower.

I don't remember it being multithreaded but haven't looked in detail. You can use Task Manager while decoding a JPEG to find out for sure.

You'd normally be generating thumbnails for multiple files at once, one per thread, which makes a multi-threaded decoder for thumbnails somewhat moot (or even detrimental).

I don't think the JPEG decoder is multithreaded.

I did some tests with libjxl.

1 thread: 750ms
2 threads: 400ms
4 threads: 300ms
8 threads: 260ms
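Working those timings through as speedup and per-thread efficiency (a small sketch; Speedup and Efficiency are just illustrative helpers): relative to 1 thread, 2 threads is about 1.9x, 4 threads is 2.5x and 8 threads is about 2.9x, so at 8 threads each thread is doing roughly 36% of the useful work it does when running alone.

```cpp
#include <cassert>

// How many times faster the multi-threaded run is than the single-threaded one.
double Speedup(double singleThreadMs, double multiThreadMs)
{
    assert(singleThreadMs > 0.0 && multiThreadMs > 0.0);
    return singleThreadMs / multiThreadMs;
}

// Speedup divided by thread count: 1.0 would be perfect scaling.
double Efficiency(double singleThreadMs, double multiThreadMs, unsigned threads)
{
    assert(threads >= 1);
    return Speedup(singleThreadMs, multiThreadMs) / threads;
}

// Using the figures above:
//   Speedup(750, 400)       -> 1.875
//   Speedup(750, 300)       -> 2.5
//   Efficiency(750, 260, 8) -> about 0.36
```

That falling efficiency fits the earlier observation that one decoder thread per Opus thumbnail thread did better overall than one image at a time decoded with many threads, although how close 8 parallel single-threaded decodes get to ideal on a 4-core machine with hyperthreading wasn't measured here.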

The library has a function that suggests the ideal number of threads, and it reports 8 for my large test image.

I think this is an issue on the libjxl end, or perhaps with JPEG XL itself. Lossless transcoding of JPEG images is a special case and seems to perform worse than encoding as a lossy JPEG XL.

Thanks for your help.