Optimise Copy

Further to the following thread, requesting various optimisations that would be nice in DOpus 13, I have spun the individual optimisations and their benchmark results out into separate threads.

This thread is about Copy optimisation, where a multi-threaded, highly optimised copy engine (as shown with Bvckup2 R81.24.1) is shown to be faster, particularly for SSDs, which only show their true potential with multiple I/O requests in flight at a time, i.e. a higher queue depth. I haven't found a faster copying engine yet (including FastCopy and TeraCopy). It's also interesting to note that Bvckup achieves this speed despite also copying Alternate Data Streams and logging copious amounts of data on the copy operation.
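To make "multiple I/O requests in flight" concrete, here is a minimal Python sketch that copies several files at once from a thread pool. It is only an illustration of the general idea and not how Bvckup2, robocopy, or Opus work internally; the names `copy_tree_parallel` and `workers` are made up for this example. It has no retry or error UI, skips empty folders, and does not preserve metadata or Alternate Data Streams.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src, dst, workers=8):
    """Copy every file under src to dst using a pool of worker threads.

    Returns the number of files copied. Hypothetical helper for illustration.
    """
    src, dst = Path(src), Path(dst)
    files = [p for p in src.rglob("*") if p.is_file()]

    # Create the folder structure up front so workers never race on mkdir.
    for folder in {dst / p.relative_to(src).parent for p in files}:
        folder.mkdir(parents=True, exist_ok=True)

    def copy_one(path):
        # Copies file contents only (no timestamps, attributes, or ADS).
        shutil.copyfile(path, dst / path.relative_to(src))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so an exception in any worker surfaces here.
        list(pool.map(copy_one, files))
    return len(files)
```

Whether this is actually faster than a single-threaded copy depends heavily on the drive, the file sizes, and the filesystem cache, so it needs measuring on the hardware in question.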

DOpus had copying of attributes, metadata, timestamps etc. disabled.

All tests were performed on the relatively large folder structure of the Boost.org C++ libraries, which can be easily reproduced:
1) Download the Boost 1.80.0 libraries for Windows from boost.org (e.g. the .7z archive) and extract.
2) Build Boost 1_80_0 so that the non-header-only libraries are built too - I did this with VS2022 Community.

Microsoft Antimalware real-time service was disabled during tests, as I find it interferes with file operations a lot.

All timings are in seconds.
The number of runs varies between tests.

Other tools tested: Explorer; xcopy /E/H/Q; robocopy /e /mt /NFL /NDL /NJH /NJS /nc /ns /np > nul

| Test (copy from/to) | DOpus 12.28.2 beta, time to complete | Bvckup2 R81.24.1, time to complete | Other tools |
|---|---|---|---|
| C: to D: on same PCIe3.0 M.2 SSD (WD 256GB) | 148, 149, 150 | 37, 40, 32, 30, 31 | 47 (tool not identified) |
| C: to W: PCIe3.0 to USB3 2.5" HDD | 232, 196, 198 | 118, 115, 116 | Explorer: 321; xcopy: 183, 182; robocopy: 130 |
| W: to W: on same USB3 2.5" HDD (WD 5TB) | 180, 185 | 179, 176 | 260, 291 (tool not identified) |
| W: to D: USB3 2.5" HDD to PCIe3.0 SSD | 87, 85, 94 | 30, 31, 33 | 153 (tool not identified) |
| C: to E: PCIe3.0 to USB3 Flash Pen Drive (Samsung Bar Plus v2 64GB) | 244, 244 | 175, 174 | |
| E: to C: USB3 Flash Pen Drive to PCIe3.0 (Samsung Bar Plus v2 64GB) | 90, 97, 96 | 30, 36, 25 | |
| C: to S: PCIe3.0 to USB3 SATA3 SSD Drive (Samsung 850 Pro 1TB) | 185, 180, 182 | 96, 96, 96 | |
| S: to C: USB3 SATA3 SSD Drive to PCIe3.0 (Samsung 850 Pro 1TB) | 66, 62, 61 | 26, 27, 27 | |
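
For anyone who wants to turn the raw timings into ratios, a small Python sketch like the following averages the runs and divides one mean by the other. The numbers are copied from the first four tests above; the `speedup` helper and the short test labels are just names made up for this example.

```python
from statistics import mean

# (DOpus runs, Bvckup2 runs) in seconds, copied from the figures above.
results = {
    "C: to D: (same SSD)": ([148, 149, 150], [37, 40, 32, 30, 31]),
    "C: to W: (SSD to USB3 HDD)": ([232, 196, 198], [118, 115, 116]),
    "W: to W: (same USB3 HDD)": ([180, 185], [179, 176]),
    "W: to D: (USB3 HDD to SSD)": ([87, 85, 94], [30, 31, 33]),
}


def speedup(dopus_runs, bvckup_runs):
    """Mean DOpus time divided by mean Bvckup2 time (>1 means Bvckup2 was faster)."""
    return mean(dopus_runs) / mean(bvckup_runs)


for test, (dopus, bvckup) in results.items():
    print(f"{test}: {speedup(dopus, bvckup):.2f}x")
```

With so few runs per test, the ratios are only a rough guide rather than a statistically solid comparison.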

We're going to be moving to CopyFileEx by default in future versions, which will make Opus file copies perform at the same speed as File Explorer's (where they don't already, i.e. with devices that have some kind of issue with anything that isn't File Explorer's exact copy method, buffer sizes, etc.).

Using anything else doesn't really make sense, and has proven to have wildly different performance on different hardware/drivers/networks/etc.

CopyFileEx will do multiple I/O requests in parallel for the same file.

Oh, man. This takes me back to when I was obsessed with Windows file compression utilities.

In my tests on Windows 10 Enterprise, Explorer performed the worst of all - I guess CopyFileEx's multiple I/O requests only work within a single file, and not across multiple files.

In all cases Bvckup2 performed the fastest - often 2 or 3 times faster than Opus, and more than that compared to Explorer.

File copy has to be rock solid on everyone's systems, and for every scenario (copying 100,000 almost empty files isn't that common, for example, and isn't the thing to optimise for, especially when none of the copy methods actually take very long). It's also very hard to properly test copy speed, due to all the buffering the filesystem does and the huge impact different hardware/software/network configurations can have on what should be minor differences in method and buffer sizes, as well as the types of metadata being preserved, etc.

We also need to have a reliable and understandable interactive error/retry UI, which would be complicated by doing multiple files at once as part of the same copy job. (Although this isn't out of the question, and may eventually come for things like FTP where parallel copies are unfortunately still the only way to solve those protocols' inherent problems with network latency.)

It's possible we'll look into this in the future, but it's not going to come soon, as we're very conservative when it comes to changing and testing the file copying code. It's core functionality, mistakes with it can lead to data loss, and it has to work well on all systems. Experience has shown that a change which seems like a good idea in this area, and might speed up one scenario on one machine, can end up causing more trouble than it is worth. But sometimes it is worth it, too, once evaluated and tested for a long time.

Good reasoning, Leo, it has to work reliably for everyone!

You might want to have Robocopy support as an optional copy command for multi-threaded copying (which CopyFileEx won't do). As Robocopy ships with Win10/11, having an advanced config or commands for starting Robocopy would be nice. (Yeah, you can do that with scripts and buttons too.)

I have also read through the notes of Bvckup2, and it looks like their main speedup comes from simply doing less per file and more per job, in what they call prep/cleanup. You might want to look at their comments in "Bvckup 2 | Development notes | 19042018" to see if there is anything useful. I would guess that the simple task of updating the UI in Explorer eats a lot of cycles, so maybe a "turbo" output window that is updated less often and with less detail could help.

Those benchmarks sure have some oddities, especially that copying from one partition to another on the same drive would kill Dopus but not Xcopy. Multi-threaded Robocopy can keep up on USB writes, suggesting that waits for "done" from the USB controller are the major roadblock (with eight threads running, unless files are really small, you still have seven copies going while waiting for the eighth). The Bvckup2 notes suggest that they do not really wait for those until the very end of a copy operation, which is fine until a write really fails. I hope they tested that with faulty hard drives and buggy USB controllers, because they also shortcut delta copies with hashes, and if they built those hashes based on a write assumption rather than a full verified read-back...

You can set up Opus buttons (or drag & drop, hotkeys, etc.) to run Robocopy if you want. I think it's far too esoteric to be built-in, though, especially as the progress dialog would be text scrolling by in a Command Prompt and the error/retry/etc. UI would be almost nonexistent, as would support for archives, FTP, and all the other stuff Opus does. :)
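
For anyone scripting this outside Opus, here is a rough Python sketch of launching the same quiet, multi-threaded Robocopy run that was used in the benchmark. The switches are the ones from the first post; the function names `robocopy_args` and `run_robocopy` and the default of 8 threads are assumptions for this example, and it only works on Windows where robocopy is on the PATH.

```python
import subprocess


def robocopy_args(src, dst, threads=8):
    """Build the argument list for a quiet, multi-threaded robocopy run."""
    return [
        "robocopy", src, dst,
        "/E",                  # copy subfolders, including empty ones
        f"/MT:{threads}",      # multi-threaded copy with the given thread count
        "/NFL", "/NDL",        # no file list, no directory list in the log
        "/NJH", "/NJS",        # no job header, no job summary
        "/NC", "/NS", "/NP",   # no file class, no file size, no progress
    ]


def run_robocopy(src, dst, threads=8):
    """Run robocopy and report success (Windows only).

    Robocopy exit codes below 8 mean no files failed to copy.
    """
    result = subprocess.run(robocopy_args(src, dst, threads))
    return result.returncode < 8
```

As noted above, this gives none of the progress, error/retry, archive, or FTP handling that a built-in copy provides.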

Bvckup2 is used quite extensively in the wild, is considered one of the fastest for multiple scenarios, and is reliable. I believe it uses two different hashes. Verification is a good point, and is coming in the next version, which is in beta.

Disabling MS Antimalware was more about copying speed, but for reference, I did the same C: to D: test (i.e. duplicate on same disk) while it was enabled:

DOpus 261 secs
Bvckup2 159 secs

so it doesn't affect Opus as much, but 100 secs faster is still a lot faster.

As mentioned, Bvckup is also logging a lot of detail in the copy process as well.

Re. DOpus being slowed down by GUI Progress bar updating, I mentioned this before, where it could possibly only update every 1 second or so.
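
As a sketch of what "only update every 1 second or so" could look like, the Python class below forwards progress updates to a callback at most once per interval, while always letting the first and final updates through. `ThrottledProgress` and its parameters are hypothetical names for illustration and say nothing about how the Opus progress dialog is actually implemented.

```python
import time


class ThrottledProgress:
    """Forward progress updates to a callback at most once per `interval` seconds."""

    def __init__(self, callback, interval=1.0, clock=time.monotonic):
        self._callback = callback
        self._interval = interval
        self._clock = clock    # injectable so the timing can be tested
        self._last = None      # time of the last update actually forwarded

    def update(self, done, total):
        now = self._clock()
        finished = done >= total
        # Always forward the first and the final update; throttle the rest.
        if self._last is None or finished or now - self._last >= self._interval:
            self._last = now
            self._callback(done, total)
```

Whether throttling like this would make a measurable difference to copy speed is a separate question that would need testing.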

The progress dialog runs on a separate thread. It’s unlikely to affect anything.

I realize no one asked me, but I use what I consider a wonderful Robocopy configurator. I have built out the Robocopy options and used those to create the needed button options. I mostly use them in UNC or network copy situations. So far, I have found that Dopus is as fast as all the other current tools (TeraCopy, etc.).

Cinchoo/ChoEazyCopy: Simple and powerful RoboCopy GUI (github.com)

Anecdotally, I downloaded the trial version of Bvckup2 and not once did it beat Robocopy Mirror on either SSDs or SSD to UNC. I tend to move large files though, so that might have something to do with it. I didn't put all that much effort into making sure it was all apples-to-apples.

I forgot to mention that for Bvckup2 I configured the following options:

Detecting Changes: Re-scan destination
Copying:           Copy files in full
More options:      enabled Alternate Data Streams

The "Detecting Changes" and "Re-scan destination" options are most relevant as they result in snapshot and delta state being generated.

@ThePauler - I'd be interested in a comparison of your use-case with these options set, as they aren't the defaults (the defaults are of more use for speeding up subsequent backups). If you don't normally copy ADS with robocopy then leave that option disabled in Bvckup2 (default is already OFF).

Thanks, @Deipotent; I'll take a look because I'm always looking for the best backup or sync options. I hovered over the purchase button for Bvckup2, but I didn't pull the trigger after testing.
Also, why would I care about ADS if I'm moving things to a UNC or network share that isn't NTFS?

If you don't mind, for transparency - are you affiliated with Bvckup2? Developer or anything? Nothing wrong with that, of course....but I just like to know if there's any additional motivation for preferring a tool.

Also, if none of this is allowed on this forum, I don't want to hijack anything. Happy to take this offline or something.

@ThePauler You wouldn't be worried about ADS if dest doesn't support it - just being thorough.

Transparency-wise - I have no affiliation with Bvckup2, and receive no benefit from mentioning it - been using it since the original v1 beta release, and haven't found anything faster for syncs with a simple UI + logging. Developer is also responsive.

I just did a quick re-test with multi-threaded robocopy from C: to D: (ie. on same PCIe M.2 SSD), and you're right in that it was slightly faster. I got around 25 seconds.

So, for a copy/duplicate on same SSD, robocopy /mt is the fastest.

Given the slight extra overhead with updating GUI and logging done by Bvckup, I would probably estimate the copying engine speeds to be roughly the same on duplicate operation on same SSD.

I'll do some further testing with different workloads (e.g. larger files, a mixture, etc.) when time permits, and update the table in the first post.
