Multithreaded copying

Hi :slight_smile:

I'm just wondering whether file copying in Directory Opus is single-threaded or multithreaded. I've heard that Windows only has single-threaded file copy, while software such as Robocopy offers multithreaded copying. Multithreaded copying can drastically increase speed, since it copies multiple files at the same time using multiple threads.

* It's faster and more reliable.
* It can be restarted.
* You can run the same copy multiple times to verify everything got copied.
* It can be multithreaded for tons of small files.
* It has logging capabilities. You can set it up on a schedule to auto-copy when it sees changes. It can be used to fix permissions on the receiving side. You can use it to keep poor man's backups by not using /mir. You can copy folder and file timestamps to the original. Etc. It's extremely powerful.

(source: Reddit - Dive into anything)

I mean, if multithreaded file copying were integrated into DOpus, it would be yet another big-time reason for people to use this awesome software of yours.

So... are there any plans for multithreaded file copying in DOpus?

We experimented with that a long time ago and it made no difference at all. File copying is almost always bottlenecked by the hardware controller, storage and network speeds (unless antivirus is misbehaving). Doing two copies in parallel doesn't speed things up; in fact, it'll slow things down in a lot of cases. This is true with both large and small files, at least with NTFS. (NTFS won't create two files in parallel; it'll serialise the two requests, at least from my own testing some years ago.)

Why Robocopy has that option, I do not know, but it's off by default as well. I suspect it's there for a very specific situation.

Most of the things you quote about Robocopy are unrelated to the multithreaded option it has, too.

I became aware of Robocopy because Linus (Linus Media Group) proved that multi-threaded file copy was faster than normal file copying (which is single threaded in Win).
But then again, he was using a 10Gbit link. Might have something to do with it :smiley:

Anyway, you say you guys have ALREADY tried it, and you decided not to use it. And I bet that was for good reasons, some of which you shared in your answer.
That's good enough for me :slight_smile:

Thanks for the reply to my question, and have a nice weekend :slight_smile:


I'd have to see the benchmarks, but it would depend a lot on what is at both ends (Windows, Linux, etc., and which filesystems) as well as the network and storage devices, PCIe / SATA versions, etc.

I find it surprising though since you can usually see that copies go at the maximum speed of the slowest component in the chain, and you obviously can't go any faster than that.

Doing two copies at once may help with high latency networks but that's the opposite of a typical 10Gbit network. Maybe if there were very fast SSDs at both ends, but even then I'm guessing the difference is small, and it'll make other things slower, as well as massively complicate things, including the UI for the user (something Robocopy doesn't have to worry about as it has no UI and just skips/retries files automatically if there's an error). A small % increase for a scenario that is already blazing fast wouldn't be worth those trade-offs, if that's what we're talking about.

(Measuring copy speed is also a lot more complex than it often seems, due to different things reporting speeds or completion times differently, while something may just be written into a buffer and not yet written to disk or even sent over the network yet, depending on the program doing the copying and how it's being measured. And if lots of small files are being copied, differences often really come down to which metadata different programs/configurations copy for each file, not to details of how the file bodies are copied.)

You can always run two copy jobs in parallel, anyway.

(It's also possible that "multithreaded copying" doesn't mean two files at once but offloading the read/write processes for each file to separate threads or using overlapped I/O, which is quite a different thing. Opus can do that already, to a degree, and we'll be adding some more options in that area in the future.)
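To make the distinction concrete, here is a minimal sketch (in Python, purely illustrative; not how Opus or Robocopy is implemented) of "multithreaded copying" in the several-files-at-once sense: one worker thread per file, each performing an ordinary blocking copy. This is quite different from overlapped I/O, where the reads and writes *within* a single file transfer are what get overlapped.

```python
# Illustration only: copy several files at once, one worker thread per file.
# Whether this is any faster depends entirely on the storage, filesystem and
# network involved, as discussed in the thread above.
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def copy_tree_parallel(src: Path, dst: Path, workers: int = 4) -> int:
    """Copy every file under src to the same relative path under dst,
    dispatching the per-file copies to a small thread pool.
    Returns the number of files copied."""
    files = [p for p in src.rglob("*") if p.is_file()]
    dst.mkdir(parents=True, exist_ok=True)

    def copy_one(p: Path) -> None:
        target = dst / p.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(p, target)  # copies file data plus timestamps

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator, so we wait for completion and
        # re-raise any exception a worker hit.
        list(pool.map(copy_one, files))
    return len(files)
```

Note that each worker here still issues plain blocking reads and writes; offloading or overlapping the I/O within one file transfer is a separate technique.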


Yes, it is about distributing file copying across multiple threads on the CPU :slight_smile:
Which threads did you think I was talking about? :slight_smile:

There are several ways you can use multiple threads for file copying. I think we may be talking at different levels of detail here. :slight_smile:

I guess, but I'm too stupid to be able to provide you with the correct answer in sufficient detail :smiley:

Anyway, you gave me an answer: you HAVE considered it earlier, but you concluded there was no point.
That's what I hoped for anyway, for you guys to consider it :slight_smile:
So thank you for a good answer :slight_smile:

I've googled it and found more info about it here.
Is it only a command-line tool, or does it have a GUI?
I'm already using TeraCopy from Code Sector with DO.

Robocopy is a command line tool.

You can make buttons/hotkeys in Opus which run it on the current folders. But I would experiment with it in a command prompt first to verify it actually is faster for your scenario, as it's unlikely to be.
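As a sketch of what such a button (or a command-prompt test) might run, assuming Opus's standard `{sourcepath$}` and `{destpath$}` command codes for the current source and destination folders (check the exact codes and switches for your setup):

```shell
:: Hypothetical example: copy the source tree to the destination using
:: 8 copy threads, with 2 retries and a 5-second wait between retries.
robocopy "{sourcepath$}" "{destpath$}" /E /MT:8 /R:2 /W:5
```

Here `/E` copies subdirectories including empty ones, `/MT:8` enables eight copy threads (Robocopy defaults to 8 when `/MT` is given without a number), and `/R`/`/W` control retry behaviour.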

It can do lots of other useful things as well, though. It's the kind of tool that is occasionally useful for doing esoteric things (like duplicating junctions instead of copying their contents).


Multi-threaded file copying doesn't make a lot of sense if you are hitting the max read / write speed of either source or destination with just a single operation. That said, it would make sense to multi-thread in cases where you have operations that are working on different drives altogether or in cases where the source drive can support a read speed much higher than the destination's write speed, like copying files from an SSD to 2 different HDDs.


You'd just start two copies for that, which would already run in parallel.

Multithreaded copying: better to have it and need it now and then than to need it and not have it.
Wouldn't it be nice if Linus started using DOpus because of its awesome multithreaded copy abilities (among many other awesome features)? In any case, he should get a free copy of DOpus for review. Not many people know about this gem.


If Linus wants a free copy we'd certainly give him one.

I was an avid Robocopy user about ten years ago, when I was in Europe and needed to copy LOTS of small files from directories on a Windows server in the USA. The speed gains came from the overhead of initiating each single file copy: while one transfer was being set up, the other ongoing copies were using the bandwidth, so there was ALWAYS a file transfer in flight, even if the target machine was just flushing its disk cache. The same happens today with Rclone when I need to copy many small(ish) files from the cloud to a target machine. Doing those in parallel helps because of the small overhead of each single transfer. But there is absolutely no gain at all in LOCAL copies, unless you have mechanical disks with no internal caching and an OS that is really good at caching/reordering writes to the physical structure of the disk to minimise seeking.
The other thing that made Robocopy great was unlimited retrying, and continuing to copy even after one file failed. That last part Opus does really well with "Unattended Copy", which I use every second day.
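The effect described above can be shown with a toy model (Python; the 50 ms sleep is a hypothetical stand-in for per-file latency such as network round-trips, not real I/O). When a fixed per-transfer overhead dominates, running transfers in parallel overlaps the waiting; when the overhead is near zero, as in local copies, parallelism buys nothing.

```python
# Toy model: per-file overhead dominates small-file transfers, so
# overlapping the transfers hides most of the waiting.
import time
from concurrent.futures import ThreadPoolExecutor

PER_FILE_OVERHEAD = 0.05  # seconds; simulated setup/latency cost per file

def fake_transfer(name: str) -> str:
    time.sleep(PER_FILE_OVERHEAD)  # stand-in for the round-trip, not real I/O
    return name

def run_sequential(files) -> float:
    """Transfer files one after another; returns elapsed seconds."""
    start = time.perf_counter()
    for f in files:
        fake_transfer(f)
    return time.perf_counter() - start

def run_parallel(files, workers: int = 8) -> float:
    """Transfer files concurrently on a thread pool; returns elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_transfer, files))
    return time.perf_counter() - start
```

With 8 files, the sequential run pays the overhead 8 times over (~0.4 s here), while the parallel run overlaps the waits (~0.05 s). In a local copy the equivalent "overhead" is tiny and the disk itself is the bottleneck, which matches the observation above that parallelism gains nothing there.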

So, @docbobo, please: do you recommend Robocopy, or should I stick to Directory Opus?

I haven't used Robocopy in years. As I said, I do everything with Opus, and large transfers to/from cloud storage (OneDrive, Amazon) with Rclone.

If you're already transferring files at the limit of the hardware involved then there's no reason to change anything.

If not, try them and see. It will depend on your setup, and you're the only person who can measure how different methods work on your setup.

In general, though, the whole thing is likely not worth worrying about, unless you have an unusual setup.

And when measuring things, remember to repeat each test several times, as caching and other factors can have a huge impact on each run's performance.

You will likely spend more time benchmarking this kind of thing than you will ever save from any particular speed-up.


Uh, I don't think that's the way it works...
Usually, companies send him stuff to test, because it's good publicity to let his 14,000,000 followers know about your product, especially if Linus likes it :slight_smile:

But yes, one can also sit passively and wait :slight_smile: That is not considered a wise business strategy, though.
Anyway, I just thought I should mention it. :slight_smile:

Linus tends to cover hardware, at least from what I've seen. This is a little off-topic, though.


The worst you get is a NO.
The best you get is increased sales... if that is interesting.
However, there is a 100% chance that nothing will happen if one doesn't try :wink:
I shouldn't have to tell you this :slight_smile: