Slow copy speed on fast SSD setups

Short: I've been moving files from an SSD RAID setup to a Samsung 950 M.2 NVMe drive. From a hardware perspective, the copy speed should be at least 200MB/sec, but DOpus is maxing out at 40MB/sec (and often half that). I believe the main cause is that DOpus copy is designed with old spinning disks in mind and does not scale properly to support SSDs.

Long:

My configuration consists of two 240GB Samsung 850s in RAID 0 and a new 500GB Samsung 950 M.2 NVMe drive. The 850s max out around 1GB/sec reading, but with 4K blocks that drops down to about 300MB/sec. The 950 maxes out at 2.5GB/sec read and 1.5GB/sec write (verified with speed tests on my system, so it's not just theoretical specs.) The files I've been moving were all pretty small, so they're all 4K block moves, with the speed dropping down to 20-25 MB/sec for the small files. I'm pretty certain that if DOpus could support multi-threaded copies (i.e. copy 10 files at the same time), this speed would go up significantly. Unlike spinning disks which would seek themselves to death, the SSDs can handle high IOPS loads and should come into their own as the IOPS increases. (Especially NVMe drives, which have much better queue handling than anything SATA based.)

Alas, most of the files have been copied and it doesn't matter much to me, but as SSDs/NVMe are more and more replacing spinning disks, it would be nice if DOpus could update copy/move to take advantage of this. (Unless I missed some config which already supports this.)

The speed measurement is fairly meaningless for small files because most of the overhead is in creating the files, setting timestamps, and so on. Writing the actual file data is negligible compared to creating the file record itself, if the file is only 4KB.

So nothing described so far sounds like a problem, if it's just that a number on the screen was lower than expected, due to copying small files.

I have just migrated all my files to a bigger hard drive. Some ~12MB folders with thousands of .html, .ini or similar tiny files took me longer to copy over than 40 gigs of solid movie videos.

Yep, Windows/NTFS is very slow at creating files compared to writing data into files.
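
For anyone who wants to see the per-file overhead for themselves, here is a minimal Python sketch. It writes the same number of bytes once as many 4KB files and once as a single file, and times both. The function names (`write_files`, `compare`) are made up for illustration, it runs in temp folders on whatever drive holds your temp directory, and it has nothing to do with how Opus or Explorer copy internally.

```python
import os
import tempfile
import time


def write_files(directory, count, size):
    """Create `count` files of `size` bytes each; return elapsed seconds."""
    payload = b"\0" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:05d}.bin")
        with open(path, "wb") as fh:
            fh.write(payload)
    return time.perf_counter() - start


def compare(total_bytes=8 * 1024 * 1024, small_size=4096):
    """Write the same byte total as many small files and as one large file."""
    count = total_bytes // small_size
    with tempfile.TemporaryDirectory() as small_dir:
        with tempfile.TemporaryDirectory() as big_dir:
            seconds_small = write_files(small_dir, count, small_size)
            seconds_big = write_files(big_dir, 1, count * small_size)
            # Measure what actually landed on disk before the folders vanish.
            bytes_small = sum(e.stat().st_size for e in os.scandir(small_dir))
            bytes_big = sum(e.stat().st_size for e in os.scandir(big_dir))
    return {
        "files": count,
        "bytes_small": bytes_small,
        "bytes_big": bytes_big,
        "seconds_small": seconds_small,
        "seconds_big": seconds_big,
    }
```

Calling `compare()` and printing the two `seconds_*` values gives a rough feel for how much of the time goes into creating files rather than writing data. The writes are not flushed to disk here, so treat the numbers as indicative only.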

[quote="leo"]The speed measurement is fairly meaningless for small files because most of the overhead is in creating the files, setting timestamps, and so on. Writing the actual file data is negligible compared to creating the file record itself, if the file is only 4KB.

So nothing described so far sounds like a problem, if it's just that a number on the screen was lower than expected, due to copying small files.[/quote]

Hence the suggestion to support multi-threaded copy. The performance tanks because of the OS / FS overhead. With spinning disks, running 10 copies in parallel to get around that bottleneck wasn't really possible due to the drive seek, but when dealing with SATA/NVMe SSDs it makes a lot of sense.

Have you proven that with code and benchmarks or is it a theory/assumption?

I've observed that other disk activities seem to have no impact on the slow copy. I'll take a stab at testing/"proving" this tonight. I'm thinking that if I turn off queueing and copy the same set of small files to multiple locations on the NVMe drive at the same time, I'll get a pretty good idea of how that scales.
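
A test along those lines could look roughly like the Python sketch below: copy the same folder of small files with 1 worker thread and then with 10, and compare the wall-clock times. This is only an assumption about how to measure it, not anything Opus does; `make_small_files` and `copy_files_parallel` are hypothetical helper names, and the result can be skewed by the filesystem cache unless each run uses a fresh destination and the source is read from a cold cache.

```python
import os
import shutil
import time
from concurrent.futures import ThreadPoolExecutor


def make_small_files(directory, count, size):
    """Create `count` test files of `size` bytes each in `directory`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        with open(os.path.join(directory, f"small{i:05d}.bin"), "wb") as fh:
            fh.write(bytes([i % 256]) * size)


def copy_files_parallel(src_dir, dst_dir, workers):
    """Copy the files directly inside src_dir using `workers` threads.

    Returns (number_of_files, elapsed_seconds). Subfolders are ignored.
    """
    os.makedirs(dst_dir, exist_ok=True)
    names = [entry.name for entry in os.scandir(src_dir) if entry.is_file()]

    def copy_one(name):
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every copy and re-raises any worker exception.
        list(pool.map(copy_one, names))
    return len(names), time.perf_counter() - start
```

For example, `copy_files_parallel(src, dst_a, 1)` followed by `copy_files_parallel(src, dst_b, 10)` gives two timings to compare. If the 10-thread run is not clearly faster, the bottleneck is probably not something extra threads can work around.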

Hi,
Regarding slow copy speeds, I was a bit disappointed recently while trying to use Dopus for some quite large transfers across a 10Gb/s network from a local SCSI disk system.

Dopus 12.3 was totally outperformed by Windows Explorer on Win 2k8r2 srv (default install).
Windows Explorer achieved a peak of 689MB/s and an average above 580MB/s.
Dopus peaked at just 220MB/s, and the average was so low we didn't bother waiting for it to calculate.

The files are quite large, at around 130GB and up.
The number of files is significant, so Dopus would be preferred for features like good unattended operation and control. :slightly_frowning_face:

What could be the bottleneck? A poorly pre-configured buffer size or something?

--
Corpus

Did you copy the same files in both cases?

Reboot between tests to ensure caching wasn't in play?

Did you time the copies with a watch/clock, or just look at the reported speeds (which might be inconsistent between programs)?

Checked what Task Manager or Resource Monitor show is happening to disk and network activity, since some programs close the progress dialog before the data has really finished being written to disk?

Checked with antivirus turned off (at both ends, if copying over a network) in case it's scanning what one program does more than the other?

Under Preferences / Miscellaneous / Advanced, what are the current settings for copy buffer and for non-buffered I/O threshold? You could try experimenting with those.

Hi,
Yes same file, several times.
Other files several times.

Both the time estimate in Dopus and the actual copy time seem consistent, and it takes much, much more time with Dopus.

This was a default Dopus install on a plain w2k8 server.
No tweaking was done with either cache or buffer, but at the same time that should not make any difference.

If Windows Explorer can achieve over 600MB/s on this system, why is a default Dopus install limited to 1/3 of that on the same system?
I even tried alternating between Windows Explorer and Dopus to confirm the result was consistent.

--
CorPuS

Show us some graphs, please.

I've only got one very fast SSD, so the test I've done involves getting a large (6.44GB) file fully cached and then seeing how quickly Opus and Explorer can write that file to the fast SSD.

Probably not exactly the same test as you're doing (I'd probably need to buy additional hardware to do that), but the graphs demonstrate the kind of thing you need to look at when comparing copy speed between two programs with any type of test.

Here is the Task Manager graph for copying the file twice in Opus (C is the destination drive; source drive is largely irrelevant as the whole file was in the filesystem cache for all the tests):

Here is the same with Explorer:

Note that Explorer closed its progress dialog and said it was completed where I've added the two red lines on the graph. Explorer closes its progress dialog about half-way through the copy. Explorer is made to look faster than it is. On the other hand, Opus won't close the progress dialog until the data is completely flushed to disk. Opus is made to show you what is really happening, and so when it says something is done it is really done.
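
To illustrate the difference between "the write calls have returned" and "the data is really on disk", here is a small Python sketch that records both moments for one copy. It is not how either Explorer or Opus is implemented; `timed_copy` is a hypothetical name, and `os.fsync` is simply the standard way to ask the OS to flush a file from a script.

```python
import os
import shutil
import time


def timed_copy(src, dst, chunk=1024 * 1024):
    """Copy src to dst and return (seconds_until_written, seconds_until_flushed).

    The first value is when all write calls have returned (data may still sit
    in the OS cache). The second is after os.fsync has asked the OS to push
    the file's data out to the device.
    """
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk)
        fout.flush()  # empty Python's own buffer into the OS
        seconds_written = time.perf_counter() - start
        os.fsync(fout.fileno())  # ask the OS to flush its cache to the device
        seconds_flushed = time.perf_counter() - start
    return seconds_written, seconds_flushed
```

On a large file, a big gap between the two numbers means a lot of data was still in the cache when the writes "finished", which is the kind of effect that makes a stopwatch on a progress dialog misleading.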

You can see the copies take about the same time in both programs, for this particular test.

I think the smaller peaks after the main large ones are not part of the copy, but due to other things reacting to the new file. In either case, they are about the same between the two programs, with a fair amount of variance between tests.

What kind of graphs do you see on your system?

The buffer settings can make a huge difference on some hardware. Some devices and drivers are very sensitive to buffer size, and may only be tested with what Explorer does (which is not documented/guaranteed in general, and not always the same buffer size for all types of devices).

The non-buffered I/O setting can also have a big effect, since it can bypass the filesystem cache. That can speed things up, or slow things down, depending on what is being tested. (It can also make things fail entirely on some devices/filesystems which don't implement non-buffered I/O properly, which is why it's turned off by default.) Another side-effect of non-buffered I/O is that you'll get parallel read/write threads, if copying between separate devices, and much larger buffer sizes if copying between the same device. So it may speed up copying between two very fast devices.

The graphs above were done with the default buffer sizes, for what it's worth. (Not what I usually use myself, FWIW.)


Hi Leo,
Thanks for the thorough reply.
I'll try to test this when the system becomes available (since this is a server, it's not always available for testing, and it has also been migrating data for the last two weeks) :slightly_smiling_face:

--
CorPuS

Hi Leo,

I might get access to do testing on the system either next week or the week after.
Do you have any suggestions as to which cache options/settings I should try to tweak, and where I can find them?

--
CorPuS

Check what the graphs show first with default settings.

Other settings to try, under Preferences / Miscellaneous / Advanced:

  • copy_nonbufferio_threshold = 1 MB (try this on its own first)
  • copy_buffer_size = 64 KB, 512 KB (should be the default), 1 MB, 16 MB (try these with and without the other setting being 1 MB vs default).
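
If you want a feel for how sensitive your hardware is to buffer size outside of any file manager, a sweep like the Python sketch below can help. It is only a loose analogue: the buffer here is the read/write block size used by the script, not the Opus `copy_buffer_size` setting itself, and `copy_with_buffer` and `sweep_buffer_sizes` are hypothetical names. Source reads will come from the filesystem cache after the first pass, so this mostly measures the write side.

```python
import os
import time


def copy_with_buffer(src, dst, buffer_size):
    """Copy src to dst in blocks of `buffer_size` bytes; return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 gives raw file objects, so each read/write maps to one OS call.
    with open(src, "rb", buffering=0) as fin, open(dst, "wb", buffering=0) as fout:
        while True:
            block = fin.read(buffer_size)
            if not block:
                break
            view = memoryview(block)
            while view:  # raw writes may be partial, so loop until done
                written = fout.write(view)
                view = view[written:]
        os.fsync(fout.fileno())  # include the flush to disk in the timing
    return time.perf_counter() - start


def sweep_buffer_sizes(src, dst_dir, sizes):
    """Copy src once per buffer size; return {buffer_size: seconds}."""
    os.makedirs(dst_dir, exist_ok=True)
    results = {}
    for size in sizes:
        dst = os.path.join(dst_dir, f"copy_{size}.bin")
        results[size] = copy_with_buffer(src, dst, size)
    return results
```

For example, `sweep_buffer_sizes(big_file, out_dir, [64 * 1024, 512 * 1024, 1024 * 1024, 16 * 1024 * 1024])` tries the same sizes as the list above. Use a file of several GB and repeat the runs, since single measurements vary a lot.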