Progress bar when copying to NFS

Hi,

I have used DO for some years, and it works fine. Recently, I built a NAS for my storage: FreeBSD 13 with ZFS shared by NFSv3.

At the beginning, I noticed that the NFS client on my Windows 10 machine writes slowly (15-20 MB/s). Later I found a way to improve it: adding a SLOG device to my ZFS pool. I did that, setting up an Intel SSD as the SLOG device for the ZIL. The write speed became faster (45-50 MB/s, tested with Windows Explorer).

Here is the problem.

I mounted the NFS share successfully on Windows.

First, when I copy 100 GB of data to the NFS mount point in Windows Explorer, I get a continuous 45-50 MB/s. That is the correct speed.

Second, when I copy 100 GB of data to the NFS mount point in DO by clicking the Copy button, I get 100-110 MB/s. However, Windows Task Manager shows that my outgoing network speed is still 45-50 MB/s. The progress bar stops after advancing some distance, and the speed drops to 0. Task Manager shows the disk read speed dropping to 0 as well. Based on the outgoing network speed in Task Manager, my conclusion is that the copy progress bar is waiting for the NFS client to transmit data that the progress bar has already counted as completed. Later, the speed returns to 100-110 MB/s, then drops back to 0 while it waits for the data to be transmitted.

Finally, when I copy 100 GB of data to the NFS mount point in Total Commander, I get a continuous 45-50 MB/s. That is the correct speed.

What I want to see: the window shows a continuous speed when I press the Copy button to copy files to the NFS mount point, and the copy window disappears when the progress bar reaches the end, at the same moment the file transfer actually completes. Just as if I were copying data to another local disk.

What should I do? Do you need more information?

The operating system, network, or filesystem must be buffering things up, reporting the data has been written while it's still in the buffer, and then making you (and Opus) wait until it has really finished sending/writing.

It's not something Opus itself is doing, unless you've set the buffer sizes really large within Opus or something like that. (You'd know if you had as those settings are quite out of the way.)

Other software may use different APIs to copy the data, but that doesn't make the API Opus is using wrong, even if some part of the setup seems to be doing extreme buffering in this case. (Although what you describe is quite unusual and I've never seen something do that. But NFS itself is also unusual to use from Windows, so maybe I just haven't seen the same combination of components before.)

FWIW, we'll be providing an option to use other copy APIs with Opus in the future. Some things are only tested with the one API which File Explorer uses and don't behave as well with others, like this. But it's really a flaw in those things, not in Opus.

One more thing.

If I remove the SLOG device from my FreeBSD machine, the progress bar behaves normally. But the write speed is horrible.

Opus has no knowledge at all of what's on the remote machine, and won't do anything differently on its side whether it is there or not, so that's further evidence that the strange buffering is not something Opus is doing.

Yes, that is what I meant. TC may use other copy APIs.

an option to use other copy APIs with Opus in the future.

And when will DO provide this option? :slight_smile:

In the future. :slight_smile:

It's the same issue I've had forever at 10G network speeds. I have to use Windows Explorer or TC to copy large amounts of data (50 GB) in a decent amount of time. Hope this copy problem gets resolved soon.

The root post suggests total copy time/speed is the same and it's just the progress dialog that's misleading.

(It sounds like the bottleneck is the HDD at the other end, not the network. The SSD SLOG in the server provides a faster intermediate device at the other end which the writes can go to before being written to the HDD. That results in data being sent to the server as fast as the network and SSD device can take it, and then a delay as the server commits that data to the actual HDD. The only piece that doesn't fit for me is task manager showing the network speed is still 45-50 MB/s, but it may just be how it averages out, and no network tech I know of is limited to that speed, which is also very slow for a HDD for that matter. In fact, the question I'd be asking is why everything isn't 100+MB/s all the time, since the network and HDD should be able to achieve that.)

I am sure the outgoing network speed shown in Task Manager is not an average, because my Mikrotik router also shows the network port of my Windows 10 machine transmitting data at 45-50 MB/s.

(And 45-50 MB/s is the 4K write speed of my Intel SSD.)

SLOG is a log device for a ZFS pool.

First, the incoming data is held in RAM. Then the data is written to the SLOG device, and the system reports the write operation as completed.

Finally, about every 5 s, the data in RAM is written to disk.

Unless there is a special case, the SLOG is only written to and not read.

The 5 s interval means that the SLOG only needs to store a small amount of data:

When the dual Gigabit network card transmits at full speed, the data in the SLOG is only 1000 Mb/s * 5 s * 2 = 1 GB at most.

It's odd since something must be accepting the data at 100 MB/s or Opus wouldn't be able to write to the file handle at that speed.

Some part of the chain must be able to take the data at that speed and then buffer it up, waiting for the slower parts to catch up.

It won't be Opus doing that buffering, unless the copy buffer sizes in Opus have been increased really high.

I don't know if this data is helpful to you. :slight_smile:

It's an interesting mystery, but we can't do anything really* since the issue isn't on the Opus side of things. :slight_smile: Opus just reports what's happening.

(*Other than provide a way to use the same API that File Explorer uses, which apparently hides whatever is going on, maybe by smoothing out the numbers over a wider time window, or doing something which disables the buffering but wouldn't affect the actual overall speed.)

This reminds me that ZFS uses synchronous write mode by default. However, if I change the synchronous write mode to asynchronous, everything is normal without SLOG.

But asynchronous I/O... maybe I need to buy a UPS to make that safe.

Thanks a lot :slight_smile:

Correction to the calculation:

When the dual Gigabit network card transmits at full speed, the data in the SLOG is at most 1000 Mbit/s * 5 s * 2 = 10 Gbit = 10/8 GB = 1.25 GB.
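
The bits-to-bytes conversion is easy to slip on, so here is a minimal Python sketch that recomputes the worst case from the figures in the post (two saturated Gigabit links, a flush roughly every 5 s). The function and constant names are made up for illustration, and decimal units are assumed throughout.

```python
# Worst-case data accumulated in the SLOG between two flushes to disk.
# Assumptions taken from the post: two 1 Gbit/s links at full speed,
# and a flush about every 5 seconds. Decimal units (1 GB = 1000 MB).
LINK_MBIT_PER_S = 1000
LINKS = 2
FLUSH_INTERVAL_S = 5


def slog_worst_case_gb(link_mbit_per_s, links, interval_s):
    """Return the amount of data (in GB) received during one interval."""
    total_mbit = link_mbit_per_s * links * interval_s
    total_mbyte = total_mbit / 8   # 8 bits per byte
    return total_mbyte / 1000      # MB -> GB


worst_case_gb = slog_worst_case_gb(LINK_MBIT_PER_S, LINKS, FLUSH_INTERVAL_S)
print(f"worst case: {worst_case_gb} GB")
```

Either way the amount is small compared with the capacity of a typical SSD, which is the point being made.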

The speed calculation is based on how quickly WriteFile accepts the data, averaged over a short time, since that is the only information Opus has, and the only involvement Opus has in transmitting the data.
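
To make that concrete, here is a minimal sketch in Python. It is not Opus's code: Python's unbuffered `write()` stands in for WriteFile, and `timed_copy` and `chunk_size` are hypothetical names. It shows that a copy tool can only time how quickly the write call accepts each chunk, which is not necessarily when the data reaches the disk or the server.

```python
import os
import tempfile
import time


def timed_copy(data: bytes, dest_path: str, chunk_size: int = 64 * 1024):
    """Write data to dest_path in chunks; return (bytes_written, seconds)."""
    written = 0
    view = memoryview(data)
    start = time.perf_counter()
    # buffering=0 disables Python-side buffering, so each write() goes
    # straight to the operating system.
    with open(dest_path, "wb", buffering=0) as out:
        while written < len(data):
            # write() returns once the OS has accepted the chunk. Anything
            # buffered further down the chain is invisible from here.
            written += out.write(view[written:written + chunk_size])
    elapsed = time.perf_counter() - start
    return written, elapsed


payload = os.urandom(1024 * 1024)  # 1 MiB of test data
with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "copy_test.bin")
    written, elapsed = timed_copy(payload, dest)
    size_on_disk = os.path.getsize(dest)

print(f"write() accepted {written} bytes in {elapsed:.4f} s")
```

Pointing `dest` at a network mount instead of a temporary directory would measure the same thing a progress dialog sees: the acceptance rate of the next component in the chain.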

From what you describe, the speed is high while the server at the other end accepts data into a fast buffer. The server then blocks and the speed becomes zero while it deals with that buffer. It's zero because WriteFile won't accept any more data during that time. So what Opus is reporting is correct.
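
As a rough illustration of that pattern, here is a toy Python model. Everything in it is an assumption made for the sketch, not a measurement of the real setup: the receiver accepts data at `fast_rate` until a buffer of `buffer_size` is full, refuses further writes until the buffer has emptied, and drains at a constant `drain_rate` the whole time.

```python
def simulate(fast_rate, drain_rate, buffer_size, seconds):
    """Return the amount of data accepted from the sender in each second.

    Hypothetical model: writes are accepted at fast_rate until the buffer
    is full, then blocked until the buffer has fully drained.
    """
    buffered = 0
    stalled = False
    accepted_per_second = []
    for _ in range(seconds):
        if stalled:
            accepted = 0
        else:
            # Never accept more than fits once this second's drain is counted.
            accepted = min(fast_rate, buffer_size - buffered + drain_rate)
        buffered = max(0, buffered + accepted - drain_rate)
        accepted_per_second.append(accepted)
        if buffered >= buffer_size:
            stalled = True       # buffer full: block the sender
        elif buffered == 0:
            stalled = False      # buffer empty: accept writes again
    return accepted_per_second


# Illustrative numbers only: 100 MB/s in, 50 MB/s out, 500 MB of buffer.
trace = simulate(fast_rate=100, drain_rate=50, buffer_size=500, seconds=40)
average = sum(trace) / len(trace)
print(trace)
print(f"average accepted rate: {average} MB/s")
```

With these made-up numbers the sender sees alternating bursts and stalls, while the long-run average equals the drain rate. That is the same shape as the reported behaviour, though the real buffer sizes and rates are unknown.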

If File Explorer averages things out over a much longer time, or doesn't wait for the file to be fully flushed before moving on, or does something to prevent the buffering from occurring, then it might report speeds differently. But what Opus is reporting is what's really happening, at least from Opus's perspective. Opus is reporting how quickly it can write the data to the next component in the chain at the current time.
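
The averaging point can be sketched in a few lines of Python. This is only an illustration: the trace is invented (10 s at 100 MB/s followed by 10 s stalled, repeated), the window lengths are arbitrary, and nothing here is known about how File Explorer or Opus actually smooths its numbers.

```python
from collections import deque


def windowed_average(samples, window):
    """Moving average over the last `window` per-second samples."""
    recent = deque(maxlen=window)
    averages = []
    for sample in samples:
        recent.append(sample)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical bursty trace in MB/s: 10 s of writes, 10 s stalled, 3 times.
samples = ([100] * 10 + [0] * 10) * 3

short_avg = windowed_average(samples, window=2)    # reacts quickly
long_avg = windowed_average(samples, window=20)    # spans a full cycle

print("short window min/max:", min(short_avg), max(short_avg))
print("long window min/max (last 20 s):", min(long_avg[-20:]), max(long_avg[-20:]))
```

The same underlying transfer looks bursty through the short window and steady through the long one, even though the total copy time is identical.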