Poor read performance with network backend #92

Closed
j-vizcaino opened this issue Mar 15, 2017 · 28 comments

@j-vizcaino

j-vizcaino commented Mar 15, 2017

Use case

  • /tmp/encrypted is an Amazon Cloud Drive folder, mounted read-only using rclone
  • /tmp/clear is the gocryptfs deciphered version of the above.

Copying a 300MB file (ciphered) from /tmp/encrypted, I get around 5-6MBps, which is OK, considering my bandwidth is 10MBps.
Copying the same file from /tmp/clear yields extremely poor speeds (from 100KBps to 1MBps).
Speed is not limited by the CPU.
Tracing read requests in rclone shows read sizes from 4 KB to 128 KB occurring at low frequency (thus low throughput).

Any advice would be greatly appreciated.

Environment

  • Debian Stretch, x86_64
  • gocryptfs 1.2.1
  • Ciphered volume has the following features:
    • version: 2
    • flags: GCMIV128, DirIV, EMENames, LongNames, AESSIV
  • rclone 1.35, mount max-read-ahead set to 16MB
j-vizcaino changed the title from "Poor read performance with slow backend" to "Poor read performance with network backend" on Mar 15, 2017
@j-vizcaino
Author

j-vizcaino commented Mar 16, 2017

After some extra tests, it seems that setting max-read-ahead for rclone has no effect, since reads are 128 KB max.
Swapped rclone mount with acd_cli and performance is almost the same:

  • reading directly from acd_cli mount gives around 10 MBps
  • reading from deciphered gocryptfs mount gives bursts at 2 MBps with stalls of more than 3 seconds :(

How does gocryptfs handle fadvise? Does it propagate these flags to the underlying FS when opening the encrypted file?

@rfjakob
Owner

rfjakob commented Mar 16, 2017

Looks like gocryptfs is doing something that's pretty slow on Amazon Cloud Drive. I don't really know how ACD works, but can you give me access to an ACD folder filled with garbage so I can examine this?

@rfjakob
Owner

rfjakob commented Mar 16, 2017

PS: fadvise: no, fadvise is completely ignored.

@j-vizcaino
Author

j-vizcaino commented Mar 16, 2017

@rfjakob Sadly, I cannot delegate access to part of an ACD folder, as the mount is all-or-nothing. IIRC, ACD has a free plan though, and it's super easy to set up.
Regarding fadvise, please disregard my comment: digging deeper, I found out that the syscall is handled by the kernel's page cache (so no FS interaction or support is needed).

@j-vizcaino
Author

j-vizcaino commented Mar 16, 2017

@rfjakob I think I found what causes the problem: reads issued by gocryptfs are not sequential. When running rclone mount with debug information, I see a lot of seek calls. It seems gocryptfs is "jumping around" in the file, so read-ahead cannot work properly and throughput is badly affected by the read latency.

EDIT: the rclone code is causing the excessive seek calls. Maybe this is related to the part of the code where reads must fill the page cache.
Considering that gocryptfs adds 32 bytes for every 4 KB (the page size), the reads are misaligned; see the sketch below.
@ncw Could you shed some light on this?
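To make the misalignment concrete, here is roughly how a plaintext offset maps to an offset in the encrypted backing file, assuming the on-disk layout described in the gocryptfs documentation (an 18-byte file header plus 32 bytes of overhead per 4096-byte plaintext block). The helper name is made up for illustration and is not a gocryptfs API:

package main

import "fmt"

const (
    headerLen     = 18   // gocryptfs file header (2-byte version + 16-byte file ID), per its docs
    plainBlock    = 4096 // plaintext block size
    blockOverhead = 32   // per-block nonce + GCM tag
)

// cipherOffset maps a plaintext offset to the corresponding offset in the
// encrypted file (hypothetical helper, for illustration only).
func cipherOffset(plainOff int64) int64 {
    block := plainOff / plainBlock
    inBlock := plainOff % plainBlock
    return headerLen + block*(plainBlock+blockOverhead) + inBlock
}

func main() {
    // A page-aligned 4096-byte read at plaintext offset 8192 starts at
    // ciphertext offset 8274, so page-cache reads do not line up with
    // block boundaries on the backing store.
    fmt.Println(cipherOffset(8192)) // prints 8274 under the assumed layout
}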

@rfjakob
Owner

rfjakob commented Mar 16, 2017 via email

@j-vizcaino
Author

2017/03/16 11:18:24 aaa/gocryptfs.diriv: File.Open
2017/03/16 11:18:24 aaa/gocryptfs.diriv: ReadFileHandle.Read size 4096 offset 0
2017/03/16 11:18:24 aaa/gocryptfs.diriv: ReadFileHandle.Read OK
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read OK
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read size 4096 offset 8192
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.seek from 135168 to 8192
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read OK
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read size 4096 offset 135168
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.seek from 12288 to 135168
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read OK
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.Read size 4096 offset 12288
2017/03/16 11:18:25 aaa/bbb/ccc: ReadFileHandle.seek from 139264 to 12288
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.Read OK
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.Read size 4096 offset 139264
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.seek from 16384 to 139264
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.Read OK
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.Read size 4096 offset 16384
2017/03/16 11:18:26 aaa/bbb/ccc: ReadFileHandle.seek from 143360 to 16384
2017/03/16 11:18:27 aaa/bbb/ccc: ReadFileHandle.Read OK

@rfjakob
Owner

rfjakob commented Mar 16, 2017 via email

@j-vizcaino
Author

j-vizcaino commented Mar 16, 2017 via email

@rfjakob
Owner

rfjakob commented Mar 16, 2017

The other process may be the kernel performing readahead on its own. Can you try mounting gocryptfs with

-ko max_readahead=0

@j-vizcaino
Author

No luck there :'(

Adding -ko mount options: [max_readahead=0]
/bin/fusermount: mount failed: Invalid argument
Mount failed: fusermount exited with code 256

@rfjakob
Owner

rfjakob commented Mar 16, 2017 via email

@ncw

ncw commented Mar 16, 2017

I know what this problem is...

It is almost certainly this issue: hanwen/go-fuse#140 (reported by me!)

I was unable to fix this with go-fuse, but https://github.com/bazil/fuse doesn't seem to have this problem, so I never switched over to go-fuse in rclone.

I tried to fix go-fuse but I didn't succeed and I couldn't convince the author that it was anything other than expected behaviour.

@j-vizcaino
Author

@ncw Thank you for your expertise! It seems most likely related to go-fuse delivering read requests out-of-order.

@rfjakob What do you think about moving from go-fuse → bazil.org/fuse? If it is not too advanced a task (I'm a Go noob), I can give it a try.

@rfjakob
Owner

rfjakob commented Mar 16, 2017

I tested reading a 10MB file from tmpfs and graphed the read offsets. singleReader, as mentioned by @ncw, does not seem to be enough to get the accesses to serialize.

[image: diag — graph of read offsets]

@j-vizcaino No, switching to bazil-fuse is not an option. gocryptfs uses lots of the infrastructure that go-fuse provides.

@rfjakob
Owner

rfjakob commented Mar 16, 2017

However, looking at this

$ strace -e read cat zero 
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \5\2\0\0\0\0\0"..., 832) = 832
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072

we see that "cat" never reads more than 131072 bytes at a time, which is the maximum FUSE can handle in a single request. I initially thought the kernel splits bigger requests into 128KB chunks and sends all requests in parallel. But that does not seem to be what is happening.

Is it kernel readahead?

@j-vizcaino
Author

That's a real shame; rclone and gocryptfs made a perfect match for a ciphered ACD volume :(
If the problem is indeed related to go-fuse's "funky" read reordering, then no network/cloud mount point will ever work decently with gocryptfs.

switching to bazil-fuse is not an option. gocryptfs uses lots of the infrastructure that go-fuse provides.

@rfjakob Can you give a bit more information about that?

@rfjakob
Owner

rfjakob commented Mar 17, 2017

It does not look like go-fuse's fault, and I think the issue is fixable through a number of ways.

As for more information about go-fuse vs bazil-fuse, I'll quote an email to another developer asking for advice:

Choosing a FUSE lib for gocryptfs has not been a decision, it has been a journey ;)

But I'll cut the long story short: Use go-fuse. It is a pure Go reimplementation as well (it does not use libfuse).
It has a path-based API (pathfs). This is pretty big because it is hard to get right. Lots of testing (https://github.com/rfjakob/fuse-xfstests) and fixing went into go-fuse during the development of gocryptfs. By now it is pretty solid.

bazil-fuse leaves the path mapping to the app. There is cluefs (https://github.com/airnandez/cluefs), which is a loopback filesystem built on top of bazil-fuse, so this may be worth a look. I don't know if it has seen much testing though, because I was the one to notice this issue: airnandez/cluefs#3
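For context, the path-based API mentioned above looks roughly like this: a minimal loopback sketch using the go-fuse v1-era packages (the paths are examples, and the exact signatures should be checked against the go-fuse version in use):

package main

import (
    "log"

    "github.com/hanwen/go-fuse/fuse/nodefs"
    "github.com/hanwen/go-fuse/fuse/pathfs"
)

func main() {
    // Mirror /srv/data at /mnt/loop; pathfs handles the path bookkeeping
    // (lookups, renames) so the filesystem only deals in path strings.
    loopback := pathfs.NewLoopbackFileSystem("/srv/data")
    nfs := pathfs.NewPathNodeFs(loopback, nil)
    server, _, err := nodefs.MountRoot("/mnt/loop", nfs.Root(), nil)
    if err != nil {
        log.Fatalf("mount failed: %v", err)
    }
    server.Serve()
}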

@j-vizcaino
Author

Thank you for the explanation: it now makes perfect sense.
Feel free to ask if I can help you testing this: gocryptfs is a nice tool and I would really like to be able to use it with my current setup.

@ncw

ncw commented Mar 17, 2017

It does not look like go-fuse's fault, and I think the issue is fixable through a number of ways.

I would be interested in your thoughts here so I can apply them to rclone!

bazil-fuse leaves the path mapping to the app.

Assuming I'm understanding the terminology correctly, bazil-fuse has a fuse/fs module which does path mapping.

I've implemented both FUSE systems for rclone, both using the higher-level path interface. I started with bazil-fuse, which works well, but I thought I would try go-fuse as it might offer higher performance; alas, because of the issue above, development on it stalled.

@rfjakob
Owner

rfjakob commented Mar 17, 2017

I think the problem with bazil-fuse's path mapping is that it does not track renames. This means just doing

touch foo
mv foo bar
stat bar

fails. Unless you implement rename tracking yourself.

At least it was like this when I reported airnandez/cluefs#3. Cluefs has now implemented its own rename tracking.
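To illustrate what rename tracking involves when the library does not do it for you, here is a minimal sketch; the types are mine for illustration, not the bazil-fuse or cluefs API:

package pathtrack

import (
    "strings"
    "sync"
)

// pathTracker remembers the current backing path for every node, so that
// operations issued after a rename still reach the right file.
type pathTracker struct {
    mu    sync.Mutex
    paths map[uint64]string // node ID -> current relative path
}

func newPathTracker() *pathTracker {
    return &pathTracker{paths: make(map[uint64]string)}
}

func (t *pathTracker) set(id uint64, path string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    t.paths[id] = path
}

func (t *pathTracker) get(id uint64) (string, bool) {
    t.mu.Lock()
    defer t.mu.Unlock()
    p, ok := t.paths[id]
    return p, ok
}

// rename updates the renamed node and every node below it. This is the
// bookkeeping that go-fuse's pathfs layer provides out of the box.
func (t *pathTracker) rename(oldPath, newPath string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    for id, p := range t.paths {
        switch {
        case p == oldPath:
            t.paths[id] = newPath
        case strings.HasPrefix(p, oldPath+"/"):
            t.paths[id] = newPath + strings.TrimPrefix(p, oldPath)
        }
    }
}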

rfjakob added a commit that referenced this issue Mar 18, 2017
Due to kernel readahead, we usually get multiple read requests
at the same time. These get submitted to the backing storage in
random order, which is a problem if seeking is very expensive.

Details: #92
@rfjakob
Owner

rfjakob commented Mar 18, 2017

I think I got it. @j-vizcaino can you pull the "hkdf" branch, build, and mount with the "-serialize_reads" option?

-serialize_reads:
[image: diag_serialize_reads — graph of read offsets with -serialize_reads]

@rfjakob
Owner

rfjakob commented Mar 18, 2017

I managed to get the ACD 3-month trial going.

gocryptfs default:

10485760 bytes (10 MB, 10 MiB) copied, 78,4737 s, 134 kB/s

gocryptfs with "-serialize_reads":

10485760 bytes (10 MB, 10 MiB) copied, 3,60109 s, 2,9 MB/s

directly from the rclone mount:

10567698 bytes (11 MB, 10 MiB) copied, 3,60769 s, 2,9 MB/s

PS: I have just merged the hkdf branch into master. So please just pull and test master.

@j-vizcaino
Author

j-vizcaino commented Mar 18, 2017 via email

@rfjakob
Owner

rfjakob commented Mar 18, 2017

@ncw If you want to take a look, the logic is in https://github.com/rfjakob/gocryptfs/blob/master/internal/serialize_reads/sr.go.
When I get a read request, I call serialize_reads.Wait(). This blocks for at least 500us to wait for other requests. If we have more than one, the requests are unblocked one by one, lowest read offset first.

I have tried several simpler schemes as well, like delaying out-of-order requests for a few milliseconds, but this did not get them sorted in all cases, as the kernel submits read-ahead reads pretty aggressively.

Oh and I also tried disabling read-ahead. This works and keeps everything nicely ordered and single-threaded, but it cuts down the read request sizes to 4KB (1 page).
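For readers following along, here is a stripped-down sketch of that scheme; it is not the actual sr.go code, and for simplicity it keys waiters by offset (i.e. it assumes one file is being read at a time):

package serializereads

import (
    "sync"
    "time"
)

var (
    mu      sync.Mutex
    cond    = sync.NewCond(&mu)
    pending = map[int64]bool{} // offsets of reads waiting for their turn
    running bool               // whether a read is currently executing
)

// Wait parks the read at "offset" until it is the lowest pending offset and
// no other read is running. The initial sleep gives the kernel's concurrent
// readahead requests time to arrive so that they can be ordered.
func Wait(offset int64) {
    mu.Lock()
    pending[offset] = true
    mu.Unlock()

    // Block for at least 500us so out-of-order readahead requests pile up.
    time.Sleep(500 * time.Microsecond)

    mu.Lock()
    for running || !isLowest(offset) {
        cond.Wait()
    }
    delete(pending, offset)
    running = true
    mu.Unlock()
}

// Done marks the current read as finished and wakes the waiters so that the
// next-lowest offset can proceed.
func Done() {
    mu.Lock()
    running = false
    mu.Unlock()
    cond.Broadcast()
}

// isLowest reports whether offset is the smallest pending offset.
// The caller must hold mu.
func isLowest(offset int64) bool {
    for o := range pending {
        if o < offset {
            return false
        }
    }
    return true
}

A caller would bracket each read against the backing storage with Wait(offset) and a deferred Done().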

@j-vizcaino
Author

The -serialize_reads option works perfectly!
Maybe this should be documented properly, or even turned on by default if the performance penalty is not noticeable on SSDs or HDDs.
Thanks again.

@rfjakob
Owner

rfjakob commented Mar 19, 2017

I have documented the option in MANPAGE.md.

Unfortunately, the ordering and serialization force gocryptfs to wait quite a bit, so I can't make this the default. On my machine, -serialize_reads limits throughput to 70 MB/s where I usually get 140 MB/s.

@rfjakob rfjakob closed this as completed Mar 19, 2017
@ncw

ncw commented Mar 19, 2017

@rfjakob thanks for writing that up. Very interesting about the read ahead setting. I'll warm up my go-fuse branch and have a go with your logic.

ncw added a commit to rclone/rclone that referenced this issue Mar 21, 2018
ncw added a commit to ncw/rclone-archive that referenced this issue Jan 18, 2020
hanwen pushed a commit to hanwen/go-fuse that referenced this issue Aug 2, 2021
We enable FUSE_CAP_ASYNC_READ per default, which means that
the kernel can (and does) submit multiple concurrent out-of-order
read requests to service userspace reads and kernel readahead.

For some backing storages, like Amazon Cloud Drive, out-of-order
reads are expensive.

gocryptfs has implemented a delay-based workaround with its
`-serialize_reads` flag for this case
(see rfjakob/gocryptfs#92 for details).

Not enabling FUSE_CAP_ASYNC_READ makes the kernel do this for us,
as verified by adding debug output to gocryptfs, so expose it as
a mount flag in MountOptions.

Fixes: #140
Graphs-at: #395
Related: rfjakob/gocryptfs#92
Change-Id: I10f947d71e1453989c4a9b66fbb0407f7163994f
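For anyone wanting to use this from go-fuse directly, a minimal sketch with the v2 API follows. I believe the flag this commit added is MountOptions.SyncRead, but treat the exact field name as an assumption and check the go-fuse documentation; the paths are examples:

package main

import (
    "log"

    "github.com/hanwen/go-fuse/v2/fs"
    "github.com/hanwen/go-fuse/v2/fuse"
)

func main() {
    // Serve a loopback view of /srv/plain at /mnt/loop.
    root, err := fs.NewLoopbackRoot("/srv/plain")
    if err != nil {
        log.Fatal(err)
    }
    opts := &fs.Options{
        MountOptions: fuse.MountOptions{
            // Do not advertise FUSE_CAP_ASYNC_READ, so the kernel keeps
            // read requests ordered (assumed field name, see above).
            SyncRead: true,
        },
    }
    server, err := fs.Mount("/mnt/loop", root, opts)
    if err != nil {
        log.Fatal(err)
    }
    server.Wait()
}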