[Bug 260011] Unresponsive NFS mount on AWS EFS
Date: Wed, 24 Nov 2021 09:08:04 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260011

Bug ID: 260011
Summary: Unresponsive NFS mount on AWS EFS
Product: Base System
Version: 13.0-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: ale@FreeBSD.org

I'm experiencing annoying issues with an AWS EFS mount point on FreeBSD 13 EC2 instances. The filesystem is mounted by 3 instances (2 with the same access patterns, 1 with a different one).

Initially I had the /etc/fstab entry configured with:

`rw,nosuid,noatime,bg,nfsv4,minorversion=1,rsize=1048576,wsize=1048576,timeo=600,oneopenown`

After a few days this left my Java application with all threads blocked in `stat64` system calls that never returned, and I could not even kill -9 the process. After digging into it, this seems to be the normal behavior for hard mount points, even if I fail to understand why one would prefer to have the system completely frozen when the NFS mount point is not responding.

So I later changed the configuration by adding `intr,soft,retrans=2`:

`rw,nosuid,noatime,bg,nfsv4,minorversion=1,intr,soft,retrans=2,rsize=1048576,wsize=1048576,timeo=600,oneopenown`

Btw, I think there is a typo in mount_nfs(8): it says to set `retrycnt` instead of `retrans` for the `soft` option. Can you confirm?

After the change `nfsstat -m` reports:

`nfsv4,minorversion=1,oneopenown,tcp,resvport,soft,intr,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2`

I wonder why timeo, rsize and wsize seem to have been ignored, but this is irrelevant to the issue.

After a few days the application on the two similar EC2 instances stopped working again. Any command accessing the mounted EFS filesystem (ls, df, umount, etc.) did not complete in a reasonable time, but this time I could kill the processes. The only way to recover was to reboot the instances.

On one of them I've seen the following kernel messages. They were generated only when I tried to debug the issue hours later, and only on one EC2 instance, so I'm not sure whether they are relevant or helpful:

```
kernel: newnfs: server 'fs-xxx.efs.us-east-1.amazonaws.com' error: fileid changed. fsid 0:0: expected fileid 0x4d2369b89a58a920, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
kernel: nfs server fs-xxx.efs.us-east-1.amazonaws.com:/: not responding
```

The third EC2 instance survived and was still able to access the filesystem, but I think it wasn't accessing the filesystem when the network/NFS issue that affected the other two occurred.
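For reference, the mismatch between the options requested in /etc/fstab and those shown by `nfsstat -m` can be checked mechanically. The following is only a minimal sketch, not part of any FreeBSD tool: the helper names `parse_opts` and `diff_opts` are hypothetical, and it compares only options that appear under the same name in both strings. It makes no attempt to relate `timeo` to the `timeout` field that `nfsstat -m` prints.

```python
def parse_opts(opts):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (for example 'soft') map to None.
    """
    parsed = {}
    for item in opts.strip().split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else None
    return parsed


def diff_opts(requested, effective):
    """Return {name: (requested, effective)} for options present in
    both strings whose values differ."""
    req = parse_opts(requested)
    eff = parse_opts(effective)
    return {k: (req[k], eff[k]) for k in req if k in eff and req[k] != eff[k]}


# Option strings copied from the report above.
requested = (
    "rw,nosuid,noatime,bg,nfsv4,minorversion=1,intr,soft,retrans=2,"
    "rsize=1048576,wsize=1048576,timeo=600,oneopenown"
)
effective = (
    "nfsv4,minorversion=1,oneopenown,tcp,resvport,soft,intr,cto,sec=sys,"
    "acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,"
    "negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,"
    "wcommitsize=16777216,timeout=120,retrans=2"
)

for name, (want, got) in sorted(diff_opts(requested, effective).items()):
    print(f"{name}: requested {want}, in effect {got}")
```

With the two strings from this report, the sketch flags only `rsize` and `wsize` (1048576 requested, 65536 in effect); `retrans` and `minorversion` match.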