How does rpc.lockd know where to send a request
Julian Elischer
julian at elischer.org
Sun Feb 7 05:32:47 UTC 2010
M. Warner Losh wrote:
> In message: <4B6E2B40.1070405 at elischer.org>
> Julian Elischer <julian at elischer.org> writes:
> : M. Warner Losh wrote:
> : > I have a problem. All systems are running freebsd-current from
> : > sometime in the last month, although similar systems running
> : > 8.0-RELEASE exhibit exactly the same problem. rpc.lockd on an NFS
> : > client is doing something that baffles my mind entirely, maybe you can
> : > help. Please bear with me, this is a little complicated, but I wanted
> : > to include all the details.
> : > I have a host, let's call it dune. dune is at 10.0.0.5. dune is also
> : > the master for the carp interface 10.0.0.99. It is running rpc.lockd
> : > and is an nfs server. I've told nfs, rpcbind, lockd and statd to only
> : > listen on address 10.0.0.99.
> : > I have a second host. maud-dib is 10.0.0.8. I do "mount
> : > 10.0.0.99:/dune /dune" on maud-dib. Wireshark shows all the traffic
> : > going to 10.0.0.99. All is happy in the world. When I start, there's
> : > no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.
> : > Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
> : > to /dune/imp). At this point, rpc.lockd hangs. I get the message
> : > "10.0.0.99:/dune: lockd not responding" which seems odd. lockd is
> : > really there. However, wireshark shows the NLM traffic going to IP
> : > address 10.0.0.5. maud-dib has no carp interfaces.
> : > That's odd. So my question is 'how does lockd know where to go to
> : > talk the NLM protocol?'
> : >
> :
> : my recollection is that maud-dib will send an initial packet to dune
> : and dune will respond, but the response may come from 10.0.0.5,
> : after which maud-dib will redirect all requests there, which will not
> : work because dune is not listening there.
> :
> : the problem is that dune's daemon is setting a local address of
> : INADDR_ANY (0.0.0.0), which tells the packets to use a source
> : address that is the address of the interface that they exit from.
> :
> : Since 10.0.0.5 is the primary address on that interface, that gets
> : selected.
> : you may try some trickery where you add the .5 address AFTER the .99
> : address so that the .99 is the primary address.
>
> Actually, it looks like this is getting returned, as an ASCII string
> '10.0.0.5' in frame 68 in response to the GETADDR call. Since I've
> told it specifically '-h 10.0.0.99' I'd have thought it would respect
> that. Since it is supposed to be bound to 10.0.0.99, I'd proffer the
> argument this is a bug in rpcbind's implementation of GETADDR.
>
> I never would have thought it would have been returned as an ASCII
> string, but you live and learn, eh?
>
> Now, on to fixing the bug.
>
> Warner
>
> P.S. http://people.freebsd.org/~imp/wireshark.dat has the trace I'm
> referring to (and I've posted it in another message on this thread).
>
> : > I did a packet capture from before I did the mount on maud-dib. I can
> : > see the NFS mount, the NFS traffic, all to 10.0.0.99. I then see an
> : > ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
> : > 10.0.0.5. This gets an ICMP port unreachable message, since I told
> : > nfs, et al, to bind only to 10.0.0.99.
> : > So, I thought, 'the answer is obvious, I'll just look for the packet
> : > that has the string "dune" in it' (which is the hostname of 10.0.0.5).
> : > No packets have that string in them, other than the mount packet which
> : > has /dune in it. Nor is there any DNS activity doing a lookup. Nor
> : > is there any static mapping in /etc/hosts on 10.0.0.8.
> : > Next thought: Oh, somebody like portmapper or the NFS protocol from
> : > 10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
> : > locking requests to 10.0.0.5. That's trivial to find, I think to
> : > myself. I'll look for the octets 0a 00 00 05 (hex). The only
> : > instances of that are in the ARP packet, the NLM request and the ICMP
> : > unreachable packets. No other packets include these bytes. Nor do
> : > any include the reverse.
> : > Right after the mount, there's nothing in the connection table that
> : > points to 10.0.0.5, only 10.0.0.99.
> : > So I'm having a serious WTF moment. How the heck is this even
> : > possible? Any ideas on where to look for where this gets set and/or
> : > communicated?
> : > thanks a bunch for any insight that you can give...
> : > Warner
> :
> :
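On the GETADDR reply coming back as an ASCII string: rpcbind answers
GETADDR with a "universal address", which for IPv4 is text of the form
h1.h2.h3.h4.p1.p2, with the port split into a high and a low byte. That
would also explain why a search for the octets 0a 00 00 05 found nothing
outside the ARP, NLM and ICMP packets. A minimal Python sketch, assuming
that format; parse_uaddr is a hypothetical helper, and the port bytes in
the sample reply are made up, not taken from the trace:

```python
def parse_uaddr(uaddr: str) -> tuple:
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError(f"not an IPv4 universal address: {uaddr!r}")
    octets = [int(p) for p in parts]
    if any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"field out of range in {uaddr!r}")
    host = ".".join(parts[:4])
    port = octets[4] * 256 + octets[5]  # p1 is the high byte, p2 the low byte
    return host, port


# Hypothetical GETADDR payload: the address from the thread plus
# invented port bytes (3, 255).
reply = b"10.0.0.5.3.255"
host, port = parse_uaddr(reply.decode("ascii"))

# The same address as four raw octets, which is what the hex search looked for.
raw_ip = bytes([10, 0, 0, 5])

print(host, port)
print(raw_ip in reply)
```

Searching the capture for the text "10.0.0.5" rather than the raw octets
is what would turn up a reply like this.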
try swapping the addresses on the interface.
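
The wildcard-versus-specific binding described above can be seen with a
plain UDP socket. A minimal Python sketch, assuming 127.0.0.1 as a
stand-in for 10.0.0.99 so that it runs on any machine; bound_address is
a hypothetical helper, not anything from rpc.lockd or rpcbind:

```python
import socket


def bound_address(host: str) -> str:
    """Bind a UDP socket to (host, any free port) and report its local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind((host, 0))  # port 0 lets the kernel pick a free port
        return s.getsockname()[0]


# INADDR_ANY: the kernel chooses the source address per outgoing packet.
wildcard = bound_address("0.0.0.0")

# Specific address (stand-in for 10.0.0.99): every packet leaves from it.
specific = bound_address("127.0.0.1")

print(wildcard, specific)
```

A socket that reports 0.0.0.0 leaves the source address to the kernel,
which is where the primary address of the interface comes into play; one
bound to a specific address always sends from that address.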