cvs commit: src/sys/fs/deadfs dead_vnops.c src/sys/kern vfs_lookup.c

Konstantin Belousov kib at FreeBSD.org
Mon Jan 22 11:25:22 UTC 2007


kib         2007-01-22 11:25:22 UTC

  FreeBSD src repository

  Modified files:
    sys/fs/deadfs        dead_vnops.c 
    sys/kern             vfs_lookup.c 
  Log:
  Below is a slightly edited description of the LOR by Tor Egge:
  
  --------------------------
  [Deadlock] is caused by a lock order reversal in vfs_lookup(), where
  [some] process is trying to lock a directory vnode (the parent
  directory of the covered vnode) while holding an exclusive vnode lock on
  the covering vnode.
  
  A simplified scenario:
  
  root fs                                 var fs
  /               A                       /    (/var)     D
  /var            B                       /log (/var/log) E
  vfs lock        C                       vfs lock        F
  
  Within each file system, the lock order is clear: C->A->B and F->D->E
  
  When traversing across mounts, the system can choose between two lock orders,
  but everything must then follow the chosen order:
  
        L1: C->A->B
                  |
                  +->F->D->E
  
        L2: F->D->E
               |
               +->C->A->B
  
  The lookup() process for namei("/var") mixes those two lock orders:
  
      VOP_LOOKUP() obtains B while A is held
      vfs_busy() obtains a shared lock on F while A and B are held (follows L1,
      violates L2)
      vput() releases lock on B
      VOP_UNLOCK() releases lock on A
      VFS_ROOT() obtains lock on D while shared lock on F is held
      vfs_unbusy() releases shared lock on F
      vn_lock() obtains lock on A while D is held (violates L1, follows L2)
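  
  The classification of each step above can be checked mechanically. The
  following is a minimal C sketch, not code from the tree: it flattens L1
  and L2 into the strings "CABFDE" and "FDECAB" and tests whether acquiring
  one lock while others are held respects a given order. The names
  lock_rank() and follows_order() are hypothetical, and the model ignores
  the difference between shared and exclusive locks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: position of a lock name in an order string such
 * as "CABFDE" (L1) or "FDECAB" (L2). Returns -1 if the lock is absent.
 */
int lock_rank(const char *order, char lock)
{
    const char *p;

    if (lock == '\0')
        return -1;
    p = strchr(order, lock);
    return p == NULL ? -1 : (int)(p - order);
}

/*
 * Hypothetical helper: return 1 if acquiring `want` while holding every
 * lock named in `held` respects `order`, that is, every held lock ranks
 * before `want`. Return 0 if at least one held lock ranks after it.
 */
int follows_order(const char *order, const char *held, char want)
{
    int want_rank = lock_rank(order, want);

    assert(want_rank >= 0);
    for (const char *h = held; *h != '\0'; h++) {
        int held_rank = lock_rank(order, *h);

        assert(held_rank >= 0);
        if (held_rank > want_rank)
            return 0;
    }
    return 1;
}
```

  Under these assumptions, follows_order("FDECAB", "AB", 'F') returns 0
  (the vfs_busy() step violates L2) and follows_order("CABFDE", "D", 'A')
  returns 0 (the final vn_lock() violates L1), while the VOP_LOOKUP() and
  VFS_ROOT() steps satisfy both orders.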
  
  dounmount() follows L1 (B is locked while F is drained).
  
  Without unmount activity, vfs_busy() will always succeed without blocking
  and the deadlock isn't triggered (the system behaves as if L2 is followed).
  
  With unmount, you can get 4 processes in a deadlock:
  
       p1: holds D, wants A (in lookup())
       p2: holds shared lock on F, wants D (in VFS_ROOT())
       p3: holds B, wants drain lock on F (in dounmount())
       p4: holds A, wants B (in VOP_LOOKUP())
  
  You can have more than one instance of p2.
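  
  The four-process deadlock can be seen as a cycle in a wait-for graph.
  The following is a minimal C sketch under simplifying assumptions: each
  process holds exactly one lock and waits for exactly one lock, each lock
  has a single holder (so only one instance of p2 is modelled), and the
  struct and function names are hypothetical, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 4

/* Hypothetical model of one blocked process. */
struct proc_state {
    const char *name;
    char holds;     /* lock currently held */
    char wants;     /* lock the process is blocked on */
};

/* The scenario from the description: p1..p4 and the locks A, B, D, F. */
const struct proc_state deadlock_procs[NPROC] = {
    { "p1", 'D', 'A' },     /* in lookup() */
    { "p2", 'F', 'D' },     /* in VFS_ROOT(), shared lock on F */
    { "p3", 'B', 'F' },     /* in dounmount(), draining F */
    { "p4", 'A', 'B' },     /* in VOP_LOOKUP() */
};

/* Index of the process holding `lock`, or -1 if nobody holds it. */
int holder_of(const struct proc_state *procs, size_t n, char lock)
{
    for (size_t i = 0; i < n; i++)
        if (procs[i].holds == lock)
            return (int)i;
    return -1;
}

/*
 * Follow the wait-for chain starting at process `start`. Return the
 * number of processes in the cycle if the chain leads back to `start`,
 * or 0 if the chain ends or never returns to `start` within n steps.
 */
int wait_cycle_length(const struct proc_state *procs, size_t n, size_t start)
{
    size_t cur = start;

    assert(start < n);
    for (size_t steps = 1; steps <= n; steps++) {
        int next = holder_of(procs, n, procs[cur].wants);

        if (next < 0)
            return 0;           /* wanted lock is free: no deadlock here */
        if ((size_t)next == start)
            return (int)steps;  /* chain closed: deadlock cycle */
        cur = (size_t)next;
    }
    return 0;
}
```

  In this model, wait_cycle_length(deadlock_procs, NPROC, 0) follows
  p1 -> p4 -> p3 -> p2 -> p1 and reports a cycle of 4 processes. Removing
  p3 (no unmount activity) breaks the chain at B, which matches the
  observation that the deadlock is not triggered without unmount.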
  
  The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and
  MFCed to revision 1.80.2.1, probably to keep a cascade of vnode locks from
  spreading to the root vnode of the root fs when NFS servers are dead
  (VFS_ROOT() just hangs).
  
  - Tor Egge
  
  To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp
  is actually not used by the callers of namei. Thus, a placeholder deadfs
  vnode, vp_crossmp, is introduced and stored in ni_dvp instead.
  
  Idea by:        ups
  Reviewed by:    tegge, ups, jeff, rwatson (mac interaction)
  Tested by:      Peter Holm
  MFC after:      2 weeks
  
  Revision  Changes    Path
  1.50      +24 -1     src/sys/fs/deadfs/dead_vnops.c
  1.97      +17 -4     src/sys/kern/vfs_lookup.c

