Re: Various unprotected accesses to buf and vnode

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Sat, 28 Aug 2021 22:29:44 UTC
On Sat, Aug 28, 2021 at 11:16:15PM +0200, Alexander Lochmann wrote:
> 
> 
> On 28.08.21 22:50, Konstantin Belousov wrote:
> > On Sat, Aug 28, 2021 at 08:53:19PM +0200, Alexander Lochmann wrote:
> > > On 27.08.21 20:40, Konstantin Belousov wrote:
> > > > > - Read of buf.b_blkno in cluster_write():
> > > > > According to the documentation b_lock is needed.
> > > > > Is b_blkno maybe a read-only element of struct buf?
> > > > No, b_blkno is not read-only, it is the disk block number for the block,
> > > > as opposed to b_lbklno which is logical file block number.  The buffer
> > > > is instantiated with b_blkno == b_lblkno, and when the buffer is mapped
> > > > to the real disk block, b_blkno is updated.
> > > > 
> > > > Could you show me the backtrace of the situation where cluster_write()
> > > > is called with unlocked buffer?
> > > > 
> > > > > 
> > > If you don't mind, you can find them here:
> > > https://thasos.cs.tu-dortmund.de/bugs/ctx-buf-b_blkno-r-cex.html
> > I see some banner "Counterexamples", two buttons, and then a blank space.
> > Both under Firefox and Chrome under FreeBSD.
> > 
> > I went as far as to ask and see what happens on Chrome under Windows, it
> > looks the same.
> Oh, I'm sorry.
> Pls click on "Show Member List", and then select an entry. For both cases,
> there should be only one entry.
> (JavaScript must be enabled for this to work.)
Ok, I see some call sequences (?), but again all of them are ffs_write()
(one is ext2_write) calling into cluster_write().  There the buffer lock
is owned.

Show me the specific call sequence where it is not.

> > 
> > > (Pls ignore the line 'Hypothesis 1: ....'.)
> > > > > - Write to vnode.v_bufobj.bo_object:
> > > > > https://github.com/freebsd/freebsd-src/blob/main/sys/vm/vnode_pager.c#L291
> > > > > According to the documentation, '[...] the vnode lock which embeds the
> > > > > bufobj' is needed. However, interlock is taken in line 276.
> > > > > Is the interlock equivalent to the vnode lock?
> > > > > (I assume 'the vnode lock' refers to vnode.v_lock.)
> > > > vnode_pager_alloc() must be called with the vnode exclusively locked.
> > > > This is asserted at the beginning of the function.
> > > > 
> > > Yeah, I see that check: ASSERT_VOP_LOCKED(vp, "vnode_pager_alloc");.
> > > However, our data says otherwise: According to our trace, the shared lock is
> > > taken.
> > > You may have a look at https://thasos.cs.tu-dortmund.de/bugs/ctx-vnode-v_bufobj.bo_object-w-cex.html.
> > > 'EMBSAME(vnode.v_lock[r])' refers to the shared vnode lock.
> > > A click on each lock description, e.g., EMBSAME(vnode.v_lock[r]), leads to
> > > the code where it was taken.
> > > (Pls ignore the line 'Hypothesis 1: ....'.)
Ah, yes, the calls from lookup and open would be with the shared lock.
Still, we lock the vnode interlock to avoid double-allocating the v_object
object, so it is fine.  Some mode of the vnode lock is required nonetheless,
because otherwise we might miss reclaim which guarantees that v_object 
is freed.

> > > > > 
> > > > > - Is buf.b_bufobj a read-only element?
> > > > You should scope the question.
> > > > 
> > > > While buffer is owned by a vnode, b_bufobj is constant.  Since buffers are
> > > > type-stable, they migrate between vnodes as cache finds it required to
> > > > reclaim and reuse.  There, b_bufobj is changed.
> > > > 
> > > I'm referring to getdirtybuf(): msleep(&bp->b_xflags,
> > > BO_LOCKPTR(bp->b_bufobj),PRIBIO | PDROP, "getbuf", 0);
> > And?  The msleep() just uses the address as the sleep chain id.
> > The getdirtybuf() function fails if it really slept, thus dropping bo lock
> > and allowing the buffer identity to change (Really not identity, because
> > vnode is locked, but allowing the buffer to be written out and moved to
> > clean queue).