atomic_load_acq_int in sequential_heuristic

Konstantin Belousov kostikbel at gmail.com
Mon Aug 25 07:34:11 UTC 2014


On Mon, Aug 25, 2014 at 02:57:00AM +0200, Mateusz Guzik wrote:
> On Sun, Aug 24, 2014 at 07:42:36PM +0300, Konstantin Belousov wrote:
> > On Sun, Aug 24, 2014 at 07:23:31PM +0300, Konstantin Belousov wrote:
> > > On Sun, Aug 24, 2014 at 01:57:29PM +0200, Mateusz Guzik wrote:
> > > > Writer side is:
> > > > fp->f_seqcount = (arg + bsize - 1) / bsize;
> > > > do {
> > > > 	new = old = fp->f_flag;
> > > > 	new |= FRDAHEAD;
> > > > } while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
> > > > 
> > > > Reader side is:
> > > > if (atomic_load_acq_int(&(fp->f_flag)) & FRDAHEAD)
> > > > 	return (fp->f_seqcount << IO_SEQSHIFT);
> > > > 
> > > > We can easily get the following despite load_acq:
> > > > CPU0				CPU1
> > > > 				load_acq fp->f_flag
> > > > fp->f_seqcount = ...
> > > > store_rel fp->f_flag
> > > > 				read fp->f_seqcount
> > > > 				
> > > > So the barrier does not seem to serve any purpose.
> > > It does.
> > > 
> > > Consider the initial situation, when the flag is not set yet. There, we
> > > do not want to allow the reader to interpret the automatically calculated
> > > f_seqcount as the user-supplied constant.  Without barriers, we might
> > > read the flag as set while the user-provided value for f_seqcount is
> > > still not visible to the processor doing the read.
> > That said, I think now that there is a real bug.
> > 
> > If we do not observe FRDAHEAD in sequential_heuristic(), we may
> > overwrite the user-supplied value for f_seqcount.  I do not see any
> > solution other than to start using locking.
> 
> Right.
> 
> How about abusing vnode lock for this purpose? All callers of
> sequential_heuristic have the vnode at least shared locked.
> 
> FRDAHEAD setting code locks it shared. We can change that to exclusive,
> which will close the race and should not be problematic given that it is
> rather rare.
> 
> diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
> index 7abdca0..643920b 100644
> --- a/sys/kern/kern_descrip.c
> +++ b/sys/kern/kern_descrip.c
> @@ -762,18 +762,18 @@ kern_fcntl(struct thread *td, int fd, int cmd, intptr_t arg)
>  		}
>  		if (arg >= 0) {
>  			vp = fp->f_vnode;
> -			error = vn_lock(vp, LK_SHARED);
> +			error = vn_lock(vp, LK_EXCLUSIVE);
>  			if (error != 0) {
>  				fdrop(fp, td);
>  				break;
>  			}
>  			bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
> -			VOP_UNLOCK(vp, 0);
>  			fp->f_seqcount = (arg + bsize - 1) / bsize;
>  			do {
>  				new = old = fp->f_flag;
>  				new |= FRDAHEAD;
>  			} while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
Do we still need rel semantics there, now that the vnode lock is held?

> +			VOP_UNLOCK(vp, 0);
>  		} else {
>  			do {
>  				new = old = fp->f_flag;
> diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
> index f1d19ac..98823f3 100644
> --- a/sys/kern/vfs_vnops.c
> +++ b/sys/kern/vfs_vnops.c
> @@ -438,7 +438,8 @@ static int
>  sequential_heuristic(struct uio *uio, struct file *fp)
>  {
>  
> -	if (atomic_load_acq_int(&(fp->f_flag)) & FRDAHEAD)
> +	ASSERT_VOP_LOCKED(fp->f_vnode, __func__);
> +	if (fp->f_flag & FRDAHEAD)
>  		return (fp->f_seqcount << IO_SEQSHIFT);
>  
>  	/*

I believe the patch is correct.
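
To make the closed race concrete, here is a minimal user-space replay of
the interleaving.  It is only a sketch, not kernel code: struct race_file,
the helper names, the FRDAHEAD value and the heuristic value 2 are made up
for illustration.  The steps are run by hand in one thread, so the result
is deterministic.

```c
#include <assert.h>

#define FRDAHEAD 0x1    /* hypothetical value, for this sketch only */

/* Hypothetical stand-in for the two struct file fields involved. */
struct race_file {
    int f_flag;
    int f_seqcount;
};

/* Reader, step 1: sample the flag, as sequential_heuristic() does first. */
static int
reader_sees_flag(const struct race_file *fp)
{
    return ((fp->f_flag & FRDAHEAD) != 0);
}

/* Reader, step 2: store the automatically calculated value. */
static void
reader_store_heuristic(struct race_file *fp, int computed)
{
    fp->f_seqcount = computed;
}

/* Writer: the F_READAHEAD path, sets the user value and then the flag. */
static void
writer_set_readahead(struct race_file *fp, int arg, int bsize)
{
    fp->f_seqcount = (arg + bsize - 1) / bsize;
    fp->f_flag |= FRDAHEAD;
}

/* Without the lock: the writer runs between the two reader steps. */
static int
replay_unlocked(void)
{
    struct race_file f = { 0, 1 };
    int saw = reader_sees_flag(&f);         /* flag is still clear */

    writer_set_readahead(&f, 65536, 4096);  /* user value becomes 16 */
    if (!saw)
        reader_store_heuristic(&f, 2);      /* clobbers the user value */
    return (f.f_seqcount);
}

/* With the exclusive lock: the writer cannot split the reader's steps. */
static int
replay_serialized(void)
{
    struct race_file f = { 0, 1 };
    int saw = reader_sees_flag(&f);

    if (!saw)
        reader_store_heuristic(&f, 2);
    writer_set_readahead(&f, 65536, 4096);  /* user value survives */
    return (f.f_seqcount);
}
```

In the unlocked replay the flag ends up set but f_seqcount holds the
heuristic value, which is the bug discussed above.  No barrier on the
flag load can prevent that, since the reader's later store is the problem.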

Two notes.  First, please add a comment in the F_READAHEAD switch case
explaining which other code the vnode lock protects against.  Second,
should the vnode lock also cover the FRDAHEAD reset case, at least
for consistency?
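
For the second note, a rough user-space model of what I mean by
consistency: both the set and the reset path take the same lock the
reader runs under.  This is a sketch under assumptions.  A pthread mutex
stands in for the vnode lock, so the shared/exclusive distinction is
collapsed, and the model takes the lock inside the reader whereas the
kernel callers already hold it.  struct locked_file, the model_* names
and the constant values are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define FRDAHEAD    0x1     /* hypothetical value, for this sketch only */
#define IO_SEQSHIFT 16      /* hypothetical value, for this sketch only */

struct locked_file {
    pthread_mutex_t lock;   /* stand-in for the vnode lock */
    int f_flag;
    int f_seqcount;
};

/* F_READAHEAD with arg >= 0: value and flag change under the lock. */
static void
model_set_readahead(struct locked_file *fp, int arg, int bsize)
{
    pthread_mutex_lock(&fp->lock);
    fp->f_seqcount = (arg + bsize - 1) / bsize;
    fp->f_flag |= FRDAHEAD;
    pthread_mutex_unlock(&fp->lock);
}

/* F_READAHEAD with arg < 0: the reset takes the same lock. */
static void
model_clear_readahead(struct locked_file *fp)
{
    pthread_mutex_lock(&fp->lock);
    fp->f_flag &= ~FRDAHEAD;
    pthread_mutex_unlock(&fp->lock);
}

/*
 * Reader: with the lock held, the flag test and the store of the
 * calculated value cannot be split by a concurrent set or reset.
 */
static int
model_sequential_heuristic(struct locked_file *fp, int computed)
{
    int ret;

    pthread_mutex_lock(&fp->lock);
    if ((fp->f_flag & FRDAHEAD) == 0)
        fp->f_seqcount = computed;
    ret = fp->f_seqcount << IO_SEQSHIFT;
    pthread_mutex_unlock(&fp->lock);
    return (ret);
}
```

With everything under one lock, plain loads and stores of f_flag are
enough inside the model and no acq/rel pairing is required there.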


More information about the freebsd-hackers mailing list