[PATCH] fix integer overflow in txg_delay()

Pawel Jakub Dawidek pjd at FreeBSD.org
Tue Aug 2 06:30:15 UTC 2011


On Tue, Aug 02, 2011 at 08:06:18AM +0200, Martin Matuska wrote:
> Dňa 2. 8. 2011 7:33, Pawel Jakub Dawidek wrote / napísal(a):
> > On Mon, Aug 01, 2011 at 12:35:33AM +0200, Martin Matuska wrote:
> >> The txg_delay() function in txg.c uses the following initialization:
> >> int timeout = ddi_get_lbolt() + ticks;
> >>
> >> Later, we have:
> >>         while (ddi_get_lbolt() < timeout &&
> >>             tx->tx_syncing_txg < txg-1 && !txg_stalled(dp))
> >>                 (void) cv_timedwait(&tx->tx_quiesce_more_cv,
> >> &tx->tx_sync_lock,
> >>                     timeout - ddi_get_lbolt());
> >>
> >> The function txg_delay() is called from:
> >> dsl_pool_tempreserve_space() and dsl_dir_tempreserve_space()
> >>
> >> In 24.855 days ddi_get_lbolt will be never smaller than timeout.
> >>
> >> Please review and/or comment the attached patch.
> > Looks good to me. Can you elaborate a bit on consequences of such
> > overflow? How the problem manifests itself?
> >
> > BTW. Is this something that affects IllumOS as well? If so, it would be
> > nice to share with them.
> >
> >> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c
> >> ===================================================================
> >> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c	(revision 224527)
> >> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c	(working copy)
> >> @@ -488,7 +488,7 @@
> >>  txg_delay(dsl_pool_t *dp, uint64_t txg, int ticks)
> >>  {
> >>  	tx_state_t *tx = &dp->dp_tx;
> >> -	int timeout = ddi_get_lbolt() + ticks;
> >> +	clock_t timeout = ddi_get_lbolt() + ticks;
> >>  
> >>  	/* don't delay if this txg could transition to quiesing immediately */
> >>  	if (tx->tx_open_txg > txg ||
> The overflow causes txg_delay() not delaying txg threads anymore.
> It is called from dsl_pool_tempreserve_space() in dsl_pool.c and from
> dsl_dir_tempreserve_space() in dsl_dir.c from busy transaction groups:
> 
> dsl_pool.c:
> /*
> * If this transaction group is over 7/8ths capacity, delay
> * the caller 1 clock tick. This will slow down the "fill"
> * rate until the sync process can catch up with us.
> */
> if (reserved && reserved > (write_limit - (write_limit >> 3)))
> txg_delay(dp, tx->tx_txg, 1);
> 
> dsl_dir.c:
> err = arc_tempreserve_space(lsize, tx->tx_txg);
> if (err == 0) {
> struct tempreserve *tr;
> 
> tr = kmem_zalloc(sizeof (struct tempreserve), KM_SLEEP);
> tr->tr_size = lsize;
> list_insert_tail(tr_list, tr);
> 
> err = dsl_pool_tempreserve_space(dd->dd_pool, asize, tx);
> } else {
> if (err == EAGAIN) {
> txg_delay(dd->dd_pool, tx->tx_txg, 1);
> err = ERESTART;
> }
> dsl_pool_memory_pressure(dd->dd_pool);
> }
> 
> In my interpretation, this overflow will increase thread concurrency and
> memory pressure on the pool and therefore slow down write speed.
> 
> I have filed this yesterday as Illumos bug #1313.
> https://www.illumos.org/issues/1313

Ok, thanks.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://yomoli.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/zfs-devel/attachments/20110802/8571011d/attachment.pgp


More information about the zfs-devel mailing list