[PATCH] fix integer overflow in txg_delay()
Martin Matuska
mm at FreeBSD.org
Tue Aug 2 06:06:20 UTC 2011
Dňa 2. 8. 2011 7:33, Pawel Jakub Dawidek wrote / napísal(a):
> On Mon, Aug 01, 2011 at 12:35:33AM +0200, Martin Matuska wrote:
>> The txg_delay() function in txg.c uses the following initialization:
>> int timeout = ddi_get_lbolt() + ticks;
>>
>> Later, we have:
>> while (ddi_get_lbolt() < timeout &&
>> tx->tx_syncing_txg < txg-1 && !txg_stalled(dp))
>> (void) cv_timedwait(&tx->tx_quiesce_more_cv,
>> &tx->tx_sync_lock,
>> timeout - ddi_get_lbolt());
>>
>> The function txg_delay() is called from:
>> dsl_pool_tempreserve_space() and dsl_dir_tempreserve_space()
>>
>> In 24.855 days ddi_get_lbolt will be never smaller than timeout.
>>
>> Please review and/or comment the attached patch.
> Looks good to me. Can you elaborate a bit on consequences of such
> overflow? How the problem manifests itself?
>
> BTW. Is this something that affects IllumOS as well? If so, it would be
> nice to share with them.
>
>> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c
>> ===================================================================
>> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c (revision 224527)
>> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c (working copy)
>> @@ -488,7 +488,7 @@
>> txg_delay(dsl_pool_t *dp, uint64_t txg, int ticks)
>> {
>> tx_state_t *tx = &dp->dp_tx;
>> - int timeout = ddi_get_lbolt() + ticks;
>> + clock_t timeout = ddi_get_lbolt() + ticks;
>>
>> /* don't delay if this txg could transition to quiesing immediately */
>> if (tx->tx_open_txg > txg ||
The overflow causes txg_delay() not delaying txg threads anymore.
It is called from dsl_pool_tempreserve_space() in dsl_pool.c and from
dsl_dir_tempreserve_space() in dsl_dir.c from busy transaction groups:
dsl_pool.c:
/*
* If this transaction group is over 7/8ths capacity, delay
* the caller 1 clock tick. This will slow down the "fill"
* rate until the sync process can catch up with us.
*/
if (reserved && reserved > (write_limit - (write_limit >> 3)))
txg_delay(dp, tx->tx_txg, 1);
dsl_dir.c:
err = arc_tempreserve_space(lsize, tx->tx_txg);
if (err == 0) {
struct tempreserve *tr;
tr = kmem_zalloc(sizeof (struct tempreserve), KM_SLEEP);
tr->tr_size = lsize;
list_insert_tail(tr_list, tr);
err = dsl_pool_tempreserve_space(dd->dd_pool, asize, tx);
} else {
if (err == EAGAIN) {
txg_delay(dd->dd_pool, tx->tx_txg, 1);
err = ERESTART;
}
dsl_pool_memory_pressure(dd->dd_pool);
}
In my interpretation, this overflow will increase thread concurrency and
memory pressure on the pool and therefore slow down write speed.
I have filed this yesterday as Illumos bug #1313.
https://www.illumos.org/issues/1313
--
Martin Matuska
FreeBSD committer
http://blog.vx.sk
More information about the zfs-devel
mailing list