[CFR][ZFS] merge Illumos rev. 13379 (writing to imbalanced vdevs)
Martin Matuska
mm at FreeBSD.org
Wed Jul 6 12:45:17 UTC 2011
On 29.06.2011 19:42, Pawel Jakub Dawidek wrote:
> On Mon, Jun 20, 2011 at 11:16:50AM +0200, Martin Matuska wrote:
>> zfs tries to allocate blocks evenly across all devices. This means when
>> devices are imbalanced zfs will use lots of CPU searching for space on devices
>> which tend to be pretty full.
>>
>> https://www.illumos.org/issues/1051
>> https://www.illumos.org/projects/illumos-gate/repository/revisions/13379
>>
>> If there are no objections, I will commit this to -HEAD
> Some comments inline.
>
>> @@ -5137,6 +5138,7 @@
>> */
>> kernel_init(FREAD | FWRITE);
>> VERIFY(spa_open(zs->zs_pool, &spa, FTAG) == 0);
>> + spa->spa_debug = B_TRUE;
>> zs->zs_spa = spa;
>>
>> spa->spa_dedup_ditto = 2 * ZIO_DEDUPDITTO_MIN;
> Do we want spa debug to be enabled by default? If we don't then please
> enable it only when DEBUG is defined.
>
This hunk is in ztest.c, so it only affects ztest; spa_debug (a new boolean) is disabled by default.
>> /*
>> + * This value defines the number of allowed allocation failures per vdev.
>> + * If a device reaches this threshold in a given txg then we consider skipping
>> + * allocations on that device.
>> + */
>> +int zfs_mg_alloc_failures;
> In FreeBSD we probably want sysctl and loader tunable for this.
> I think it should be fine to make sysctl read/write, but I haven't
> looked very closely.
In zio.c, the value is set in zio_init(), which is called by spa_init():
	/*
	 * The zio write taskqs have 1 thread per cpu, allow 1/2 of the taskqs
	 * to fail 3 times per txg or 8 failures, whichever is greater.
	 */
	zfs_mg_alloc_failures = MAX((3 * max_ncpus / 2), 8);
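To make that default concrete, here is a small userland sketch of the same expression. The helper name mg_alloc_failures_default() is hypothetical and MAX() is defined locally; only the formula itself comes from zio.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's MAX() macro. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper: the default from zio_init(), with the CPU
 * count passed in instead of being read from max_ncpus.
 */
int
mg_alloc_failures_default(int ncpus)
{
	/* Integer arithmetic: 3 * ncpus is divided by 2 and truncated. */
	return (MAX((3 * ncpus / 2), 8));
}

/* Print the default for a few CPU counts. */
void
print_defaults(void)
{
	int counts[] = { 1, 4, 6, 8, 16 };
	size_t i;

	for (i = 0; i < sizeof(counts) / sizeof(counts[0]); i++)
		printf("%2d cpus -> %d\n", counts[i],
		    mg_alloc_failures_default(counts[i]));
}
```

The floor of 8 applies up to 5 CPUs; from 6 CPUs on, the 3 * ncpus / 2 term is the larger one.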
It looks like it is set at runtime, so we have to change the code to make it
properly tunable and give it a default value.
Maybe comment out the code above and initialize the variable directly with:
int zfs_mg_alloc_failures = MAX((3 * max_ncpus / 2), 8);
The setup might look like this:
TUNABLE_INT("vfs.zfs.mg_alloc_failures", &zfs_mg_alloc_failures);
SYSCTL_INT(_vfs_zfs, OID_AUTO, mg_alloc_failures, CTLFLAG_RW,
&zfs_mg_alloc_failures, 0,
"Number of allowed allocation failures per vdev");
Another question is whether it should live under vfs.zfs or vfs.zfs.vdev.
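One caveat with the direct initializer: C requires a file-scope initializer to be a constant expression, so it only compiles if max_ncpus is a compile-time constant. A possible alternative, sketched below in userland C, is to initialize the variable to 0 and compute the default in zio_init() only when nothing has changed it. The name zio_init_sketch(), the stand-in max_ncpus variable and the use of 0 as an "unset" marker are assumptions for illustration, not part of the Illumos change.

```c
/* Local stand-in for the kernel's MAX() macro. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Stand-in for the kernel's CPU count; assumed not to be a constant. */
int max_ncpus = 4;

/* 0 means "not set by the administrator"; a tunable may overwrite it. */
int zfs_mg_alloc_failures = 0;

/* Hypothetical stand-in for the relevant part of zio_init(). */
void
zio_init_sketch(void)
{
	/* Only compute the default if no tunable value was supplied. */
	if (zfs_mg_alloc_failures == 0)
		zfs_mg_alloc_failures = MAX((3 * max_ncpus / 2), 8);
}
```

With this arrangement a value set through the loader tunable would be preserved, assuming the tunable is fetched before zio_init() runs. The cost is that 0 cannot be chosen as an explicit setting.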
--
Martin Matuska
FreeBSD committer
http://blog.vx.sk