Question about ZFS with log and cache on SSD with GPT

Daniel Kalchev daniel at digsys.bg
Tue Jan 24 10:18:51 UTC 2012


On Jan 22, 2012, at 6:13 PM, Willem Jan Withagen wrote:

> On 22-1-2012 9:10, Peter Maloney wrote:
>> 
> 
>> In my testing, it made no difference. But as daniel mentioned:
>> 
>>> With ZFS, the 'alignment' is per-vdev -- therefore you will need to recreate the mirror vdevs again using gnop to make them 4k aligned. 
>> But I just resilvered to add my aligned disks and remove the old. If
>> that applies to erase boundaries, then it might have hurt my test.
> 
> I'm not really fluent in ZFS lingo, but the vdev is what makes up my
> zfsdata pool? And the alignment in there carries over to the caches
> underneath?
> 
> So what is the consequence if ashift = 9, and the partitions are nicely
> aligned even on the erase boundary?

A ZFS zpool consists of a number of "vdevs". These are the pieces of storage that ZFS uses to store your bits of data; ZFS spreads writes across all vdevs available at the time of writing. Each vdev may have different properties, the 'sector size' (the smallest unit for reading/writing the vdev) being one of them. In ZFS this is stored in the 'ashift' property. It is a bit-shift value, so ashift=9 means 2^9 (512) bytes and ashift=12 means 2^12 (4096) bytes.
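The ashift actually in use can be checked with zdb, for example (the pool name 'tank' is just an example):

  # zdb -C tank | grep ashift

Each vdev in the pool's cached configuration should show its ashift value there.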

When you create a vdev in ZFS, by either "zpool create" or "zpool add", ZFS will check the sector sizes reported by each "drive" (which may be a file, a disk drive, SAN storage, in fact any block device) and use the largest one as the vdev's ashift. This is done so that large-sector members of a vdev are not penalized.
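On FreeBSD you can see what a drive reports with diskinfo, for example (da0 is just an example device; look at the 'sectorsize' line of the output):

  # diskinfo -v da0

Keep in mind that many current 4096-byte-sector drives still report 512-byte sectors for compatibility, which is why they end up in ashift=9 vdevs unless you intervene.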

If you add or replace a device within an existing vdev, the ashift property does not change. I am not aware of any way to change ashift on the fly, short of recreating the vdev. Since in current ZFS you cannot remove a vdev, that means you will have to recreate the zpool.

Today, it is probably a good idea to create all new zpools with an ashift of at least 12 (4096 bytes), or perhaps even larger. Current drives are so large that the wasted space will not be significant, but performance will be better.
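Since ZFS takes the largest reported sector size, the usual trick on FreeBSD is to put a temporary 4k gnop device on one member so the new vdev is created with ashift=12. A sketch (da0, da1 and the pool name 'tank' are just examples):

  # gnop create -S 4096 da0
  # zpool create tank mirror da0.nop da1
  # zpool export tank
  # gnop destroy da0.nop
  # zpool import tank

The ashift is recorded in the vdev labels, so it survives removing the gnop device and re-importing.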

This is even more important for SSDs used as ZFS storage (and probably also for SLOG/ZIL and cache devices), because it will both make the drive last longer and significantly improve write performance.

I have not experimented with gnop-ing the ZIL or cache devices, then removing the gnop and re-importing the pool, but there is no reason why it should not work.
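Something like this should do it (untested; ada1p1 and ada1p2 are just example SSD partitions for log and cache):

  # gnop create -S 4096 ada1p1 ada1p2
  # zpool add tank log ada1p1.nop cache ada1p2.nop
  # zpool export tank
  # gnop destroy ada1p1.nop ada1p2.nop
  # zpool import tank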

Daniel


