RFC: Suggesting ZFS "best practices" in FreeBSD
Michael DeMan
freebsd at deman.com
Wed Jan 23 01:52:35 UTC 2013
On Jan 22, 2013, at 7:04 AM, Warren Block <wblock at wonkity.com> wrote:
> On Tue, 22 Jan 2013, Borja Marcos wrote:
>
>> 1- Dynamic disk naming -> We should use static naming (GPT labels, for instance)
>>
>> ZFS was born in a system with static device naming (Solaris). When you plug a disk it gets a fixed name. As far as I know, at least from my experience with Sun boxes, c1t3d12 is always c1t3d12. FreeBSD's dynamic naming can be very problematic.
>>
>> For example, imagine that I have 16 disks, da0 to da15. One of them, say da5, dies. When I reboot the machine, all the devices from da6 to da15 will be renumbered one lower (da6 becomes da5, and so on). Potential for trouble, as a minimum.
>>
>> After several different installations, I am preferring to rely on static naming. Doing it with some care can really help to make pools portable from one system to another. I create a GPT partition in each drive, and label it with a readable name. Thus, imagine I label each big partition (which takes the whole available space) as pool-vdev-disk, for example, pool-raidz1-disk1.
>
> I'm a proponent of using various types of labels, but my impression after a recent experience was that ZFS metadata was enough to identify the drives even if they were moved around. That is, ZFS bare metadata on a drive with no other partitioning or labels.
>
> Is that incorrect?
I don't know if it is correct or not, but the best I could figure out was to both label the drives and also force the mapping so the physical and logical drives always show up associated correctly.
I also ended up deciding I wanted the hostname as a prefix for the labels - so if they get moved around to say another machine I can look and know what is going on - 'oh yeah, those disks are from the ones we moved over to this machine'...
Again - no idea if this is right or 'best practice' but it was what I ended up doing since we don't have that 'best practice' document.
Basically what I came to was:
#1. Map the physical drive slots to the device names FreeBSD assigns, so that if a disk is removed and the machine is rebooted, the disks after the removed one do not pick up an 'off by one' error. I.e. if you have ada0-ada14 and remove ada8, then on reboot FreeBSD normally skips the missing ada8 and the drive that used to be ada9 is now called ada8, and so on...
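One way to pin that mapping is to wire the devices in /boot/device.hints, so each controller channel always produces the same adaX number. A sketch only - the bus and unit numbers below are examples and depend entirely on your controller layout:

    # /boot/device.hints - wire ahcich2 to scbus2, and scbus2 to ada8,
    # so ada8 stays ada8 even if an earlier disk is missing at boot
    hint.scbus.2.at="ahcich2"
    hint.ada.8.at="scbus2"

You would repeat that pair for every channel/disk, then reboot and confirm with 'camcontrol devlist' that the numbering sticks.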
#2. Use gpart+gnop to deal with 4K sector sizes in a standardized way, and also to leave a little extra room so that if a replacement disk is a few MB smaller than the original, it all 'just works'. (All disks are partitioned to a slightly smaller size than their physical capacity.)
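Roughly like this for each disk - untested sketch, and the partition size is a made-up number for a ~1TB drive (the point is just to stay a bit under the physical capacity):

    # GPT scheme, one ZFS partition aligned to 4K, sized slightly
    # under the disk's capacity, with a readable hostname-device label
    gpart create -s gpt ada6
    gpart add -t freebsd-zfs -a 4k -s 930g -l myhostname-ada6p1 ada6
    # gnop shim with a 4096-byte sector so the pool is created ashift=12
    gnop create -S 4096 gpt/myhostname-ada6p1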
#3. For ZFS - make the pool run off the labels. The labels include the 'adaXX' device name for easy reference. If the disks are moved to another machine (say ada0-ada14 become ada30-ada44 in a new box), the device part of the label is naturally wrong, but with the original hostname prefix in the label (presuming hostnames are unique) you can still tell what is going on. I treat having the disks in another host as an emergency/temporary situation; the pool can be taken offline and the labels fixed up if the plan is for the disks to live in that new machine for a long time.
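Creating the pool then looks something like the following (again a sketch; pool and label names match the example below). The gnop devices only need to exist at creation time to set ashift - after that the pool is re-imported against the plain GPT labels:

    # build the mirror on the 4K gnop devices
    zpool create zpmirrorTEST mirror \
        gpt/myhostname-ada6p1.nop gpt/myhostname-ada14p1.nop
    # drop the gnop shims and re-import by label
    zpool export zpmirrorTEST
    gnop destroy gpt/myhostname-ada6p1.nop gpt/myhostname-ada14p1.nop
    zpool import -d /dev/gpt zpmirrorTEST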
Example below from a test box - so even if these drives got moved to another machine where an ada6 and ada14 are already present, the hostname prefix shows where they came from:
        NAME                        STATE     READ WRITE CKSUM
        zpmirrorTEST                ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            gpt/myhostname-ada6p1   ONLINE       0     0     0
            gpt/myhostname-ada14p1  ONLINE       0     0     0
        logs
          da1