RFC: Suggesting ZFS "best practices" in FreeBSD
Michael DeMan
freebsd at deman.com
Wed Jan 23 01:52:35 UTC 2013
On Jan 22, 2013, at 7:04 AM, Warren Block <wblock at wonkity.com> wrote:
> On Tue, 22 Jan 2013, Borja Marcos wrote:
>
>> 1- Dynamic disk naming -> We should use static naming (GPT labels, for instance)
>>
>> ZFS was born in a system with static device naming (Solaris). When you plug a disk it gets a fixed name. As far as I know, at least from my experience with Sun boxes, c1t3d12 is always c1t3d12. FreeBSD's dynamic naming can be very problematic.
>>
>> For example, imagine that I have 16 disks, da0 to da15. One of them, say da5, dies. When I reboot the machine, all the devices from da6 to da15 will be renumbered one lower (da6 becomes da5, and so on). Potential for trouble, as a minimum.
>>
>> After several different installations, I am preferring to rely on static naming. Doing it with some care can really help to make pools portable from one system to another. I create a GPT partition in each drive, and label it with a readable name. Thus, imagine I label each big partition (which takes the whole available space) as pool-vdev-disk, for example, pool-raidz1-disk1.
>
> I'm a proponent of using various types of labels, but my impression after a recent experience was that ZFS metadata was enough to identify the drives even if they were moved around. That is, ZFS bare metadata on a drive with no other partitioning or labels.
>
> Is that incorrect?
I don't know if it is correct or not, but the best I could figure out was to both label the drives and also force the mapping so the physical and logical drives always show up associated correctly.
I also ended up deciding I wanted the hostname as a prefix for the labels - so if they get moved around to say another machine I can look and know what is going on - 'oh yeah, those disks are from the ones we moved over to this machine'...
Again - no idea if this is right or 'best practice' but it was what I ended up doing since we don't have that 'best practice' document.
Basically what I came to was:
#1. Map the physical drive slots to the device names FreeBSD assigns, so that if a disk is removed and the machine is rebooted, the disks after the removed one do not pick up an 'off by one' error. I.e. if you have ada0-ada14 and remove ada8, then on reboot FreeBSD normally skips the missing ada8 and the drive that used to be ada9 is now called ada8, and so on...
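One way to pin that mapping is to wire the devices in /boot/device.hints, so each controller channel always produces the same adaX number. A sketch only - the bus and unit numbers below are examples and depend entirely on your controller layout:

    # /boot/device.hints - wire ahcich2 to scbus2, and scbus2 to ada8,
    # so ada8 stays ada8 even if an earlier disk is missing at boot
    hint.scbus.2.at="ahcich2"
    hint.ada.8.at="scbus2"

You would repeat that pair for every channel/disk, then reboot and confirm with 'camcontrol devlist' that the numbering sticks.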
#2. Use gpart+gnop to deal with 4K sector sizes in a standardized way, and also to leave a little extra room so that if a replacement disk is a few MB smaller than the original, it all 'just works'. (All disks are partitioned to a slightly smaller size than their physical capacity.)
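Roughly like this for each disk - untested sketch, and the partition size is a made-up number for a ~1TB drive (the point is just to stay a bit under the physical capacity):

    # GPT scheme, one ZFS partition aligned to 4K, sized slightly
    # under the disk's capacity, with a readable hostname-device label
    gpart create -s gpt ada6
    gpart add -t freebsd-zfs -a 4k -s 930g -l myhostname-ada6p1 ada6
    # gnop shim with a 4096-byte sector so the pool is created ashift=12
    gnop create -S 4096 gpt/myhostname-ada6p1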
#3. For ZFS - make the pool run off the labels. The labels include the 'adaXX' device name for easy reference. If the disks are moved to another machine (say ada0-ada14 become ada30-ada44 in a new box), the device part of the label is naturally wrong, but with the original hostname prefix in the label (presuming hostnames are unique) you can still tell what is going on. I treat having the disks in another host as an emergency/temporary situation; the pool can be taken offline and the labels fixed up if the plan is for the disks to live in that new machine for a long time.
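Creating the pool then looks something like the following (again a sketch; pool and label names match the example below). The gnop devices only need to exist at creation time to set ashift - after that the pool is re-imported against the plain GPT labels:

    # build the mirror on the 4K gnop devices
    zpool create zpmirrorTEST mirror \
        gpt/myhostname-ada6p1.nop gpt/myhostname-ada14p1.nop
    # drop the gnop shims and re-import by label
    zpool export zpmirrorTEST
    gnop destroy gpt/myhostname-ada6p1.nop gpt/myhostname-ada14p1.nop
    zpool import -d /dev/gpt zpmirrorTEST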
Example below from a test box - so even if these drives got moved to another machine where an ada6 and ada14 are already present, the hostname prefix shows where they came from:
        NAME                        STATE     READ WRITE CKSUM
        zpmirrorTEST                ONLINE       0     0     0
          mirror-0                  ONLINE       0     0     0
            gpt/myhostname-ada6p1   ONLINE       0     0     0
            gpt/myhostname-ada14p1  ONLINE       0     0     0
        logs
          da1