Adding to a zpool -- different redundancies and risks
David Christensen
dpchrist at holgerdanske.com
Fri Dec 13 04:50:03 UTC 2019
On 2019-12-12 04:42, Norman Gray wrote:
>
> David, hello.
Hi. :-)
> On 12 Dec 2019, at 5:11, David Christensen wrote:
>
>> Please post:
>>
>> 1 The 'zpool create ...' command you used to create the existing
>> pool.
On 2019-12-12 06:33, Norman Gray wrote:
> # zpool history pool
> History for 'pool':
> 2017-08-20.15:45:43 zpool create -m /pool pool raidz2 da2 da3 da4 da5
> da6 da7 da8 da9 da10 raidz2 da11 da12 da13 da14 da15 da16 da17 da18 da19
Okay.
On 2019-12-12 04:42, Norman Gray wrote:
>> 2. The output of 'zpool status' for the existing pool.
>
> # zpool status pool
>   pool: pool
>  state: ONLINE
> status: Some supported features are not enabled on the pool. The pool can
>         still be used, but some features are unavailable.
> action: Enable all features using 'zpool upgrade'. Once this is done, the
>         pool may no longer be accessible by software that does not support
>         the features. See zpool-features(7) for details.
>   scan: none requested
> config:
>
>         NAME             STATE     READ WRITE CKSUM
>         pool             ONLINE       0     0     0
>           raidz2-0       ONLINE       0     0     0
>             label/zd032  ONLINE       0     0     0
>             label/zd033  ONLINE       0     0     0
>             label/zd034  ONLINE       0     0     0
>             label/zd035  ONLINE       0     0     0
>             label/zd036  ONLINE       0     0     0
>             label/zd037  ONLINE       0     0     0
>             label/zd038  ONLINE       0     0     0
>             label/zd039  ONLINE       0     0     0
>             label/zd040  ONLINE       0     0     0
>           raidz2-1       ONLINE       0     0     0
>             label/zd041  ONLINE       0     0     0
>             label/zd042  ONLINE       0     0     0
>             label/zd043  ONLINE       0     0     0
>             label/zd044  ONLINE       0     0     0
>             label/zd045  ONLINE       0     0     0
>             label/zd046  ONLINE       0     0     0
>             label/zd047  ONLINE       0     0     0
>             label/zd048  ONLINE       0     0     0
>             label/zd049  ONLINE       0     0     0
>
> errors: No known data errors
> #
>
> (Note: since creating the pool, I realised that gpart labels were a Good
> Thing, hence exported, labelled, and imported the pool, hence the
> difference from the da* pool creation).
So, two raidz2 vdev's of nine 5.5 TB drives each, striped into one pool.
Each raidz2 vdev gives up two drives' worth of space to parity, so each
vdev can store (9 - 2) * 5.5 = 38.5 TB and the pool can store 38.5 +
38.5 = 77 TB.
>> 3. The output of 'zpool list' for the existing pool.
>
> # zpool list pool
> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> pool    98T  75.2T  22.8T        -         -    29%    76%  1.00x  ONLINE  -
So, per 'zpool list' (which reports raw space, including raidz2 parity),
your pool is 75.2 TB / 98 TB = 77% full, consistent with the CAP column.
>> 4. The 'zpool add ...' command you are contemplating.
>
> # zpool add -n pool raidz2 label/zd05{0,1,2,3,4,5}
> invalid vdev specification
> use '-f' to override the following errors:
> mismatched replication level: pool uses 9-way raidz and new vdev uses
> 6-way raidz
I believe your understanding of the warning is correct -- ZFS is saying
that the added raidz2 vdev does not have the same number of drives
(six) as the two existing raidz2 vdev's (nine drives each).
> The six new disks are 12TB; the 18 original ones 5.5TB.
As you stated before.
>> So, you have 24 drives in a 24 drive cage?
>
> That's correct -- the maximum the chassis will take.
Okay.
>> What are your space and performance goals?
>
> Not very explicit: TB/currency-unit as high as possible. Performance:
> bottlenecks are likely to be elsewhere (network, processing power) so no
> stringent requirements. Though this is a fairly general-purpose data
> store, a large fraction of the datasets on the machine comprise a number
> of 10GB single files, served via NFS.
>
>> What are your sustainability goals as drives and/or VDEV's fail?
>
> It doesn't have to be high availability, so if I have a drive failure, I
> can consider shutting the machine down until a replacement disk arrives
> and can be resilvered. This is a mirror of data where the masters are
> elsewhere on the planet, so this machine is 'reliable storage but not
> backed up' (and the users know this). Thus if I do decide to keep
> running with one failed disk in one VDEV, and the worst comes to the
> worst and the whole thing explodes... the world won't end. I will be
> cross, and users will moan, in either case, but they know this is a
> problem that can fundamentally be solved with more money.
>
> I'm sure I could be more sophisticated about this (and any suggestions
> are welcome), but unfortunately I don't have as much time to spend on
> storage problems as I'd like, so I'd like to avoid creating a setup
> which is smarter than I'm able to fix!
>
> Best wishes,
>
> Norman
Okay.
I believe that if you gave the -f option to 'zpool add', the six 12 TB
drives would be formed into a raidz2 vdev and this new vdev would be
striped onto your existing pool. The pool would then have a total
capacity of 38.5 + 38.5 + 48 = 125 TB and ZFS would start spreading your
data across the three vdev's (potentially improving performance under
concurrent workloads). The pool could withstand two drive failures in
any single raidz2 vdev, but three drive failures in the same vdev would
result in total data loss.
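For concreteness, a sketch of what that would look like, reusing the
labels from your dry run (re-running the dry run with -f first should
show the layout ZFS would create, rather than the error):

  # zpool add -f -n pool raidz2 label/zd05{0,1,2,3,4,5}
  (dry run; -f gets past the replication-level warning so the proposed
  layout is printed)
  # zpool add -f pool raidz2 label/zd05{0,1,2,3,4,5}
  # zpool status pool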
That said, read this article:
https://jrs-s.net/2015/02/06/zfs-you-should-use-mirror-vdevs-not-raidz/
So:
1. Pair the eighteen 5.5 TB drives into nine 5.5 TB mirrors (49.5 TB).
2. Pair the six 12 TB drives into three 12 TB mirrors (36 TB).
3. Stripe all the mirrors into a pool (85.5 TB).
The pool could withstand one drive failure in any single mirror, but two
drive failures in the same mirror would result in total data loss. The
window of greatest risk runs from the time a drive fails until its
replacement has finished resilvering.
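Note that ZFS cannot reshape the existing raidz2 vdev's in place, so
this layout would mean destroying the pool, recreating it, and
re-copying the data from the masters. Purely as a sketch -- the
pairings below are arbitrary, and I am reusing your existing gpart
labels plus the zd050-zd055 labels from your 'zpool add -n' command:

  # zpool create -m /pool pool \
      mirror label/zd032 label/zd033 \
      mirror label/zd034 label/zd035 \
      mirror label/zd036 label/zd037 \
      mirror label/zd038 label/zd039 \
      mirror label/zd040 label/zd041 \
      mirror label/zd042 label/zd043 \
      mirror label/zd044 label/zd045 \
      mirror label/zd046 label/zd047 \
      mirror label/zd048 label/zd049 \
      mirror label/zd050 label/zd051 \
      mirror label/zd052 label/zd053 \
      mirror label/zd054 label/zd055

That gives nine 5.5 TB mirrors plus three 12 TB mirrors, striped, for
the 49.5 + 36 = 85.5 TB above.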
AIUI this architecture has another benefit -- incremental pool growth.
You replace one 5.5 TB drive in a mirror with a 12 TB drive, resilver,
replace the other 5.5 TB drive in the same mirror with another 12 TB
drive, resilver, and now the pool is 6.5 TB larger. In the long run,
you end up with twenty-four 12 TB drives (144 TB pool). The process
could then be repeated (or preempted) using even bigger drives.
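A sketch of one such upgrade, assuming autoexpand is enabled on the pool
so the mirror grows once both sides have been replaced ('new0' and
'new1' are hypothetical labels for the two 12 TB replacements):

  # zpool set autoexpand=on pool
  # zpool replace pool label/zd032 label/new0
  (wait for the resilver to complete; check with 'zpool status pool')
  # zpool replace pool label/zd033 label/new1
  (when this resilver completes, that mirror -- and the pool -- grows
  by 6.5 TB)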
David