Re: zfs replication tool

From: David Christensen <dpchrist_at_holgerdanske.com>
Date: Fri, 23 Sep 2022 22:09:25 UTC

On 9/23/22 02:25, Julien Cigar wrote:
> On Fri, Sep 23, 2022 at 01:57:54AM -0700, David Christensen wrote:
>> On 9/23/22 01:03, Julien Cigar wrote:
>>
>>> I'm still getting replication errors with zrepl on my test installation,
>>> see https://github.com/zrepl/zrepl/issues/631 for details. The initial
>>> transfer works, but after that I'm getting weird "cannot receive
>>> incremental stream: destination ..." messages, even though all snapshots
>>> have been preserved on both sides.
>>
>> Does zrepl have xtrace, verbose, debug, etc., options?
>>
> 
> yes, but I haven't found an explanation why some snapshots aren't
> transferred
> 
>>
>> I wrote homebrew scripts to automate ZFS replication and I saw that error
>> message many times in the past.  The solution was to add the '-F' option to
>> the 'zfs receive' commands.  Is there a way to do this with zrepl?
>>
> 
> I'm not sure, and even if there was the possibility I'd like to
> understand why it is needed as all snapshots are there. Rollback to the
> most recent snapshot shouldn't be necessary in this case..

On 9/23/22 02:52, Julien Cigar wrote:
 > mmh could it be because target dataset is mounted and because of atime
 > (or ...) issues...?


AIUI FreeBSD < 13 uses the older illumos-derived ZFS and FreeBSD >= 13 
uses OpenZFS (the former ZoL codebase).  The difference may matter.  
Please run and post:

$ freebsd-version ; uname -a


For my ZFS replication backups, I used to import the destination pool 
read-write on the backup server at boot time via /etc/rc.conf, and I 
used to see the "cannot receive incremental stream: destination 
pool/dataset has been modified" error message.  Now I run a script to 
import the destination pool using the -R (altroot) and '-o ro' (mount 
filesystems read-only) options.  I am considering trying the -N (do not 
mount filesystems) option instead of '-o ro'.  Now I typically see the 
error message only when I have modified the topology and/or snapshots 
of the live or backup datasets (i.e. the cause of the error message is 
known).


If you see an unexpected error message again, please fully document it 
and post the documentation somewhere (redact confidential information as 
needed):

1.  Listing of snapshots of involved datasets; source and destination.

2.  Properties of involved pools and dataset(s); source and destination.

3.  zrepl command, arguments, options, configuration, standard input, 
standard output, standard error, logs, reports, whatever, with xtrace, 
verbose, debug, etc., enabled.
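
Items 1 and 2 could be gathered with something like the sketch below.  
The dataset name "tank/data" is hypothetical, and the choice of columns 
(name, guid, creation) is only a suggestion; matching guid values on 
source and destination indicate the same snapshot.  The function prints 
the commands rather than running them:

```shell
#!/bin/sh
# Sketch only: print (do not run) commands that collect snapshot listings
# and pool/dataset properties.  The dataset name below is hypothetical.

# diag_cmds DATASET
diag_cmds() {
    dataset=$1
    pool=${dataset%%/*}    # pool name is everything before the first "/"
    echo "zfs list -r -t snapshot -o name,guid,creation $dataset"
    echo "zpool get all $pool"
    echo "zfs get all $dataset"
}

diag_cmds tank/data
```

Run it on both hosts and keep the output, e.g. 
'diag_cmds tank/data | sh > source-report.txt 2>&1'.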


On 9/23/22 04:24, Julien Cigar wrote:
 > it looks like there is no way to specify a -F on the receive side with
 > zrepl.. my problem looks like
 > https://github.com/zrepl/zrepl/issues/408 issue.


I am unsure if a lack of '-F' is a bug or a feature.  zrepl is far more 
sophisticated than anything I do; their model might eliminate the need 
for '-F'.
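
For comparison, a hand-rolled incremental transfer with and without 
'-F' might look like the sketch below.  This is plain zfs send/receive, 
not anything zrepl does internally, and all dataset and snapshot names 
are hypothetical.  The function only prints the pipeline:

```shell
#!/bin/sh
# Sketch only: print (do not run) an incremental send/receive pipeline.
# All dataset and snapshot names below are hypothetical.

# repl_cmd SRC FROM TO DST [force]
# With "force" as the fifth argument, 'zfs receive' gets -F, which rolls the
# destination back to its most recent snapshot before receiving.
repl_cmd() {
    src=$1
    from=$2
    to=$3
    dst=$4
    flags=""
    if [ "${5:-}" = "force" ]; then
        flags=" -F"
    fi
    echo "zfs send -i ${src}@${from} ${src}@${to} | zfs receive${flags} ${dst}"
}

repl_cmd tank/data snap1 snap2 backup/data
repl_cmd tank/data snap1 snap2 backup/data force
```

Note that '-F' discards anything written to the destination since its 
last snapshot, so it is only appropriate when the destination is a pure 
replica.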


On 9/23/22 05:43, Julien Cigar wrote:
 > setting mountpoint=none on the target dataset fixed the issue.
 > Apparently mounting a dataset, even as read-only, somewhat "alters" it
 > and incremental snapshot replication fails afterwards.
 >
 > So I'm setting mountpoint=none


That sounds like a variation of the '-o ro' and '-N' themes.  '-o ro' 
has the advantage that you can see the endpoint filesystems and volumes; 
for recovery, validation, whatever.  I assume '-N' mounts neither the 
filesystem nor the snapshots, but mounting just the snapshots could be 
useful.  '-o mountpoint=none' would seem to have a similar practical 
effect to '-N'.
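
A sketch of the mountpoint=none approach, plus a property check that 
may help show whether a target has been touched.  The dataset name 
"backup/data" is hypothetical.  The 'written' property reports space 
written since the previous snapshot, so a non-zero value suggests the 
dataset has changed since then.  The function only prints the commands:

```shell
#!/bin/sh
# Sketch only: print (do not run) commands for a replication target.
# The dataset name below is hypothetical.

# target_cmds DATASET
target_cmds() {
    dataset=$1
    # Keep the dataset from being mounted at all.
    echo "zfs set mountpoint=none $dataset"
    # Show mount-related properties and the space written since the
    # previous snapshot.
    echo "zfs get -o property,value mounted,mountpoint,readonly,atime,written $dataset"
}

target_cmds backup/data
```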


 > and switched to clones and promotes


I played with clones years ago.  I do not recall trying 'promote'. 
Fortunately, I do not need clones.


David