Re: read and write back full disk to trigger relocation
Date: Tue, 30 May 2023 23:16:09 UTC
On 5/30/23 02:18, Sysadmin Lists wrote:
> David Christensen May 29, 2023, 4:12:24 PM
>> Testing dd(1) and gmirror(8):
>>
>> 2023-05-29 15:21:32 toor@vf1 ~
>> # freebsd-version ; uname -a
>> 12.4-RELEASE-p2
>> FreeBSD vf1.tracy.holgerdanske.com 12.4-RELEASE-p1 FreeBSD
>> 12.4-RELEASE-p1 GENERIC amd64
>>
>> 2023-05-29 15:23:05 toor@vf1 ~
>> # gmirror label mymirror ada3 ada4
>>
>> 2023-05-29 15:24:11 toor@vf1 ~
>> # gmirror status mymirror
>>            Name    Status  Components
>> mirror/mymirror  COMPLETE  ada3 (ACTIVE)
>>                            ada4 (ACTIVE)
>>
>> 2023-05-29 15:52:41 toor@vf1 ~
>> # dd if=/dev/ada3 of=/dev/ada3 bs=1m
>> dd: /dev/ada3: Operation not permitted
>>
>> 2023-05-29 15:53:45 toor@vf1 ~
>> # dd if=/dev/ada4 of=/dev/ada4 bs=1m
>> dd: /dev/ada4: Operation not permitted
>>
>> 2023-05-29 15:53:52 toor@vf1 ~
>> # dd if=/dev/mirror/mymirror of=/dev/mirror/mymirror bs=1m
>> 1023+1 records in
>> 1023+1 records out
>> 1073741312 bytes transferred in 3.299006 secs (325474224 bytes/sec)
>>
>> This confirms that the kernel will not allow writes to mirror
>> components when they are active, as it should.  If a process could
>> write to a component of a mirror, that would bypass the mirror driver,
>> defeat the purpose of the mirror, allow race conditions, and result in
>> data loss/data corruption.
>
> That makes sense.  I wouldn't recommend running it on a live system
> anyway.  Probably wiser to boot into a livecd and run it on a single
> disk.  gmirror shouldn't notice a difference since the data isn't
> presently corrupted, just decaying (is my guess).  3TB is a lot of
> data to process.

I also prefer to do disk maintenance activities when the disks are
off-line, typically by booting alternate media (such as a live USB
stick).

I did the above testing on VirtualBox on Debian by creating two 1 GB
virtual disks backed by files.  When I created the virtual disks, I
chose "Dynamic" sizing -- i.e., the backing files start small and grow
as data is added.
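The allocate-on-first-write behavior of a "Dynamic" virtual disk can be illustrated with an ordinary sparse file (a .vdi is a more complex format, but the growth-on-write idea is the same).  A minimal sketch, not from the thread; the helper names are my own:

```python
import os
import tempfile

def make_sparse(path, logical_size=1 << 30):
    """Create a file with a large logical size but no allocated data blocks,
    analogous to a freshly created Dynamic virtual disk."""
    with open(path, "wb") as f:
        f.truncate(logical_size)   # sets st_size; allocates nothing

def allocated_bytes(path):
    return os.stat(path).st_blocks * 512   # st_blocks is in 512-byte units

fd, path = tempfile.mkstemp()
os.close(fd)
make_sparse(path)

before = os.stat(path)
alloc_before = allocated_bytes(path)
print(before.st_size)    # 1073741824 -- the full "virtual" size
print(alloc_before)      # near zero: no data has been written yet

# Writing real data allocates blocks and updates mtime -- which is why
# stat(1) on the backing .vdi files is a usable "did anything reach the
# backing store" probe.
with open(path, "r+b") as f:
    f.write(b"\xff" * 4096)
    f.flush()
    os.fsync(f.fileno())

after = os.stat(path)
alloc_after = allocated_bytes(path)
os.unlink(path)
```

Exact block counts depend on the filesystem, but `alloc_before` is far smaller than the logical size, and both the allocation and the mtime move only after real data is written.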
I have since noted that the size, mtime, and atime on the backing files
have not changed since the files were created:

2023-05-30 15:56:36 dpchrist@taz ~/virtualbox/virtual-machines/vf1
$ stat vf1_?.vdi
  File: vf1_3.vdi
  Size: 2097152   Blocks: 16   IO Block: 4096   regular file
Device: fe02h/65026d  Inode: 392462  Links: 1
Access: (0600/-rw-------)  Uid: (13250/dpchrist)  Gid: (13250/dpchrist)
Access: 2023-05-29 15:20:35.292781334 -0700
Modify: 2023-05-29 15:19:51.553228088 -0700
Change: 2023-05-29 15:19:51.553228088 -0700
 Birth: 2023-05-29 15:13:28.182411743 -0700
  File: vf1_4.vdi
  Size: 2097152   Blocks: 16   IO Block: 4096   regular file
Device: fe02h/65026d  Inode: 392466  Links: 1
Access: (0600/-rw-------)  Uid: (13250/dpchrist)  Gid: (13250/dpchrist)
Access: 2023-05-29 15:20:35.292781334 -0700
Modify: 2023-05-29 15:19:51.553228088 -0700
Change: 2023-05-29 15:19:51.553228088 -0700
 Birth: 2023-05-29 15:13:44.630780217 -0700

If I do the dd(1) command again with O_DIRECT:

2023-05-30 15:59:06 toor@vf1 ~
# dd if=/dev/mirror/mymirror of=/dev/mirror/mymirror bs=1m oflag=direct
1023+1 records in
1023+1 records out
1073741312 bytes transferred in 3.465168 secs (309867017 bytes/sec)

The size, mtime, and atime still do not change:

2023-05-30 15:59:55 dpchrist@taz ~/virtualbox/virtual-machines/vf1
$ stat vf1_?.vdi
  File: vf1_3.vdi
  Size: 2097152   Blocks: 16   IO Block: 4096   regular file
Device: fe02h/65026d  Inode: 392462  Links: 1
Access: (0600/-rw-------)  Uid: (13250/dpchrist)  Gid: (13250/dpchrist)
Access: 2023-05-29 15:20:35.292781334 -0700
Modify: 2023-05-29 15:19:51.553228088 -0700
Change: 2023-05-29 15:19:51.553228088 -0700
 Birth: 2023-05-29 15:13:28.182411743 -0700
  File: vf1_4.vdi
  Size: 2097152   Blocks: 16   IO Block: 4096   regular file
Device: fe02h/65026d  Inode: 392466  Links: 1
Access: (0600/-rw-------)  Uid: (13250/dpchrist)  Gid: (13250/dpchrist)
Access: 2023-05-29 15:20:35.292781334 -0700
Modify: 2023-05-29 15:19:51.553228088 -0700
Change: 2023-05-29 15:19:51.553228088 -0700
 Birth: 2023-05-29 15:13:44.630780217 -0700

So, either FreeBSD or VirtualBox is optimizing away the write(2) calls,
perhaps because the write buffer matches what is already in a memory
cache from prior read(2) calls (?).

I would say the experiment should be repeated on real HDDs, but how do I
detect whether identical data has actually been written to the platters?
The HDD controller also has a cache and could optimize away such writes.
One idea would be to read into a buffer, invert the bits in the buffer,
write the buffer, invert the bits again, and write again.

These are the kinds of issues that the disk manufacturer is supposed to
solve.  Hence my first response: "I would look for a manufacturer
diagnostic tool".

David
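The invert-and-restore idea above can be sketched as follows.  This is a minimal illustration, not code from the thread; `rewrite_with_inversion` and the 1 MiB block size are my own choices, and note that fsync(2) only guarantees the data reaches the device -- whether the drive's own cache then elides the physical write is exactly the open question raised above.

```python
import os

BLOCK = 1 << 20  # 1 MiB, matching dd bs=1m

def rewrite_with_inversion(path):
    """Force physically different data onto the medium, then restore it.

    For each block: read, write the bitwise complement, flush, then
    write the original back and flush again.  Because the complement
    never matches any cached copy of the block, neither the OS page
    cache nor a content-comparing layer can elide the first write.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        offset = 0
        while True:
            buf = os.pread(fd, BLOCK, offset)
            if not buf:            # EOF
                break
            inverted = bytes(b ^ 0xFF for b in buf)
            os.pwrite(fd, inverted, offset)
            os.fsync(fd)           # push the complement out
            os.pwrite(fd, buf, offset)
            os.fsync(fd)           # push the original back
            offset += len(buf)
    finally:
        os.close(fd)
```

Note that this is emphatically not crash-safe: a power loss between the two writes leaves that block inverted on disk, so it should only be run on a disk whose contents are backed up.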