gmirror bugs, how many?
João Carlos Mendes Luís
jonny at jonny.eng.br
Wed Nov 24 13:30:12 PST 2004
Hi,
I am blindly testing gmirror, just for fun. I took an old 8G drive
and ran some tests, and I may have found a bug in gmirror. This is a long
message, but please read it to the end if you are a gmirror or GEOM hacker.
First, I partitioned the disk (fdisk) for a full FreeBSD system with
sysinstall, which gave me this:
******* Working on device /dev/ad1 *******
parameters extracted from in-core disklabel are:
cylinders=16368 heads=16 sectors/track=63 (1008 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=16368 heads=16 sectors/track=63 (1008 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 16498881 (8056 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 15/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
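A quick sanity check on the fdisk figures above, as a minimal sh sketch. The variable names are only illustrative and the numbers are copied from the output. The geometry gives the total number of sectors on ad1, and the slice should end exactly where the disk ends.

```sh
#!/bin/sh
# Figures copied from the fdisk output above; variable names are illustrative.
cylinders=16368
blks_per_cyl=1008        # 16 heads * 63 sectors/track
slice_start=63
slice_size=16498881

# Total sectors according to the geometry, and the first sector after the slice.
disk_sectors=$((cylinders * blks_per_cyl))
slice_end=$((slice_start + slice_size))

echo "disk sectors: $disk_sectors"
echo "slice end:    $slice_end"
echo "disk bytes:   $((disk_sectors * 512))"
```

Both values come out to 16498944 sectors, that is 8447459328 bytes, which is the Mediasize that gmirror list reports for ad1 further down.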
Then I tried to create a single-disk gmirror on the whole ad1 disk:
sigesc::root jcmendes [531] gmirror list
sigesc::root jcmendes [532] gmirror label -b load -v vol0 ad1
Metadata value stored on ad1.
Done.
sigesc::root jcmendes [533] gmirror list
Geom name: vol0
State: COMPLETE
Components: 1
Balance: load
Slice: 4096
Flags: NONE
SyncID: 1
ID: 1397575407
Providers:
1. Name: mirror/vol0
Mediasize: 8447458816 (7.9G)
Sectorsize: 512
Mode: r0w0e0
Consumers:
1. Name: ad1
Mediasize: 8447459328 (7.9G)
Sectorsize: 512
Mode: r0w0e0
State: ACTIVE
Priority: 0
Flags: NONE
SyncID: 1
ID: 3966559351
Geom name: vol0.sync
sigesc::root jcmendes [534] ls -l /dev/mirror/
total 1
dr-xr-xr-x 2 root wheel 512 Nov 24 18:45 .
dr-xr-xr-x 5 root wheel 512 Nov 24 18:45 ..
crw-r----- 1 root operator 4, 50 Nov 24 18:45 vol0
crw-r----- 1 root operator 4, 51 Nov 24 18:45 vol0s1
crw-r----- 1 root operator 4, 52 Nov 24 18:45 vol0s1a
crw-r----- 1 root operator 4, 53 Nov 24 18:45 vol0s1b
crw-r----- 1 root operator 4, 54 Nov 24 18:45 vol0s1c
crw-r----- 1 root operator 4, 55 Nov 24 18:45 vol0s1d
sigesc::root jcmendes [535] fdisk /dev/mirror/vol0
******* Working on device /dev/mirror/vol0 *******
parameters extracted from in-core disklabel are:
cylinders=1027 heads=255 sectors/track=63 (16065 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=1027 heads=255 sectors/track=63 (16065 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 16498881 (8056 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 15/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
sigesc::root jcmendes [536]
Apparently, everything is fine up to here. But now:
sigesc::root jcmendes [536] disklabel /dev/mirror/vol0s1
# /dev/mirror/vol0s1:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 63 4.2BSD 2048 16384 8
b: 1048576 1048639 swap
c: 16498881 63 unused 0 0 # "raw" part, don't edit
d: 14401729 2097215 4.2BSD 2048 16384 28552
partition c: partition extends past end of unit
disklabel: partition c doesn't start at 0!
disklabel: An incorrect partition c may cause problems for standard system utilities
partition d: partition extends past end of unit
sigesc::root jcmendes [537]
Obviously, this cannot be correct.
I tried to check the base disk, but:
sigesc::root jcmendes [542] disklabel /dev/ad1s1
disklabel: /dev/ad1s1: No such file or directory
sigesc::root jcmendes [543] ls -l /dev/ad1*
crw-r----- 1 root operator 4, 16 Nov 24 18:58 /dev/ad1
sigesc::root jcmendes [544]
Hey, where are the base partition slices?
Now, let's reboot. I could not unload geom_mirror, since it was
preloaded during boot; is this expected? The device could not be
unloaded, but the volume disappeared (gmirror list, ls /dev/mirror).
This is surely not good. That's why I rebooted. Bug #1.
After the reboot, the device is back (gmirror list). And,
surprise, the disklabel is magically corrected:
sigesc::root jcmendes [504] disklabel mirror/vol0s1
# /dev/mirror/vol0s1:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 0 4.2BSD 2048 16384 8
b: 1048576 1048576 swap
c: 16498881 0 unused 0 0 # "raw" part, don't edit
d: 14401729 2097152 4.2BSD 2048 16384 28552
sigesc::root jcmendes [505]
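One way to read the two labels: the offsets in the first one look like absolute disk offsets (shifted by the 63-sector start of the slice), while the second one has offsets relative to the slice. Below is a minimal sh sketch of the bounds check, with the numbers copied from the two outputs above. The function name check_part is made up for illustration, and this is not a claim about how disklabel itself is implemented.

```sh
#!/bin/sh
# The unit is the slice vol0s1: 16498881 sectors (size reported by fdisk).
unit_size=16498881

# check_part NAME SIZE OFFSET: report whether the partition fits in the unit.
# Hypothetical helper, only mimicking the "extends past end of unit" check.
check_part() {
    end=$(($2 + $3))
    if [ "$end" -gt "$unit_size" ]; then
        echo "$1: ends at $end, past end of unit"
    else
        echo "$1: ends at $end, ok"
    fi
}

echo "before reboot (offsets shifted by 63):"
check_part a 1048576 63
check_part b 1048576 1048639
check_part c 16498881 63
check_part d 14401729 2097215

echo "after reboot (offsets relative to the slice):"
check_part a 1048576 0
check_part b 1048576 1048576
check_part c 16498881 0
check_part d 14401729 2097152
```

With the shifted offsets only c and d run past the end of the unit, which matches the two "extends past end of unit" warnings; with the relative offsets every partition fits.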
Ok, now let's try something different. Let's suppose that I only
want one slice mirrored. Maybe the other slices could be standalone, or
striped; this is not important now. Let's just say I want to mirror
ad1s1 instead of the whole ad1.
sigesc::root jcmendes [506] gmirror remove vol0 ad1
sigesc::root jcmendes [507] gmirror label -b load -v vol0 ad1s1
Metadata value stored on ad1s1.
Done.
sigesc::root jcmendes [508] gmirror list
Geom name: vol0
State: COMPLETE
Components: 1
Balance: load
Slice: 4096
Flags: NONE
SyncID: 1
ID: 3056186377
Providers:
1. Name: mirror/vol0
Mediasize: 8447426560 (7.9G)
Sectorsize: 512
Mode: r0w0e0
Consumers:
1. Name: ad1
Mediasize: 8447459328 (7.9G)
Sectorsize: 512
Mode: r0w0e0
State: ACTIVE
Priority: 0
Flags: NONE
SyncID: 1
ID: 4157180820
Geom name: vol0.sync
sigesc::root jcmendes [509]
Note that the volume size is now different: 8447426560 instead of
8447458816 for the previous config. The difference is 32256 bytes, or 63
sectors, which is the start offset of the slice. It's apparently ok.
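The arithmetic can be checked with a few lines of sh. The sketch below also assumes that gmirror keeps its metadata in the last sector of the consumer, which would explain why each mirror is one sector smaller than what it sits on. The variable names are illustrative; the numbers come from the gmirror list and fdisk outputs.

```sh
#!/bin/sh
sector=512
ad1_bytes=8447459328      # Mediasize of consumer ad1 (gmirror list)
vol0_disk=8447458816      # mirror/vol0 when labeled on ad1
vol0_slice=8447426560     # mirror/vol0 when labeled on ad1s1
slice_sectors=16498881    # size of slice 1 (fdisk)

# Difference between the two mirror sizes, in bytes and in sectors.
diff_bytes=$((vol0_disk - vol0_slice))
echo "difference: $diff_bytes bytes = $((diff_bytes / sector)) sectors"

# Assumption: one sector at the end of the consumer is reserved for metadata.
echo "expected on ad1:   $((ad1_bytes - sector))"
echo "expected on ad1s1: $(( (slice_sectors - 1) * sector ))"
```

Under that assumption both expected values match the Mediasize lines exactly, so the second mirror really has the size of the slice minus one sector.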
But the consumer name is still ad1, and not ad1s1. Hey, let's check:
sigesc::root jcmendes [510] dd count=1 if=/dev/ad1 of=/tmp/1
1+0 records in
1+0 records out
512 bytes transferred in 0.038226 secs (13394 bytes/sec)
sigesc::root jcmendes [511] dd count=1 if=/dev/mirror/vol0 of=/tmp/2
1+0 records in
1+0 records out
512 bytes transferred in 0.000713 secs (717982 bytes/sec)
sigesc::root jcmendes [512] cmp /tmp/1 /tmp/2
sigesc::root jcmendes [513] dd count=1 if=/dev/ad1 skip=63 of=/tmp/1
1+0 records in
1+0 records out
512 bytes transferred in 0.000655 secs (781471 bytes/sec)
sigesc::root jcmendes [514] cmp /tmp/1 /tmp/2
/tmp/1 /tmp/2 differ: char 1, line 1
sigesc::root jcmendes [515]
Oops. It seems that gmirror got the right size and the wrong
offset. And I did not need to do all this; I could simply have used ls:
sigesc::root jcmendes [516] ls -l /dev/mirror/
total 1
dr-xr-xr-x 2 root wheel 512 Nov 24 19:06 .
dr-xr-xr-x 5 root wheel 512 Nov 24 19:06 ..
crw-r----- 1 root operator 4, 33 Nov 24 19:06 vol0
crw-r----- 1 root operator 4, 34 Nov 24 19:06 vol0s1
crw-r----- 1 root operator 4, 35 Nov 24 19:06 vol0s1a
crw-r----- 1 root operator 4, 36 Nov 24 19:06 vol0s1b
crw-r----- 1 root operator 4, 37 Nov 24 19:06 vol0s1c
crw-r----- 1 root operator 4, 38 Nov 24 19:06 vol0s1d
sigesc::root jcmendes [517]
If gmirror were only mirroring the ad1s1 slice, it should not see
new slices inside. I would expect to find only vol0 and vol0[abcd]...
Disklabel is still crazy, and fdisk detects slices it shouldn't:
sigesc::root jcmendes [518] disklabel /dev/mirror/vol0s1
# /dev/mirror/vol0s1:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 63 4.2BSD 2048 16384 8
b: 1048576 1048639 swap
c: 16498881 63 unused 0 0 # "raw" part, don't edit
d: 14401729 2097215 4.2BSD 2048 16384 28552
partition c: partition extends past end of unit
disklabel: partition c doesn't start at 0!
disklabel: An incorrect partition c may cause problems for standard system utilities
partition d: partition extends past end of unit
sigesc::root jcmendes [519] fdisk /dev/mirror/vol0
******* Working on device /dev/mirror/vol0 *******
parameters extracted from in-core disklabel are:
cylinders=1027 heads=255 sectors/track=63 (16065 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=1027 heads=255 sectors/track=63 (16065 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 16498881 (8056 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 15/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
sigesc::root jcmendes [520]
Now let's reboot again.
sigesc::root jcmendes [503] disklabel /dev/mirror/vol0s1a
# /dev/mirror/vol0s1a:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 63 4.2BSD 2048 16384 8
b: 1048576 1048639 swap
c: 16498881 63 unused 0 0 # "raw" part, don't edit
d: 14401729 2097215 4.2BSD 2048 16384 28552
partition a: partition extends past end of unit
partition b: offset past end of unit
partition b: partition extends past end of unit
partition c: partition extends past end of unit
disklabel: partition c doesn't start at 0!
disklabel: partition c doesn't cover the whole unit!
disklabel: An incorrect partition c may cause problems for standard system utilities
partition d: offset past end of unit
partition d: partition extends past end of unit
sigesc::root jcmendes [504]
This time, the disklabel did not return to its "good" state. And
the offset bug is repeatable:
sigesc::root jcmendes [507] dd count=1 if=/dev/ad1 of=/tmp/1
1+0 records in
1+0 records out
512 bytes transferred in 0.000647 secs (791553 bytes/sec)
sigesc::root jcmendes [508] dd count=1 if=/dev/mirror/vol0 of=/tmp/2
1+0 records in
1+0 records out
512 bytes transferred in 0.000777 secs (658939 bytes/sec)
sigesc::root jcmendes [509] cmp /tmp/1 /tmp/2
sigesc::root jcmendes [510]
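The dd/cmp comparison used above can be wrapped in a small helper. This is only a sketch: the function name same_sector and the temporary file names are made up, and it does nothing beyond running the same dd and cmp commands with an explicit 512-byte block size.

```sh
#!/bin/sh
# same_sector DEV1 SKIP1 DEV2 SKIP2
# Exit status 0 if the 512-byte sector SKIP1 of DEV1 is identical to
# sector SKIP2 of DEV2, 1 if they differ, 2 if a sector could not be read.
# Hypothetical helper; it only wraps the dd and cmp commands used above.
same_sector() {
    t1=$(mktemp /tmp/sector1.XXXXXX) || return 2
    t2=$(mktemp /tmp/sector2.XXXXXX) || { rm -f "$t1"; return 2; }
    dd if="$1" of="$t1" bs=512 skip="$2" count=1 2>/dev/null
    dd if="$3" of="$t2" bs=512 skip="$4" count=1 2>/dev/null
    if [ ! -s "$t1" ] || [ ! -s "$t2" ]; then
        rc=2                      # nothing was read from one of the devices
    else
        cmp -s "$t1" "$t2"
        rc=$?
    fi
    rm -f "$t1" "$t2"
    return $rc
}

# Usage on the devices from this message (run as root):
#   same_sector /dev/ad1 0  /dev/mirror/vol0 0 && echo "mirror starts at sector 0 of ad1"
#   same_sector /dev/ad1 63 /dev/mirror/vol0 0 && echo "mirror starts at the slice"
```

If the mirror really sat on ad1s1, the second call should succeed and the first should fail; the transcripts above show the opposite.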
At least, the behaviour of the slice detection on main disk ad1
seems to be ok. The slices reappear if I remove the mirror partition.
sigesc::root jcmendes [513] ls -l /dev/ad1*
crw-r----- 1 root operator 4, 16 Nov 24 19:20 /dev/ad1
sigesc::root jcmendes [514] gmirror remove -v vol0 ad1
Done.
sigesc::root jcmendes [515] gmirror list
sigesc::root jcmendes [516] ls -l /dev/ad1*
crw-r----- 1 root operator 4, 16 Nov 24 19:20 /dev/ad1
crw-r----- 1 root operator 4, 24 Nov 24 19:20 /dev/ad1s1
crw-r----- 1 root operator 4, 25 Nov 24 19:20 /dev/ad1s1a
crw-r----- 1 root operator 4, 26 Nov 24 19:20 /dev/ad1s1b
crw-r----- 1 root operator 4, 27 Nov 24 19:20 /dev/ad1s1c
crw-r----- 1 root operator 4, 28 Nov 24 19:20 /dev/ad1s1d
sigesc::root jcmendes [517]
Now the big question: what is the expected behaviour when mirroring
a slice? Whichever answer you give me, I'm sure the current behaviour
is not right. So, this must be a bug. Bug #2.
Is there any gmirror hacker around to fix these?