[Bug 269261] data corruption with fspacectl and mmap

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 31 Jan 2023 00:35:01 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269261

            Bug ID: 269261
           Summary: data corruption with fspacectl and mmap
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

When a file is read and written through mmap, intermixed with fspacectl calls,
data corruption can occur.  It looks like a caching bug: data written via mmap
does not seem to be evicted from the cache by fspacectl, and is subsequently
returned by mmap reads.  I can reliably reproduce this bug on UFS and fusefs,
but not on ZFS, tmpfs or fusefs-ext2.
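
To make the suspected failure mode concrete, here is a toy in-memory model of
it.  It is only an illustration of the hypothesis above, not FreeBSD kernel
code: the ToyFile type, its per-byte "cache" and the invalidate flag are all
made up for this sketch.

```rust
/// Toy model of a file with a cache in front of its backing store.
/// Everything here is hypothetical and heavily simplified (the cache is
/// tracked per byte rather than per page).
struct ToyFile {
    backing: Vec<u8>,       // contents of the "disk"
    cache: Vec<Option<u8>>, // cached bytes; None means not cached
}

impl ToyFile {
    fn new(len: usize) -> Self {
        ToyFile { backing: vec![0; len], cache: vec![None; len] }
    }

    /// A write through a mapping only dirties the cache in this model.
    fn mapwrite(&mut self, off: usize, data: &[u8]) {
        for (i, b) in data.iter().enumerate() {
            self.cache[off + i] = Some(*b);
        }
    }

    /// Hole punch: zero the backing store.  `invalidate` selects between
    /// the expected behavior (true) and the suspected buggy one (false).
    fn punch_hole(&mut self, off: usize, len: usize, invalidate: bool) {
        for i in off..off + len {
            self.backing[i] = 0;
            if invalidate {
                self.cache[i] = None;
            }
        }
    }

    /// A read through a mapping prefers cached bytes over the backing store.
    fn mapread(&self, off: usize, len: usize) -> Vec<u8> {
        (off..off + len)
            .map(|i| self.cache[i].unwrap_or(self.backing[i]))
            .collect()
    }
}

fn main() {
    for invalidate in [true, false] {
        let mut f = ToyFile::new(16);
        f.mapwrite(4, &[0x4f; 8]);
        f.punch_hole(6, 4, invalidate);
        println!("invalidate={}: {:02x?}", invalidate, f.mapread(6, 4));
    }
}
```

With invalidate set to false the model returns the old mapped data from inside
the punched range, which is the shape of the miscompare reported below.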

Steps to reproduce:
0) Install a Rust toolchain
1) check out https://github.com/asomers/fsx-rs.git at rev c3e726d
2) cd fsx-rs
3) cargo build
4) truncate -s 1g /tmp/ufs.img
5) sudo mdconfig -a -t vnode -f /tmp/ufs.img
6) sudo newfs /dev/md0
7) sudo mount /dev/md0 /mnt
8) sudo mkdir /mnt/tmp
9) sudo chmod 1777 /mnt/tmp
10) cat <<HERE > fsx.toml
  nomsyncafterwrite = true
  [weights]
  close_open = 0.1
  invalidate = 0.2
  truncate = 1
  fsync = 1
  fdatasync = 1
  punch_hole = 100
HERE
11) env RUST_LOG=debug cargo run -- -f fsx.toml -N 1000 -P /tmp -S 2153242284826767701 /mnt/tmp/fsx.bin

The output will look like this:
[INFO  fsx] Using seed 2153242284826767701
[DEBUG fsx]   1 skipping zero size hole punch
[DEBUG fsx]   2 skipping zero size hole punch
[DEBUG fsx]   3 skipping zero size hole punch
[DEBUG fsx]   4 skipping zero size hole punch
[DEBUG fsx]   5 skipping zero size read
[DEBUG fsx]   6 skipping zero size hole punch
[DEBUG fsx]   7 skipping zero size hole punch
[DEBUG fsx]   8 skipping zero size hole punch
[DEBUG fsx]   9 skipping zero size hole punch
[DEBUG fsx]  10 skipping zero size hole punch
[DEBUG fsx]  11 skipping zero size hole punch
[INFO  fsx]  12 mapwrite 0x1ffb4 .. 0x280a4 ( 0x80f1 bytes)
[INFO  fsx]  13 punch_hole  0xe4e5 .. 0x185c2 ( 0xa0de bytes)
[INFO  fsx]  14 punch_hole 0x27eae .. 0x280a4 (  0x1f7 bytes)
[INFO  fsx]  15 punch_hole  0xee3c .. 0x17541 ( 0x8706 bytes)
[INFO  fsx]  16 punch_hole 0x24fc5 .. 0x280a4 ( 0x30e0 bytes)
[INFO  fsx]  17 punch_hole 0x27dc2 .. 0x280a4 (  0x2e3 bytes)
[INFO  fsx]  18 punch_hole 0x14efa .. 0x16be0 ( 0x1ce7 bytes)
[INFO  fsx]  19 mapread   0xe210 .. 0x1153c ( 0x332d bytes)
[INFO  fsx]  20 mapread  0x1159f .. 0x1cb3d ( 0xb59f bytes)
[INFO  fsx]  21 mapread  0x16252 .. 0x21bd3 ( 0xb982 bytes)
[INFO  fsx]  22 punch_hole  0x2c14 ..  0x2d44 (  0x131 bytes)
[INFO  fsx]  23 punch_hole  0xc1b4 .. 0x18eed ( 0xcd3a bytes)
[INFO  fsx]  24 mapwrite 0x36f14 .. 0x3ffff ( 0x90ec bytes)
[INFO  fsx]  25 read      0xe4a9 .. 0x16bf9 ( 0x8751 bytes)
[INFO  fsx]  26 punch_hole  0xeedd .. 0x13904 ( 0x4a28 bytes)
[INFO  fsx]  27 mapwrite 0x2a9e0 .. 0x2c675 ( 0x1c96 bytes)
[INFO  fsx]  28 punch_hole 0x13374 .. 0x1f95e ( 0xc5eb bytes)
[INFO  fsx]  29 mapread   0xff83 .. 0x1bcb8 ( 0xbd36 bytes)
[INFO  fsx]  30 mapwrite 0x3cc44 .. 0x3ffff ( 0x33bc bytes)
[INFO  fsx]  31 mapwrite 0x14b65 .. 0x1969b ( 0x4b37 bytes)
[INFO  fsx]  32 write     0xcc6e .. 0x152f6 ( 0x8689 bytes)
[INFO  fsx]  33 write    0x30da5 .. 0x340ae ( 0x330a bytes)
[INFO  fsx]  34 punch_hole 0x3b300 .. 0x3ffff ( 0x4d00 bytes)
[INFO  fsx]  35 read     0x3d33c .. 0x3ffff ( 0x2cc4 bytes)
[INFO  fsx]  36 punch_hole 0x279cf .. 0x30304 ( 0x8936 bytes)
[INFO  fsx]  37 mapread  0x2441c .. 0x2d04e ( 0x8c33 bytes)
[ERROR fsx] miscompare: offset= 0x2441c, size = 0x8c33
[ERROR fsx] OFFSET  GOOD  BAD  RANGE  
[ERROR fsx] 0x24fc5 0x00 0x4f  0x3024
[ERROR fsx] Step# (mod 256) for a misdirected write may be 12
[ERROR fsx] LOG DUMP
[ERROR fsx]   0 SKIPPED  (punch_hole)
[ERROR fsx]   1 SKIPPED  (punch_hole)
[ERROR fsx]   2 SKIPPED  (punch_hole)
[ERROR fsx]   3 SKIPPED  (punch_hole)
[ERROR fsx]   4 SKIPPED  (read)
[ERROR fsx]   5 SKIPPED  (punch_hole)
[ERROR fsx]   6 SKIPPED  (punch_hole)
[ERROR fsx]   7 SKIPPED  (punch_hole)
[ERROR fsx]   8 SKIPPED  (punch_hole)
[ERROR fsx]   9 SKIPPED  (punch_hole)
[ERROR fsx]  10 SKIPPED  (punch_hole)
[ERROR fsx]  11 MAPWRITE 0x1ffb4 => 0x280a5 ( 0x80f1 bytes) HOLE
[ERROR fsx]  12 PUNCH_HOLE  0xe4e5 => 0x185c2 ( 0xa0de bytes)
[ERROR fsx]  13 PUNCH_HOLE 0x27eae => 0x280a4 (  0x1f7 bytes)
[ERROR fsx]  14 PUNCH_HOLE  0xee3c => 0x17541 ( 0x8706 bytes)
[ERROR fsx]  15 PUNCH_HOLE 0x24fc5 => 0x280a4 ( 0x30e0 bytes)
[ERROR fsx]  16 PUNCH_HOLE 0x27dc2 => 0x280a4 (  0x2e3 bytes)
[ERROR fsx]  17 PUNCH_HOLE 0x14efa => 0x16be0 ( 0x1ce7 bytes)
[ERROR fsx]  18 MAPREAD   0xe210 => 0x1153d ( 0x332d bytes)
[ERROR fsx]  19 MAPREAD  0x1159f => 0x1cb3e ( 0xb59f bytes)
[ERROR fsx]  20 MAPREAD  0x16252 => 0x21bd4 ( 0xb982 bytes)
[ERROR fsx]  21 PUNCH_HOLE  0x2c14 =>  0x2d44 (  0x131 bytes)
[ERROR fsx]  22 PUNCH_HOLE  0xc1b4 => 0x18eed ( 0xcd3a bytes)
[ERROR fsx]  23 MAPWRITE 0x36f14 => 0x40000 ( 0x90ec bytes) HOLE
[ERROR fsx]  24 READ      0xe4a9 => 0x16bfa ( 0x8751 bytes)
[ERROR fsx]  25 PUNCH_HOLE  0xeedd => 0x13904 ( 0x4a28 bytes)
[ERROR fsx]  26 MAPWRITE 0x2a9e0 => 0x2c676 ( 0x1c96 bytes)
[ERROR fsx]  27 PUNCH_HOLE 0x13374 => 0x1f95e ( 0xc5eb bytes)
[ERROR fsx]  28 MAPREAD   0xff83 => 0x1bcb9 ( 0xbd36 bytes)
[ERROR fsx]  29 MAPWRITE 0x3cc44 => 0x40000 ( 0x33bc bytes)
[ERROR fsx]  30 MAPWRITE 0x14b65 => 0x1969c ( 0x4b37 bytes)
[ERROR fsx]  31 WRITE     0xcc6e => 0x152f7 ( 0x8689 bytes)
[ERROR fsx]  32 WRITE    0x30da5 => 0x340af ( 0x330a bytes)
[ERROR fsx]  33 PUNCH_HOLE 0x3b300 => 0x3ffff ( 0x4d00 bytes)
[ERROR fsx]  34 READ     0x3d33c => 0x40000 ( 0x2cc4 bytes)
[ERROR fsx]  35 PUNCH_HOLE 0x279cf => 0x30304 ( 0x8936 bytes)
[ERROR fsx]  36 MAPREAD  0x2441c => 0x2d04f ( 0x8c33 bytes)
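
The expected ("GOOD") value can be cross-checked by replaying the logged
operations against a simple model.  The sketch below is not part of fsx-rs.
It assumes the offsets and lengths transcribed from the INFO lines above (with
their 1-based step numbers), treats write and mapwrite as storing non-zero
data, and treats punch_hole as zeroing its range.  The Op, LOG and last_writer
names are made up for this example.

```rust
#[derive(Clone, Copy)]
enum Op {
    Write,     // write or mapwrite: stores data
    PunchHole, // hole punch: the range must read back as zero
}

/// (step, op, offset, length), transcribed from the INFO lines.
/// Reads and skipped steps are omitted because they do not modify the file.
const LOG: &[(u8, Op, usize, usize)] = &[
    (12, Op::Write, 0x1ffb4, 0x80f1),
    (13, Op::PunchHole, 0xe4e5, 0xa0de),
    (14, Op::PunchHole, 0x27eae, 0x1f7),
    (15, Op::PunchHole, 0xee3c, 0x8706),
    (16, Op::PunchHole, 0x24fc5, 0x30e0),
    (17, Op::PunchHole, 0x27dc2, 0x2e3),
    (18, Op::PunchHole, 0x14efa, 0x1ce7),
    (22, Op::PunchHole, 0x2c14, 0x131),
    (23, Op::PunchHole, 0xc1b4, 0xcd3a),
    (24, Op::Write, 0x36f14, 0x90ec),
    (26, Op::PunchHole, 0xeedd, 0x4a28),
    (27, Op::Write, 0x2a9e0, 0x1c96),
    (28, Op::PunchHole, 0x13374, 0xc5eb),
    (30, Op::Write, 0x3cc44, 0x33bc),
    (31, Op::Write, 0x14b65, 0x4b37),
    (32, Op::Write, 0xcc6e, 0x8689),
    (33, Op::Write, 0x30da5, 0x330a),
    (34, Op::PunchHole, 0x3b300, 0x4d00),
    (36, Op::PunchHole, 0x279cf, 0x8936),
];

/// For every byte, record the step that last wrote it, or 0 if the byte
/// should read as zero (never written, or punched since the last write).
fn last_writer(size: usize) -> Vec<u8> {
    let mut model = vec![0u8; size];
    for &(step, op, off, len) in LOG {
        let tag = match op {
            Op::Write => step,
            Op::PunchHole => 0,
        };
        model[off..off + len].fill(tag);
    }
    model
}

fn main() {
    let model = last_writer(0x40000);
    // Miscompare range reported by fsx: offset 0x24fc5, length 0x3024.
    let (off, len) = (0x24fc5, 0x3024);
    let all_zero = model[off..off + len].iter().all(|&t| t == 0);
    println!("{:#x}..{:#x} expected to be zero: {}", off, off + len, all_zero);
    println!("byte at {:#x} last written by step {}", off - 1, model[off - 1]);
}
```

Under these assumptions the whole miscompare range should read as zero after
the hole punch in step 16, and the byte just before it was last written by the
mapwrite in step 12, which agrees with the step number fsx suggests.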

I also have a unit test that can reproduce this with fusefs in 3 operations.
