stable/9 + ZFS IS NOT ready, me thinks
Dennis Glatting
freebsd at pki2.com
Thu Oct 25 15:34:13 UTC 2012
At least that is what I suspect.
As I have previously mentioned, I have five servers running stable/9
with ZFS. Four are AMD systems (similar but not identical) and the
fifth is Intel. The AMD systems are the workhorses.
The AMDs have a long history of stalling under load. Specifically, the
kernel, keyboard, display, and network I/O are still there, but the
disks are stalled across all volumes, arrays, and disks (e.g., if I
enter a command not on the disks, such as on a memory disk, and
statically linked, the command will run, otherwise the command DOES NOT
run).
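To make "statically linked and not on the disks" concrete: FreeBSD's
/rescue binaries are statically linked, so copying a few of them onto a
memory disk ahead of time gives tools that still run during a stall. A
rough sketch, with the size and mount point purely illustrative:

mc# mdconfig -a -t swap -s 512m        # create a swap-backed memory disk (prints e.g. md0)
mc# newfs /dev/md0                     # put a filesystem on it
mc# mount /dev/md0 /mnt                # mount it somewhere the stalled pools aren't
mc# cp /rescue/ls /rescue/ps /mnt/     # statically linked tools now live in RAM
mc# /mnt/ls /mnt                       # during a stall this runs; anything touching the pools does not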
Over the last week I changed operating systems on two of these systems.
System #1 I downgraded to stable/8. On System #3 I installed CentOS 6.3
with ZFS-on-Linux (ZoL). These two systems have been running the same
job (2d17h on the first and 3d on the second) without trouble.
Previously, System #1 would have stalled within 48 hours, typically in
less than 12, and System #3 would spontaneously reboot whenever I tried
to send a data set to it via "zfs send".
On System #1 I found one of the OS disks, part of a hardware RAID1
array, was toast. I found and replaced that disk before I installed
8.3. You can argue the problem with stable/9 was that disk, but I don't
believe it, because I have the SAME problem across all four systems.
When a new set of disks arrives I plan to re-introduce stable/9 to that
system to see if the faulting returns. Also, smartd says I need to
update the firmware in some of my disks, which I plan to do this
weekend (below).
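The firmware notices are easy enough to confirm by hand; the device
name here is just an example:

mc# smartctl -i /dev/da0               # identity block, including the Firmware Version field
mc# smartctl -H -A /dev/da0            # overall health plus the attributes smartd watches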
Under ZoL and 8.3 the systems are more responsive than under stable/9.
For example, an "ls" of the busy data set returns data MUCH more
quickly under ZoL and 8.3; under stable/9 it sputters out the data.
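That comparison is informal; timing a listing of the busy data set
(the path is a placeholder) is enough to show the difference:

mc# time ls /disk-1/busy > /dev/null   # wall-clock time for a directory listing under load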
Here is the current load on System #1:
mc# top
last pid: 53918;  load averages: 73.73, 73.08, 72.81    up 2+17:58:24  08:16:47
61 processes: 10 running, 51 sleeping
CPU: 11.4% user, 46.0% nice, 42.6% system, 0.1% interrupt, 0.0% idle
Mem: 702M Active, 1003M Inact, 35G Wired, 160K Cache, 88M Buf, 88G Free
ARC: 32G Total, 3594M MRU, 27G MFU, 32M Anon, 581M Header, 562M Other
Swap: 233G Total, 233G Free
mc# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
disk-1  16.2T  6.57T  9.68T    40%  1.33x  ONLINE  -
disk-2  3.62T  3.63G  3.62T     0%  1.00x  ONLINE  -
All of the data is going onto disk-1, which had under 10GB allocated
when I started the job.
Here is System #3, running the same job but with only 25% as many cores
as System #1:
[root at rotfl ~]# top
top - 08:19:13 up 3 days, 16:13,  7 users,  load average: 94.61, 94.57, 100.94
Tasks: 710 total,  10 running, 700 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.3%us,  4.4%sy, 82.2%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  65951592k total, 39561920k used, 26389672k free,   154372k buffers
Swap: 134217720k total,        0k used, 134217720k free,   377996k cached
[root at rotfl ~]# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
disk-1  16.2T  6.72T  9.53T    41%  1.00x  ONLINE  -
disk-2  1.81T  3.24G  1.81T     0%  1.00x  ONLINE  -
Like System #1, the data is going to disk-1, which also had less than
10GB allocated when the job started.
I am working on getting many TB of data off one of the remaining two
stable/9 systems for more experimentation, but the system stalls, which
makes the process a bit cumbersome. I strongly suspect a contributing
factor is the system cron scripts that run at night.
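Those nightly jobs are the periodic(8) scripts. For testing, the
disk-walking ones can be switched off in /etc/periodic.conf; the knobs
below are the usual suspects, not a claim that any particular one is
the culprit:

# /etc/periodic.conf
daily_status_security_enable="NO"      # nightly security scans walk the filesystems
weekly_locate_enable="NO"              # locate database rebuild walks everything
weekly_whatis_enable="NO"              # whatis/man database rebuild, more disk churn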
Finally, as I have also previously mentioned, I am NOT the only one
having this problem. One individual stated that he updated his BIOS,
controller firmware, and disk firmware, but that didn't help.
I am happy to work with folks knowledgeable about the relevant FreeBSD
components, but so far only one has stepped forward.