Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini()
Peter Eriksson
pen at lysator.liu.se
Tue Dec 3 08:34:21 UTC 2019
Sure. Here’s one from just before I rebooted last evening (at the 7 hour uptime mark):
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
UMA Kegs: 384, 0, 247, 3, 247, 0, 0
UMA Zones: 4736, 0, 264, 0, 264, 0, 0
UMA Slabs: 80, 0, 2894986, 14, 2898224, 0, 0
UMA Hash: 256, 0, 48, 72, 108, 0, 0
4 Bucket: 32, 0, 3497, 9753, 338875, 0, 0
6 Bucket: 48, 0, 1700, 10501, 89938, 0, 0
8 Bucket: 64, 0, 611, 10053, 118928, 15, 0
12 Bucket: 96, 0, 3293, 6096, 131089, 0, 0
16 Bucket: 128, 0, 834, 5304, 55984, 0, 0
32 Bucket: 256, 0, 2747, 5668, 373187, 139, 0
64 Bucket: 512, 0, 1884, 4436, 483281, 139, 0
128 Bucket: 1024, 0, 4828, 2996, 613296,16078, 0
256 Bucket: 2048, 0, 34106, 1058, 1566552, 164, 0
vmem btag: 56, 0, 904562, 3741, 905062,6399, 0
VM OBJECT: 240, 0, 89303, 5129,24512558, 0, 0
RADIX NODE: 144, 0, 634446, 15444,60638431, 0, 0
MAP: 240, 0, 3, 61, 3, 0, 0
KMAP ENTRY: 128, 0, 12, 639, 12, 0, 0
MAP ENTRY: 128, 0, 105906, 71972,273723932, 0, 0
VMSPACE: 2520, 0, 120, 294, 1064743, 0, 0
fakepg: 104, 0, 7, 3755, 3136, 0, 0
mt_zone: 16400, 0, 420, 0, 420, 0, 0
16: 16, 0, 1810762, 5474,247600975, 0, 0
32: 32, 0, 430077, 113173,777148991, 0, 0
64: 64, 0,24475835, 6313799,350608033, 0, 0
128: 128, 0, 3914907, 76529,394374364, 0, 0
256: 256, 0, 508512, 1081773,201807283, 0, 0
512: 512, 0, 74387, 185061,131179473, 0, 0
1024: 1024, 0, 127067, 8929,51960099, 0, 0
2048: 2048, 0, 48730, 1396, 5337386, 0, 0
4096: 4096, 0, 276060, 827, 6748826, 0, 0
8192: 8192, 0, 54, 68, 1059804, 0, 0
16384: 16384, 0, 38, 65, 23218, 0, 0
32768: 32768, 0, 139, 129, 21968, 0, 0
65536: 65536, 0, 115, 49, 119907, 0, 0
64 pcpu: 8, 0, 4020, 4172, 4118, 0, 0
SLEEPQUEUE: 80, 0, 2131, 2767, 2131, 0, 0
Files: 80, 0, 454, 4796,28333466, 0, 0
filedesc0: 1104, 0, 152, 433, 1064769, 0, 0
TURNSTILE: 136, 0, 2131, 1289, 2131, 0, 0
rl_entry: 40, 0, 561, 7539, 561, 0, 0
umtx pi: 96, 0, 0, 0, 0, 0, 0
umtx_shm: 88, 0, 31217, 3928, 34273, 0, 0
MAC labels: 40, 0, 0, 0, 0, 0, 0
PROC: 1424, 0, 152, 434, 1064768, 0, 0
THREAD: 1552, 0, 1704, 426, 9448, 0, 0
cpuset: 96, 0, 1208, 3712, 1529, 0, 0
audit_record: 1248, 0, 0, 4929,44799582, 0, 0
mbuf_packet: 256, 104630070, 24552, 12386, 4231181, 0, 0
mbuf: 256, 104630070, 3, 12394, 8548105, 0, 0
mbuf_cluster: 2048, 16348446, 36938, 138, 60263, 0, 0
mbuf_jumbo_page: 4096, 8174223, 0, 141, 542447, 0, 0
mbuf_jumbo_9k: 9216, 2421992, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 1362370, 0, 0, 0, 0, 0
DMAR_MAP_ENTRY: 120, 0, 0, 0, 0, 0, 0
ttyinq: 160, 0, 360, 765, 990, 0, 0
ttyoutq: 256, 0, 188, 667, 517, 0, 0
FPU_save_area: 832, 0, 0, 0, 0, 0, 0
g_bio: 376, 0, 0, 2720,11874693, 0, 0
nvme_request: 128, 0, 4, 4398, 26030, 0, 0
vtnet_tx_hdr: 24, 0, 0, 0, 0, 0, 0
cryptop: 88, 0, 0, 4005, 647540, 0, 0
cryptodesc: 72, 0, 0, 4015, 647540, 0, 0
taskq_zone: 48, 0, 0, 35192, 2413940, 0, 0
VNODE: 472, 0, 142242, 1286, 601710, 0, 0
VNODEPOLL: 120, 0, 2, 625, 9, 0, 0
BUF TRIE: 144, 0, 0, 105948, 0, 0, 0
S VFS Cache: 108, 0, 89310, 2950, 1595636, 0, 0
STS VFS Cache: 148, 0, 0, 0, 0, 0, 0
L VFS Cache: 328, 0, 67, 1037, 1791, 0, 0
LTS VFS Cache: 368, 0, 0, 0, 0, 0, 0
NAMEI: 1024, 0, 0, 704,154021593, 0, 0
rentr: 24, 0, 0, 8183, 2858, 0, 0
NCLNODE: 576, 0, 0, 0, 0, 0, 0
DIRHASH: 1024, 0, 0, 0, 0, 0, 0
pipe: 760, 0, 32, 568, 297208, 0, 0
procdesc: 136, 0, 1, 1710, 897, 0, 0
AIO: 208, 0, 0, 0, 0, 0, 0
AIOP: 32, 0, 0, 0, 0, 0, 0
AIOCB: 752, 0, 0, 0, 0, 0, 0
AIOLIO: 280, 0, 0, 0, 0, 0, 0
range_seg_cache: 72, 0, 1117410, 81040,23301057, 0, 0
metaslab_alloc_trace_cache: 72, 0, 0, 10065, 4408239, 0, 0
zio_cache: 1048, 0, 73, 115556,241609900, 0, 0
zio_link_cache: 48, 0, 0, 116200,64873334, 0, 0
zio_buf_512: 512, 0, 555291, 1877,54193375, 0, 0
zio_data_buf_512: 512, 0, 1805, 1331, 1491667, 0, 0
zio_buf_1024: 1024, 0, 1159, 24081, 224958, 0, 0
zio_data_buf_1024: 1024, 0, 968, 632, 29721, 0, 0
zio_buf_1536: 1536, 0, 342, 24328, 210638, 0, 0
zio_data_buf_1536: 1536, 0, 1129, 401, 49580, 0, 0
zio_buf_2048: 2048, 0, 48731, 531, 630275, 0, 0
zio_data_buf_2048: 2048, 0, 2176, 160, 21869, 0, 0
zio_buf_2560: 2560, 0, 64, 119, 179116, 0, 0
zio_data_buf_2560: 2560, 0, 2181, 132, 20371, 0, 0
zio_buf_3072: 3072, 0, 46, 110, 177331, 0, 0
zio_data_buf_3072: 3072, 0, 1720, 116, 14832, 0, 0
zio_buf_3584: 3584, 0, 40, 50, 174997, 0, 0
zio_data_buf_3584: 3584, 0, 1262, 94, 15376, 0, 0
zio_buf_4096: 4096, 0, 527, 9091, 8627750, 0, 0
zio_data_buf_4096: 4096, 0, 1139, 376, 236336, 0, 0
zio_buf_5120: 5120, 0, 752, 145, 345173, 0, 0
zio_data_buf_5120: 5120, 0, 1584, 136, 28763, 0, 0
zio_buf_6144: 6144, 0, 5941, 3678, 346446, 0, 0
zio_data_buf_6144: 6144, 0, 1059, 61, 24013, 0, 0
zio_buf_7168: 7168, 0, 167, 177, 325035, 0, 0
zio_data_buf_7168: 7168, 0, 880, 56, 21846, 0, 0
zio_buf_8192: 8192, 0, 1110, 1539, 871622, 0, 0
zio_data_buf_8192: 8192, 0, 698, 81, 24196, 0, 0
zio_buf_10240: 10240, 0, 131387, 66570, 521772, 0, 0
zio_data_buf_10240: 10240, 0, 1177, 39, 47154, 0, 0
zio_buf_12288: 12288, 0, 23008, 22661, 630894, 0, 0
zio_data_buf_12288: 12288, 0, 1499, 252, 39611, 0, 0
zio_buf_14336: 14336, 0, 4, 108, 5683, 0, 0
zio_data_buf_14336: 14336, 0, 884, 532, 32480, 0, 0
zio_buf_16384: 16384, 0, 321809, 122568, 1774887, 0, 0
zio_data_buf_16384: 16384, 0, 826, 67, 32998, 0, 0
zio_buf_20480: 20480, 0, 24, 131, 109109, 0, 0
zio_data_buf_20480: 20480, 0, 1414, 97, 49762, 0, 0
zio_buf_24576: 24576, 0, 5, 201, 208036, 0, 0
zio_data_buf_24576: 24576, 0, 788, 94, 46730, 0, 0
zio_buf_28672: 28672, 0, 0, 517, 260314, 0, 0
zio_data_buf_28672: 28672, 0, 563, 50, 44097, 0, 0
zio_buf_32768: 32768, 0, 2, 121, 71174, 0, 0
zio_data_buf_32768: 32768, 0, 485, 59, 44268, 0, 0
zio_buf_40960: 40960, 0, 5, 189, 120351, 0, 0
zio_data_buf_40960: 40960, 0, 846, 148, 88458, 0, 0
zio_buf_49152: 49152, 0, 2, 496, 114293, 0, 0
zio_data_buf_49152: 49152, 0, 1252, 91, 36323, 0, 0
zio_buf_57344: 57344, 0, 0, 206, 93761, 0, 0
zio_data_buf_57344: 57344, 0, 617, 298, 21824, 0, 0
zio_buf_65536: 65536, 0, 1, 213, 50490, 0, 0
zio_data_buf_65536: 65536, 0, 504, 96, 20772, 0, 0
zio_buf_81920: 81920, 0, 2, 124, 76214, 0, 0
zio_data_buf_81920: 81920, 0, 772, 227, 38083, 0, 0
zio_buf_98304: 98304, 0, 1, 139, 77399, 0, 0
zio_data_buf_98304: 98304, 0, 568, 79, 33028, 0, 0
zio_buf_114688: 114688, 0, 2, 122, 69859, 0, 0
zio_data_buf_114688: 114688, 0, 467, 70, 34427, 0, 0
zio_buf_131072: 131072, 0, 126330, 2110, 803167, 0, 0
zio_data_buf_131072: 131072, 0, 38681, 195, 542909, 0, 0
zio_buf_163840: 163840, 0, 0, 119, 68167, 0, 0
zio_data_buf_163840: 163840, 0, 0, 0, 0, 0, 0
zio_buf_196608: 196608, 0, 0, 116, 26793, 0, 0
zio_data_buf_196608: 196608, 0, 0, 0, 0, 0, 0
zio_buf_229376: 229376, 0, 0, 44, 8159, 0, 0
zio_data_buf_229376: 229376, 0, 0, 0, 0, 0, 0
zio_buf_262144: 262144, 0, 0, 43, 5843, 0, 0
zio_data_buf_262144: 262144, 0, 0, 0, 0, 0, 0
zio_buf_327680: 327680, 0, 0, 44, 7784, 0, 0
zio_data_buf_327680: 327680, 0, 0, 0, 0, 0, 0
zio_buf_393216: 393216, 0, 0, 42, 4732, 0, 0
zio_data_buf_393216: 393216, 0, 0, 0, 0, 0, 0
zio_buf_458752: 458752, 0, 0, 42, 3733, 0, 0
zio_data_buf_458752: 458752, 0, 0, 0, 0, 0, 0
zio_buf_524288: 524288, 0, 0, 42, 1996, 0, 0
zio_data_buf_524288: 524288, 0, 0, 0, 0, 0, 0
zio_buf_655360: 655360, 0, 0, 109, 2377, 0, 0
zio_data_buf_655360: 655360, 0, 0, 0, 0, 0, 0
zio_buf_786432: 786432, 0, 0, 41, 959, 0, 0
zio_data_buf_786432: 786432, 0, 0, 0, 0, 0, 0
zio_buf_917504: 917504, 0, 0, 38, 478, 0, 0
zio_data_buf_917504: 917504, 0, 0, 0, 0, 0, 0
zio_buf_1048576: 1048576, 0, 0, 39, 556, 0, 0
zio_data_buf_1048576: 1048576, 0, 0, 0, 0, 0, 0
zio_buf_1310720: 1310720, 0, 0, 3, 4, 0, 0
zio_data_buf_1310720: 1310720, 0, 0, 0, 0, 0, 0
zio_buf_1572864: 1572864, 0, 0, 0, 0, 0, 0
zio_data_buf_1572864: 1572864, 0, 0, 0, 0, 0, 0
zio_buf_1835008: 1835008, 0, 0, 0, 0, 0, 0
zio_data_buf_1835008: 1835008, 0, 0, 0, 0, 0, 0
zio_buf_2097152: 2097152, 0, 0, 0, 0, 0, 0
zio_data_buf_2097152: 2097152, 0, 0, 0, 0, 0, 0
zio_buf_2621440: 2621440, 0, 0, 0, 0, 0, 0
zio_data_buf_2621440: 2621440, 0, 0, 0, 0, 0, 0
zio_buf_3145728: 3145728, 0, 0, 0, 0, 0, 0
zio_data_buf_3145728: 3145728, 0, 0, 0, 0, 0, 0
zio_buf_3670016: 3670016, 0, 0, 0, 0, 0, 0
zio_data_buf_3670016: 3670016, 0, 0, 0, 0, 0, 0
zio_buf_4194304: 4194304, 0, 0, 0, 0, 0, 0
zio_data_buf_4194304: 4194304, 0, 0, 0, 0, 0, 0
zio_buf_5242880: 5242880, 0, 0, 0, 0, 0, 0
zio_data_buf_5242880: 5242880, 0, 0, 0, 0, 0, 0
zio_buf_6291456: 6291456, 0, 0, 0, 0, 0, 0
zio_data_buf_6291456: 6291456, 0, 0, 0, 0, 0, 0
zio_buf_7340032: 7340032, 0, 0, 0, 0, 0, 0
zio_data_buf_7340032: 7340032, 0, 0, 0, 0, 0, 0
zio_buf_8388608: 8388608, 0, 0, 0, 0, 0, 0
zio_data_buf_8388608: 8388608, 0, 0, 0, 0, 0, 0
zio_buf_10485760: 10485760, 0, 0, 0, 0, 0, 0
zio_data_buf_10485760: 10485760, 0, 0, 0, 0, 0, 0
zio_buf_12582912: 12582912, 0, 0, 0, 0, 0, 0
zio_data_buf_12582912: 12582912, 0, 0, 0, 0, 0, 0
zio_buf_14680064: 14680064, 0, 0, 0, 0, 0, 0
zio_data_buf_14680064: 14680064, 0, 0, 0, 0, 0, 0
zio_buf_16777216: 16777216, 0, 0, 0, 0, 0, 0
zio_data_buf_16777216: 16777216, 0, 0, 0, 0, 0, 0
lz4_ctx: 16384, 0, 0, 167, 2371452, 0, 0
abd_chunk: 4096, 0, 1498299, 193360,66794913, 0, 0
sa_cache: 144, 0, 117910, 1970, 610708, 0, 0
dnode_t: 728, 0, 1118459, 1106, 2539158, 0, 0
arc_buf_hdr_t_full: 224, 0, 985562, 1662,19179064, 0, 0
arc_buf_hdr_t_l2only: 88, 0, 0, 4005, 25678, 0, 0
arc_buf_t: 64, 0, 729765, 134763,19547924, 0, 0
dmu_buf_impl_t: 240, 0, 1236172, 97444,16369819, 0, 0
zil_lwb_cache: 320, 0, 3, 345, 54, 0, 0
zil_zcw_cache: 80, 0, 0, 1650, 51, 0, 0
sio_cache: 128, 0, 0, 0, 0, 0, 0
zfs_znode_cache: 264, 0, 117910, 1235, 553283, 0, 0
Mountpoints: 816, 0, 24268, 202, 24268, 0, 0
ksiginfo: 112, 0, 570, 6570, 61686, 0, 0
itimer: 352, 0, 1, 175, 9, 0, 0
KNOTE: 128, 0, 45, 3458, 66190, 0, 0
socket: 872, 8384332, 130, 562, 923394, 0, 0
ipq: 56, 51262, 0, 0, 0, 0, 0
udp_inpcb: 464, 8384336, 25, 1199, 274109, 0, 0
udpcb: 32, 8384375, 25, 8100, 274109, 0, 0
tcp_inpcb: 464, 8384336, 36, 892, 36547, 0, 0
tcpcb: 1048, 8384334, 36, 270, 36547, 0, 0
tcptw: 88, 27810, 0, 2880, 897, 0, 0
syncache: 160, 15375, 0, 1900, 34233, 0, 0
hostcache: 128, 15376, 21, 1126, 23, 0, 0
sackhole: 32, 0, 0, 0, 0, 0, 0
tcpreass: 48, 1021813, 0, 1328, 9, 0, 0
sctp_ep: 1248, 8384334, 0, 0, 0, 0, 0
sctp_asoc: 2408, 40000, 0, 0, 0, 0, 0
sctp_laddr: 48, 80012, 0, 1577, 6, 0, 0
sctp_raddr: 736, 80000, 0, 0, 0, 0, 0
sctp_chunk: 152, 400010, 0, 0, 0, 0, 0
sctp_readq: 152, 400010, 0, 0, 0, 0, 0
sctp_stream_msg_out: 112, 400015, 0, 0, 0, 0, 0
sctp_asconf: 40, 400000, 0, 0, 0, 0, 0
sctp_asconf_ack: 48, 400060, 0, 0, 0, 0, 0
udplite_inpcb: 464, 8384336, 0, 0, 0, 0, 0
ripcb: 464, 8384336, 2, 902, 1792, 0, 0
unpcb: 240, 8384336, 65, 2447, 610936, 0, 0
IPsec SA lft_c: 16, 0, 0, 0, 0, 0, 0
rtentry: 208, 0, 18, 723, 20, 0, 0
selfd: 64, 0, 677, 8995,47464291, 0, 0
swpctrie: 144, 32696973, 0, 0, 0, 0, 0
swblk: 136, 32696949, 0, 0, 0, 0, 0
pf mtags: 48, 0, 0, 2822, 41, 0, 0
pf states: 296, 100009, 200, 1113, 109141, 0, 0
pf state keys: 88, 0, 200, 5065, 109141, 0, 0
pf source nodes: 136, 50025, 0, 0, 0, 0, 0
pf table entries: 216, 200016, 18, 54, 18, 0, 0
pf frags: 112, 0, 0, 0, 0, 0, 0
pf frag entries: 40, 50000, 0, 0, 0, 0, 0
pf state scrubs: 40, 0, 0, 0, 0, 0, 0
If I compare that one from a production server that has been running for 56 days and have about 500 active users (mostly SMB but some NFS), then the zio_* values that differ much are:
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
zio_buf_512: 512, 0, 23957617, 20604807, 28379999753, 0, 0
zio_buf_2048: 2048, 0, 48613, 10045, 152698166, 0, 0
zio_buf_5120: 5120, 0, 147618, 145002, 114926222, 0, 0
zio_buf_6144: 6144, 0, 213681, 296569, 96756130, 0, 0
zio_buf_10240: 10240, 0, 15485, 28955, 26953062, 0, 0
zio_buf_12288: 12288, 0, 8795, 29201, 383299702, 0, 0. (zio_buf_cache[20])
zio_buf_16384: 16384, 0, 1300372, 391818, 1519180982, 0, 0. (zio_buf_cache[28])
The test server for comparison:
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
zio_buf_512: 512, 0, 555291, 1877, 54193375, 0, 0
zio_buf_5120: 5120, 0, 752, 145, 345173, 0, 0
zio_buf_6144: 6144, 0, 5941, 3678, 346446, 0, 0
zio_buf_10240: 10240, 0, 131387, 66570, 521772, 0, 0
zio_buf_12288: 12288, 0, 23008, 22661, 630894, 0, 0
zio_buf_16384: 16384, 0, 321809, 122568, 1774887, 0, 0
If the test server at 7 hours uptime takes 10 minutes to clean up the zio_buffers, then I can see why it sometimes will seem to take forever on a production server and why we thought it had freezed up on us (we usually give up and send it a hard reset after a while).
hmm.. I wonder if perhaps one could add a progress counter to uma_destroy() somewhere. *Reads the code some more* :-)
- Peter
> On 2 Dec 2019, at 23:54, Mark Johnston <markj at freebsd.org> wrote:
>
> On Mon, Dec 02, 2019 at 11:39:01PM +0100, Peter Eriksson wrote:
>> Sigh.
>>
>> Slight correction, the output below should have said uma_zdestroy() and not uma_zfree_arg() (wrong printf text, but the right times).
>>
>> After an uptime of 7 hours, a reboot have these times (I removed the “uma” printf in this run):
>>
>> kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds
>> kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds
>> kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds
>> kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds
>> kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds
>> kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds
>> kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds
>> kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds
>>
>> This is on a mostly idle server (well, apart from compiling the kernel code :-) and some snapshots being taken of all filesystems (once per hour).
>>
>>
>> So now on to finding out why uma_destroy() is taking so long… :-).
>
> uma_destroy() frees all of the memory cached in the zone back to the
> page allocator. This operation takes time proportional to the number of
> cached items. I would expect most of the time to be spent in
> zone_reclaim(), called by zone_dtor().
>
> Before rebooting, could you try grabbing the output of vmstat -z?
More information about the freebsd-fs
mailing list