Re: measuring swap partition speed
Date: Thu, 21 Dec 2023 18:36:10 UTC
void <void_at_f-m.fm> wrote on Thu, 21 Dec 2023 15:50:52 UTC:

> On Wed, Dec 20, 2023 at 07:48:14PM -0800, Mark Millard wrote:
>
> ># swapoff /dev/label/growfs_swap
> ># dd if=/dev/urandom of=/dev/da0s2b bs=8k count=250000 conv=sync status=progress
> >^C478830592 bytes (479 MB, 457 MiB) transferred 22.001s, 22 MB/s
> >60557+0 records in
> >60556+0 records out
> >496074752 bytes transferred in 22.790754 secs (21766491 bytes/sec)
>
> 22MB/s is usable, I think. In my context, I'd be satisfied with that.
> My context differs from yours slightly in that yours is SSD and mine
> is spinning rust.

I do not have access to spinning rust to test for comparison. Others
likely do.

> This is unusable:
> # dd if=/dev/urandom of=/dev/da0p4 bs=8k count=250000 conv=sync status=progress
> ^C11862016 bytes (12 MB, 11 MiB) transferred 40.063s, 296 kB/s

My point is that the performance seems to be strongly tied to the
media type's contribution. There is no general problem with
partition-based swap performance. (But the type of test was set up to
match yours, not to be realistic for paging activity.)

The paging access pattern likely ends up doing lots of seek activity,
accumulating a lot of latency. It is also likely a mix of read and
write activity. Small reads/writes to fairly random places tend to
perform worse than sequential access does. Large sequential writes as
the only activity are not a good match for paging.

Perhaps someone with a fio background can fully specify how to run a
noticeably more realistic benchmark for making swap performance
judgments, perhaps monitored via gstat during its operation.
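As a rough sketch only (I have not tuned or validated this as a paging
simulation: the device path is just the one from your dd test, the job
name is arbitrary, and the 60/40 read/write mix is a guess rather than
a measured paging profile), a mixed random-I/O fio run against the raw
partition, with that swap device off, might look like:

# swapoff /dev/da0p4
# fio --name=swaplike --filename=/dev/da0p4 \
      --ioengine=psync --rw=randrw --rwmixread=60 \
      --bs=4k --runtime=60 --time_based

fio is available as a FreeBSD package. Like the dd test, this
overwrites the partition's contents, so it is only safe on a partition
not currently in use for swap (or anything else). Watching
"gstat -spod" in another terminal while it runs would show the per-I/O
latencies involved.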
> because it's way too slow. Swap never gets fully reclaimed,
> thrashing happens, loads of other followon effects happen.
>
> The same partition formatted as ufs reports 113 MB/s. Multiple swap
> partitions have been tested, then converted to ufs. Results are the
> same.

That figure benefits from the file system's caching and from
larger-than-8K writes. It is also only writes, not a realistic mix of
reads and writes or the more random distribution of where the reads
and writes would be from/to. As I understand it, conv=sync does not
prevent the caching effect for sequential activity.

I suggest using:

# gstat -spod

to get an idea what the actual I/Os are like in each of whatever
relevant contexts are of interest (actual operation while the swap
performance issue shows, and benchmarking). So far as I can tell, only
you can provide such information, as the issue is not readily
reproducible by others. From a broader view, actual-operation examples
of "gstat -spod" output might be of more general interest for your
type of context.

> There are no reported errors in smartctl. Long smartctl tests run monthly.
>
>   5 Reallocated_Sector_Ct   PO--CK   100   100   050    -    0
>   9 Power_On_Hours          -O--CK   001   001   000    -    48992
> 196 Reallocated_Event_Count -O--CK   100   100   000    -    0
> 197 Current_Pending_Sector  -O--CK   100   100   000    -    0
> 198 Offline_Uncorrectable   ----CK   100   100   000    -    0
>
> I can't find any hardware problem here. Possible workarounds, bearing
> in mind I'm not versant in C so it's not like I can fix this myself
> in code:
>
> 1. swap as swapfile and not partition [a]

(1) is subject to "trivial and unavoidable deadlocks". After suffering
such, I always avoid this form.

> 2. swap as nfs [b]

I've never used nfs for this, but it likely has the same issue as (1).

> 3. swapoff & swapon script running every minute [c]

If this always works for bringing everything into RAM, it seems to be
an approximation of not having swap in the first place and would be
subject to (4).

> 4. just turn all swap off and reboot after crashing (undesirable)

(I tend to have active SWAP partition(s) totaling about 3.8*RAM
because of doing a form of high-load-average "poudriere bulk" runs.)

[I have multiple SWAP partitions because of using the same media in
various machines that have widely different amounts of RAM. I form a
total active swap space that is appropriate to the RAM present for the
boot. Other than that I'd use just one partition.]

> 5. use another OS that doesn't have this problem

You omit the alternative of using media for the swap/paging space that
avoids the problem. There are such around. Is there a blocking issue
for going the direction of also having separate swap media that has
helpful characteristics?

I will note that the RPi4B shares its USB3 bandwidth across its two
USB3 ports: they are not independent channels. Sustained I/O that
competes for the bandwidth can be a bottleneck in itself. A similar
point applies at the media level when the swap space I/O and other I/O
go to the same media. (For spinning rust, that includes more time
spent seeking: additional latency.)

> [a] not tried yet, and i hope it works. Legacy info suggests swap as
> partition is usually faster than filesystem-based swap. But the
> reverse might be the case here.
>
> [b] also not tried. This, I imagine, would be filesystem only (I'm
> unsure a zfs volume can be exported to look like a mountable
> partition to the client)
>
> [c] https://github.com/Freaky/swapflush.git - usually works but maybe
> i need to run it every minute instead of every five mins. For
> testing, this script was disabled.
>
> Any additional suggestions on how to overcome this problem gratefully
> received.

===
Mark Millard
marklmi at yahoo.com