FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Allan Jude
allanjude at freebsd.org
Sun Jun 11 03:39:54 UTC 2017
On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
>
>> Gents,
>>
>> I'm experiencing an issue where iterating over a PostgreSQL table of ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635 seconds on Intel 540 SSDs. This is using a FreeBSD 10 amd64 stable kernel back from Jan 2017. SSDs are basically 2 drives in a ZFS mirrored zpool. I'm using PostgreSQL 9.5.7.
>>
>> I've tried:
>>
>> * Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
>>
>> * Tested on half a dozen machines with different models of SSDs:
>>
>> o Intel 510s (120GB) in ZFS mirrored pair
>>
>> o Intel 520s (120GB) in ZFS mirrored pair
>>
>> o Intel 540s (120GB) in ZFS mirrored pair
>>
>> o Samsung 850 Pros (256GB) in ZFS mirrored pair
>>
>> * Using bonnie++ to remove Postgres from the equation and performance does indeed drop.
>>
>> * Rebooting server and immediately re-running test and performance is back to original.
>>
>> * Tried using Karl Denninger's patch from PR187594 (which took some work to find a kernel that the FreeBSD10 patch would both apply and compile cleanly against).
>>
>> * Tried disabling ZFS lz4 compression.
>>
>> * Ran the same test on a FreeBSD9.0 amd64 system using PostgreSQL 9.1.3 with 2 Intel 520s in ZFS mirrored pair. System had 165 days uptime and test took ~80 seconds after which I rebooted and re-ran test and was still at ~80 seconds (older processor and memory in this system).
>>
>> I realize that there's a whole lot of info I'm not including (dmesg, zfs-stats -a, gstat, et cetera): I'm hoping some enlightened individual will be able to point me to a solution with only the above to go on.
>
> Just a random guess: can you try r307264 (I mean the regression in
> r307266)?
This sounds a bit like an issue I investigated for a customer a few
months ago.
Look at gstat -d (it includes DELETE operations such as TRIM).
If you see a lot of those happening, try setting vfs.zfs.trim.enabled=0
in /boot/loader.conf and see if your issues go away.
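
A minimal sketch of that change, for illustration only. The tunable name
and the file come from the advice above; the helper name disable_zfs_trim
is hypothetical, and the function assumes the file, if it exists, ends
with a newline. It appends the line only when no vfs.zfs.trim.enabled
entry is present, so running it twice is harmless.

```shell
#!/bin/sh
# Sketch only: disable_zfs_trim is a hypothetical helper, not a FreeBSD tool.
# It adds vfs.zfs.trim.enabled=0 to a loader.conf-style file unless an entry
# for that tunable already exists.

disable_zfs_trim() {
    conf="${1:-/boot/loader.conf}"
    # Append only if no line already sets this tunable.
    if ! grep -q '^vfs\.zfs\.trim\.enabled=' "$conf" 2>/dev/null; then
        echo 'vfs.zfs.trim.enabled=0' >> "$conf"
    fi
}

# Typical sequence on the affected host (run by hand, as root):
#   gstat -d                              # watch the DELETE (TRIM) columns
#   disable_zfs_trim /boot/loader.conf    # then reboot so the loader applies it
```

Loader tunables are read at boot, so the setting takes effect only after a
reboot. If the slowdown disappears with TRIM off, that points at TRIM as
the cause rather than at PostgreSQL or the benchmark.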
The FreeBSD TRIM code for ZFS basically waits until the sector has been
free for a while (to avoid doing a TRIM on a block we'll immediately
reuse), so your benchmark will run fine for a little while, then
suddenly the TRIM will kick in.
For PostgreSQL, fio, bonnie++, etc., make sure the ZFS dataset you are
storing the data on / benchmarking has a recordsize that matches the
workload.
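
As a rough sketch of checking and changing recordsize: the dataset name
tank/pgdata and the helper name match_recordsize are made up for this
example, and 8K is an assumption based on PostgreSQL's default 8 kB block
size, not a value given above. The ZFS variable exists only so the
commands can be previewed with ZFS=echo before running them for real.

```shell
#!/bin/sh
# Sketch only: match_recordsize and tank/pgdata are hypothetical names.
# ZFS defaults to the real zfs(8) command; set ZFS=echo for a dry run.
ZFS="${ZFS:-zfs}"

match_recordsize() {
    dataset="$1"
    size="${2:-8K}"    # assumed default: PostgreSQL's 8 kB block size
    # Show the current value, then set the new one.
    $ZFS get -H -o value recordsize "$dataset"
    $ZFS set "recordsize=$size" "$dataset"
}

# Dry run:   ZFS=echo; match_recordsize tank/pgdata 8K
# For real:  match_recordsize tank/pgdata 8K     (as root)
```

A changed recordsize applies only to blocks written afterwards, so
existing table files keep their old record size until they are rewritten
(for example by a dump and restore). Re-run the benchmark against freshly
written data before drawing conclusions.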
If you are doing a write-only benchmark and you see lots of reads in
gstat, you know you are having to do read/modify/write cycles, and that is
why your performance is so bad.
--
Allan Jude