git: 15183f36e5a0 - stable/13 - amd64: Stop using REP MOVSB for backward memmove()s.
Date: Thu, 30 Jun 2022 01:15:56 UTC
The branch stable/13 has been updated by mav:

URL: https://cgit.FreeBSD.org/src/commit/?id=15183f36e5a0b2ab1febdb7e88d20c0a3d3ab86e

commit 15183f36e5a0b2ab1febdb7e88d20c0a3d3ab86e
Author:     Alexander Motin <mav@FreeBSD.org>
AuthorDate: 2022-06-16 17:01:12 +0000
Commit:     Alexander Motin <mav@FreeBSD.org>
CommitDate: 2022-06-30 01:15:49 +0000

    amd64: Stop using REP MOVSB for backward memmove()s.

    Enhanced REP MOVSB feature of CPUs starting from Ivy Bridge makes
    REP MOVSB the fastest way to copy memory in most of cases. However
    Intel Optimization Reference Manual says: "setting the DF to force
    REP MOVSB to copy bytes from high towards low addresses will
    experience significant performance degradation". Measurements on
    Intel Cascade Lake and Alder Lake, same as on AMD Zen3 show that it
    can drop throughput to as low as 2.5-3.5GB/s, comparing to ~10-30GB/s
    of REP MOVSQ or hand-rolled loop, used for non-ERMS CPUs.

    This patch keeps ERMS use for forward ordered memory copies, but
    removes it for backward overlapped moves where it does not work.

    Reviewed by:    mjg
    MFC after:      2 weeks

    (cherry picked from commit 6210ac95a19416832601b571409a3e08b76d107f)
---
 sys/amd64/amd64/support.S | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/sys/amd64/amd64/support.S b/sys/amd64/amd64/support.S
index 15f72a425cf1..09e73800bddd 100644
--- a/sys/amd64/amd64/support.S
+++ b/sys/amd64/amd64/support.S
@@ -507,13 +507,6 @@ END(memcmp)
 	ALIGN_TEXT
 2256:
 	std
-.if \erms == 1
-	leaq	-1(%rdi,%rcx),%rdi
-	leaq	-1(%rsi,%rcx),%rsi
-	rep
-	movsb
-	cld
-.else
 	leaq	-8(%rdi,%rcx),%rdi
 	leaq	-8(%rsi,%rcx),%rsi
 	shrq	$3,%rcx
@@ -523,7 +516,6 @@ END(memcmp)
 	movq	%rdx,%rcx
 	andb	$7,%cl
 	jne	2004b
-.endif
 	\end
 	ret
 .endif
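
For readers who do not read amd64 assembly, the copy strategy that remains after this patch can be roughly modelled in C. This is only an illustrative sketch, not the kernel code: the name sketch_memmove is hypothetical, the forward path is a plain byte loop standing in for whatever the forward path really uses (REP MOVSB on ERMS CPUs, per the commit message), and the backward path assumes the retained assembly copies 8-byte words from the high end downward and then finishes the remaining len & 7 bytes one at a time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of an overlap-safe copy; not a FreeBSD function.
 * Forward when a low-to-high copy cannot clobber unread source bytes,
 * otherwise backward in 8-byte words followed by a byte tail.
 */
static void *
sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t n;

	/*
	 * Unsigned difference: wraps to a huge value when dst < src, so
	 * this is true whenever dst lies outside [src, src + len).
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* Forward copy, low to high addresses. */
		for (n = 0; n < len; n++)
			d[n] = s[n];
		return (dst);
	}

	/* dst overlaps the source from above: copy high to low. */
	n = len;
	while (n >= 8) {
		uint64_t w;

		n -= 8;
		/* Load the whole word before storing it. */
		memcpy(&w, s + n, sizeof(w));
		memcpy(d + n, &w, sizeof(w));
	}
	/* Remaining len & 7 bytes at the low end, still high to low. */
	while (n > 0) {
		n--;
		d[n] = s[n];
	}
	return (dst);
}
```

Calling sketch_memmove(buf + 3, buf, 17) on a 20-byte buffer takes the backward path (two 8-byte words, then one byte), while sketch_memmove(buf, buf + 3, 17) takes the forward path. The point of the sketch is only the direction choice and the word-sized backward steps; it says nothing about the throughput figures quoted in the commit message.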