git: 558e8b6adb89 - stable/13 - amd64: Stop using REP MOVSB for backward memmove()s.

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Thu, 30 Jun 2022 01:14:01 UTC

The branch stable/13 has been updated by mav:

URL: https://cgit.FreeBSD.org/src/commit/?id=558e8b6adb89faeb4c31faf14e88859d8c14c61b

commit 558e8b6adb89faeb4c31faf14e88859d8c14c61b
Author:     Alexander Motin <mav@FreeBSD.org>
AuthorDate: 2022-06-16 18:51:50 +0000
Commit:     Alexander Motin <mav@FreeBSD.org>
CommitDate: 2022-06-30 01:13:51 +0000

    amd64: Stop using REP MOVSB for backward memmove()s.
    
    The Enhanced REP MOVSB (ERMS) feature of CPUs starting from Ivy Bridge
    makes REP MOVSB the fastest way to copy memory in most cases. However,
    the Intel Optimization Reference Manual says: "setting the DF to force
    REP MOVSB to copy bytes from high towards low addresses will
    experience significant performance degradation". Measurements on Intel
    Cascade Lake and Alder Lake, as well as on AMD Zen3, show that it can
    drop throughput to as low as 2.5-3.5GB/s, compared to ~10-30GB/s
    with REP MOVSQ or a hand-rolled loop, as used for non-ERMS CPUs.
    
    This patch keeps ERMS use for forward-ordered memory copies, but
    removes it for backward overlapped moves, where it performs poorly.
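
For readers unfamiliar with why memmove() has a backward path at all, the
following is a minimal C sketch of the direction choice, not the FreeBSD
implementation. The name sketch_memmove and the byte-wise loops are
assumptions made for illustration; the real routine is the assembly in
memmove.S. The idea is that a forward copy is safe unless the destination
starts inside the source range, in which case the copy has to run from
high addresses towards low ones.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not FreeBSD code): overlap-safe byte copy that
 * picks the copy direction the way a memmove() has to.
 */
static void *
sketch_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	/*
	 * Unsigned distance from src to dst.  If dst is below src the
	 * subtraction wraps to a huge value, so a single comparison
	 * covers both "dst below src" and "dst past the end of src".
	 */
	if ((uintptr_t)d - (uintptr_t)s >= len) {
		/* Forward copy: low to high addresses. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
	} else {
		/* Backward copy: dst overlaps the tail of src. */
		for (i = len; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return (dst);
}
```

Only the second branch corresponds to the "backward overlapped moves"
discussed here; the first branch is the case where ERMS remains in use.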
    
    This is just a cosmetic sync with the kernel, since libc does not use
    ERMS at this time.
    
    Reviewed by:    mjg
    MFC after:      2 weeks
    
    (cherry picked from commit f22068d91bf53696ee13a69685e809d35776ec3f)
---
 lib/libc/amd64/string/memmove.S | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/lib/libc/amd64/string/memmove.S b/lib/libc/amd64/string/memmove.S
index 3d75ff45c798..ea92cb18782a 100644
--- a/lib/libc/amd64/string/memmove.S
+++ b/lib/libc/amd64/string/memmove.S
@@ -274,13 +274,6 @@ __FBSDID("$FreeBSD$");
 	ALIGN_TEXT
 2256:
 	std
-.if \erms == 1
-	leaq	-1(%rdi,%rcx),%rdi
-	leaq	-1(%rsi,%rcx),%rsi
-	rep
-	movsb
-	cld
-.else
 	leaq	-8(%rdi,%rcx),%rdi
 	leaq	-8(%rsi,%rcx),%rsi
 	shrq	$3,%rcx
@@ -290,7 +283,6 @@ __FBSDID("$FreeBSD$");
 	movq	%rdx,%rcx
 	andb	$7,%cl
 	jne	2004b
-.endif
 	\end
 	ret
 .endif
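
The path that remains after this change points %rdi and %rsi at the last
8 bytes of each buffer (the leaq -8(...) lines) and divides the count by 8
(shrq $3), then handles the len & 7 leftover bytes. The lines between the
two hunks are not shown in the diff, so the following is only a C-level
sketch that assumes a qword-at-a-time copy from high to low addresses
followed by a byte tail. The function name copy_backward_qwords is
hypothetical and does not appear in the source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch (not FreeBSD code): backward copy in 8-byte
 * chunks, then the remaining len & 7 bytes at the low end.
 * Intended for the overlapping case where dst is above src.
 */
static void
copy_backward_qwords(unsigned char *dst, const unsigned char *src, size_t len)
{
	size_t off = len;
	uint64_t w;

	/* Copy whole qwords, starting with the last 8 bytes. */
	while (off >= 8) {
		off -= 8;
		/* Load the full chunk before storing, so overlap is safe. */
		memcpy(&w, src + off, sizeof(w));
		memcpy(dst + off, &w, sizeof(w));
	}

	/* Tail: the low len & 7 bytes, still from high to low. */
	while (off > 0) {
		off--;
		dst[off] = src[off];
	}
}
```

Each chunk is read in full before it is written, and every later read
comes from lower addresses than any earlier write, which is why the
overlap stays safe even when the buffers are less than 8 bytes apart.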