Re: CFT: snmalloc as libc malloc
- Reply: Konstantin Belousov : "Re: CFT: snmalloc as libc malloc"
- In reply to: Konstantin Belousov : "Re: CFT: snmalloc as libc malloc"
Date: Mon, 13 Feb 2023 12:51:30 UTC
On 09/02/2023 23:09, Konstantin Belousov wrote:
> On Thu, Feb 09, 2023 at 09:53:34PM +0100, Mateusz Guzik wrote:
>> So, as someone who worked on memcpy previously, I note the variant
>> currently implemented in libc is pessimal for sizes > 16 bytes because
>> it does not use SIMD. I do have plans to rectify that, but ENOTIME.
>
> Note that you need two kinds of micro-benchmarks for this:
> - normal microbenchmark which does the SIMD-enabled memcpy() in a loop
> - a microbenchmark which ensures that the SIMD register file ownership
>   is re-taken on each iteration (or close to it).
>
> I am sure that the results from #2 would be astonishing and give quite
> different prospective on the use of SIMD for basic libc services.

Does FreeBSD still do lazy context switching of SIMD state? I was under
the impression that this was disabled by all operating systems now
because it exposes speculative side channels across a process boundary.

Given that the x86-64 and AArch64 ABIs both pass floating point
arguments in SIMD registers by default, I'd be surprised if it gave a
performance win - unless a workload manages to avoid passing any
floating-point arguments in a quantum, it will hit the trap every time.

In addition, unless you explicitly disable it, recent versions of clang
will use SIMD registers for inlined memcpy (irrespective of what libc
does) and will also now spill GPRs to SIMD registers in preference to
the stack in some situations.

David