svn commit: r278474 - head/sys/sys
John Baldwin
jhb at freebsd.org
Tue Feb 10 15:58:36 UTC 2015
On Wednesday, February 11, 2015 02:37:05 AM Bruce Evans wrote:
> On Mon, 9 Feb 2015, Jung-uk Kim wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA256
> >
> > On 02/09/2015 16:08, John Baldwin wrote:
> >> On Monday, February 09, 2015 09:03:24 PM John Baldwin wrote:
> >>> Author: jhb Date: Mon Feb 9 21:03:23 2015 New Revision: 278474
> >>> URL: https://svnweb.freebsd.org/changeset/base/278474
> >>>
> >>> Log: Use __builtin_popcnt() to implement a BIT_COUNT() operation
> >>> for bitsets and use this to implement CPU_COUNT() to count the
> >>> number of CPUs in a cpuset.
> >>>
> >>> MFC after: 2 weeks
> >>
> >> Yes, __builtin_popcnt() works with GCC 4.2. It should also allow
> >> the compiler to DTRT in userland uses of this if -msse4.2 is
> >> enabled.
> >
> > Back in 2012, when I submitted a similar patch, bde noted
> > __builtin_popcount*() cannot be used with GCC 4.2 for *kernel* because
> > it emits a library call.
>
> (*) Since generic amd64 and i386 have no popcount instruction in hardware,
> using builtin popcount rarely uses the hardware instruction (it takes
> special -march to get it, and the resulting binaries don't run on generic
> CPUs). Thus using the builtin works worse than using the old inline
> function in most cases. Except, the old inline function is only
> implemented in the kernel, and isn't implemented for 64-bit integers.
>
> gcc-4.8 generates the hardware popcount if the arch supports it. Only
> its library popcounts are slower than clang's. gcc-4.2 presumably
> doesn't generate the hardware popcount, since it doesn't have a -march
> for newer CPUs that have it.
I don't really expect CPU_COUNT() to be used in places where performance is of
the utmost importance. (For example, in igb I use it in attach to enumerate
the set of CPUs to bind queues to, but nowhere else.) Unless someone has a
better suggestion, I can implement a bitcount64 by using bitcount32 on both
halves, and we can then use the bitcount routines instead of
__builtin_popcountl() in BIT_COUNT() for GCC if we care that strongly about
it. Alternatively, I'm happy to implement the libcall for GCC 4.2 in the
kernel so that __builtin_popcountl() works.
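
For illustration, here is a rough sketch of the "bitcount32 on both halves"
idea. The names sketch_bitcount32() and sketch_bitcount64() are hypothetical
stand-ins, and the 32-bit helper below is a generic SWAR popcount that is only
assumed to behave like the kernel's existing routine; this is not the code
that would be committed.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the existing 32-bit helper: a generic
 * SWAR population count, not the kernel's actual implementation.
 */
static inline uint32_t
sketch_bitcount32(uint32_t x)
{
	/* Sum adjacent bits into 2-bit fields. */
	x = x - ((x >> 1) & 0x55555555);
	/* Sum 2-bit fields into 4-bit fields. */
	x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
	/* Sum 4-bit fields into per-byte counts. */
	x = (x + (x >> 4)) & 0x0f0f0f0f;
	/* Add the four byte counts; the total lands in the top byte. */
	return ((x * 0x01010101) >> 24);
}

/*
 * Hypothetical 64-bit variant: count each 32-bit half separately
 * and add the results.
 */
static inline uint32_t
sketch_bitcount64(uint64_t x)
{
	return (sketch_bitcount32((uint32_t)(x >> 32)) +
	    sketch_bitcount32((uint32_t)x));
}
```

A BIT_COUNT() built on something like this would not depend on compiler
support for the popcount builtin, at the cost of never using the hardware
instruction even when it is available.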
--
John Baldwin