On Thu, May 3, 2018 at 7:37 PM, Pradeep Ramachandran <prad...@multicorewareinc.com> wrote: > > On Thu, May 3, 2018 at 2:23 PM, <prav...@multicorewareinc.com> wrote: >> >> # HG changeset patch >> # User Praveen Tiwari <prav...@multicorewareinc.com> >> # Date 1525328839 -19800 >> # Thu May 03 11:57:19 2018 +0530 >> # Branch stable >> # Node ID 9cbb2aadcca3a2f7a308ea1dc792fb817bcc5b51 >> # Parent 69aafa6d70ad4e151f4590766c6b125621c5d007 >> threadpool.cpp: use WIN system call for popcount > > > Unless this fixes a known bug, I don't want to push this directly into > stable. Syscalls are notorious especially when working with older versions > of the OS. > I would rather push this into default and allow users to test that this > works with all kinds of systems and then merge with stable once the answer > is known. > Does this fix a specific issue on some platform, or improve performance?
The comment is not quite right, __popcnt is not a syscall but an MSVC-specific intrinsic. https://msdn.microsoft.com/en-us/library/bb385231.aspx The equivalent gcc intrinsic is __builtin_popcount and friends. I think, the patch is buggy because the relevant field is a 64-bit integer on 64-bit Windows and __popcnt is 32-bit. Note also that the popcount instruction only available in ABM ISA extension. In Intel CPUs it is available since Nehalem. >> diff -r 69aafa6d70ad -r 9cbb2aadcca3 source/common/threadpool.cpp >> --- a/source/common/threadpool.cpp Wed May 02 15:15:05 2018 +0530 >> +++ b/source/common/threadpool.cpp Thu May 03 11:57:19 2018 +0530 >> @@ -71,21 +71,6 @@ >> # define strcasecmp _stricmp >> #endif >> >> -#if defined(_WIN32_WINNT) && _WIN32_WINNT >= _WIN32_WINNT_WIN7 >> -const uint64_t m1 = 0x5555555555555555; //binary: 0101... >> -const uint64_t m2 = 0x3333333333333333; //binary: 00110011.. >> -const uint64_t m3 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones ... >> -const uint64_t h01 = 0x0101010101010101; //the sum of 256 to the power of >> 0,1,2,3... >> - >> -static int popCount(uint64_t x) >> -{ >> - x -= (x >> 1) & m1; >> - x = (x & m2) + ((x >> 2) & m2); >> - x = (x + (x >> 4)) & m3; >> - return (x * h01) >> 56; >> -} >> -#endif >> - >> namespace X265_NS { >> // x265 private namespace >> >> @@ -274,7 +259,7 @@ >> for (int i = 0; i < numNumaNodes; i++) >> { >> GetNumaNodeProcessorMaskEx((UCHAR)i, groupAffinityPointer); >> - cpusPerNode[i] = popCount(groupAffinityPointer->Mask); >> + cpusPerNode[i] = __popcnt(static_cast<unsigned >> int>(groupAffinityPointer->Mask)); >> } >> delete groupAffinityPointer; >> #elif HAVE_LIBNUMA >> @@ -623,7 +608,7 @@ >> for (int i = 0; i < numNumaNodes; i++) >> { >> GetNumaNodeProcessorMaskEx((UCHAR)i, &groupAffinity); >> - cpus += popCount(groupAffinity.Mask); >> + cpus += __popcnt(static_cast<unsigned int>(groupAffinity.Mask)); >> } >> return cpus; >> #elif _WIN32 >> _______________________________________________ >> x265-devel mailing list >> x265-devel@videolan.org >> https://mailman.videolan.org/listinfo/x265-devel > > > > _______________________________________________ > x265-devel mailing list > x265-devel@videolan.org > https://mailman.videolan.org/listinfo/x265-devel > _______________________________________________ x265-devel mailing list x265-devel@videolan.org https://mailman.videolan.org/listinfo/x265-devel