A mental note to take: if anyone is going to provide a binary package for Windows, it would make more sense to build it with ICC instead of MSVC.
Martin Oliver Smith wrote:
> On 7/22/2010 4:10 AM, Martin Sustrik wrote:
>>> This is a somewhat weak example because the work being done by the
>>> worker is so trivial, but even so on a virtual quad-core machine
>>> building with -O0 I see a 35-40% reduction in processing time.
>>>
>> Worker being trivial, the large reduction in processing time is even more
>> impressive.
>>
> Just to follow up on that, I thought I'd post the findings of my
> benchmark comparisons of GCC vs the Intel C Compiler, they're kinda
> impressive:
>
> Virtual Ubuntu 10.04 guest machine running under VMware 7.0 on an i7
> host under Windows 7, 2 virtual CPUs with 2 cores each:
>
> Async-Worker tests with GCC v4.4.3 with -O3 -msse -msse2 -msse3 -mssse3
> -msse4 -msse4.1 -msse4.2 -mfpmath=sse -mtune=core2 -march=core2:
> (NOTE: I used Acovea to find these optimal settings, I wouldn't
> ordinarily use -mtune/-march because I always find they make things worse :)
>
>   ~3580ms for serial RunAndReturn, ~3580ms for serial RunAndReturnLocal,
>   ~930ms for parallel RunAndReturn, ~940ms for parallel RunAndReturnLocal
>
> Async-Worker tests with Intel C++ compiler 11.1 72 with -O3 -xHOST -ipo:
>
>   ~2590ms for serial RunAndReturn, ~2580ms for serial
>   RunAndReturnLocal, (27% gain)
>   ~700ms for parallel RunAndReturn, ~700ms for parallel
>   RunAndReturnLocal (25% gain)
>
> Building ZeroMQ with "icpc -O3 -ipo -xHOST" instead of GCC shaved an
> extra 4-10ms off parallel results.
>
> Building both Async::Worker examples and ZeroMQ with "icpc -O3 -ipo
> -xHOST -fbuiltin" reduces benchmark times by up to 50ms.
>
> Async-Worker tests with Intel C++ compiler 11.1 72 with -O3 -xHOST -ipo
> -fbuiltin and ZeroMQ compiled with same flags:
>
>   ~2510ms for serial RunAndReturn, ~2510ms for serial
>   RunAndReturnLocal, (30% gain)
>   ~640ms for parallel RunAndReturn, ~650ms for parallel
>   RunAndReturnLocal (32% gain)
>
> Given the trivial workloads, these are fairly impressive benchmarks.
>
> The Intel C++ compiler is dual-licensed, you can download the Linux
> version free
>
> http://software.intel.com/en-us/intel-compilers/
>
> Compared to the Microsoft Visual C++ compiler (2008) we found between
> 15-50% performance improvements. The 2010 VSCC is significantly
> improved, but Intel's compiler still produces 10-30% improvements.
>
> You may be aware there was some controversy over the Intel compiler
> generating code that didn't work as well on AMD chips: This only
> occurred when you built "alternate code paths" for SSE instructions etc,
> and the (9.x) version of the compiler would tend not to use the
> alternate code paths unless you had an Intel CPU.
>
> That option is now called "Build Intel specific optimizations", and the
> alternate code paths now apply fairly to any CPU that claims to have
> the feature set you are targeting.
>
> - Oliver
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
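
For anyone who wants to recheck the percentage gains quoted above, here is a minimal sketch of the arithmetic. It assumes "gain" means the relative reduction in run time against the GCC build, and it uses only the RunAndReturn timings copied from the post; the dictionary and function names are made up for illustration.

```python
# Timings in milliseconds, copied from the quoted RunAndReturn results.
# Assumption: "gain" = reduction in run time relative to the GCC build.
GCC = {"serial": 3580, "parallel": 930}
ICC = {"serial": 2590, "parallel": 700}            # icpc -O3 -xHOST -ipo
ICC_BUILTIN = {"serial": 2510, "parallel": 640}    # same flags plus -fbuiltin


def gain(baseline_ms, new_ms):
    """Percentage reduction in run time relative to the baseline."""
    return 100.0 * (baseline_ms - new_ms) / baseline_ms


def gains(baseline, candidate):
    """Per-test gain, rounded to one decimal place."""
    return {name: round(gain(baseline[name], candidate[name]), 1)
            for name in baseline}


if __name__ == "__main__":
    print("icpc -O3 -xHOST -ipo          :", gains(GCC, ICC))
    print("icpc -O3 -xHOST -ipo -fbuiltin:", gains(GCC, ICC_BUILTIN))
```

Computed this way the figures land within about a percentage point of the rounded gains given in the post.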
