Hi Joe and all,

I cannot in any way dispute the data provided, but on my machine the difference is more than notable. It is a Pavilion g6 Notebook PC with an i5-2430M CPU @ 2.40 GHz and 4 GB of RAM, running Windows 7 SP1 Home.
This is the diff of the two files:

    C:\JTSDK\src\wsjtx-1.4>diff CMakeLists.txt CMakeListsMY.txt
    429c429
    < set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -std=c++11 -fexceptions -frtti")
    ---
    > set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -std=c++11 -fexceptions -frtti -mtune=native")
    476c476
    < set (General_FFLAGS "-fbounds-check -Wall -Wno-conversion -fno-second-underscore")
    ---
    > set (General_FFLAGS "-fbounds-check -Wall -Wno-conversion -fno-second-underscore -mtune=native")

The program runs in JT65+JT9 mode. The version is v1.4.0-rc3 r4783[-dirty].

I removed the files in the build and install directories to be sure that everything would be rebuilt. I increased the number of files to analyze and stopped services and the anti-virus, but the difference continues to be high. Next week I will try on other machines or other operating systems. I am starting to think that it is not only the decoders that influence the results.

Please note that I am referring to the time needed to complete several loops. With Shift+F6 on a directory, the program loops over:

- read file
- display graph
- decode jt65
- show results
- decode jt9
- show results
- write ALL.TXT and other files

I put -mtune=native in the CMake file, so all the Fortran code, and also the C++ code, is generated differently. It may be that I have to measure the execution time between the decodings to understand what happens.

Many thanks for the detailed information. In the next few days I'll investigate the strange time difference.

73, Merry Christmas and Happy New Year to you, your family and all the group.

Sandro
IW3RAB

On 16/12/2014 21:16, Joe Taylor wrote:
> Hi Alessandro and all,
>
> Compiler optimizations can be helpful when tuning code for good
> performance, but I am surprised that you see anything like a 2x
> improvement in decoding speed.
>
> The following table shows the results of a series of tests I made today
> for the decoder (the executable program jt9) running on files like the
> ones you created (01.wav, 02.wav, ... 10.wav, all copies of the example
> file 130610_2343.wav).
> The first column lists the Fortran compiler
> flags used; the numerical column gives the total execution time (wall
> clock) for processing the ten files.
>
> FFLAGS                                      Time
> ------------------------------------------------
> -O0 -fbounds-check                          42.4
> -O1 -fbounds-check                          22.8
> -Os -fbounds-check                          22.8
> -O2 -fbounds-check -funroll-all-loops       20.4 *
> -O2 -fbounds-check                          20.2
> -O3 -fbounds-check                          19.8
> -Ofast -fbounds-check                       18.9
> -O2                                         18.4
> -O2 -mtune=native                           18.4
> -O2 -funroll-all-loops                      18.2
> -O3                                         18.0
> -Ofast                                      17.8
> ------------------------------------------------
> * Used in the release builds of WSJT-X
>
> As you can see, "-mtune=native" made essentially no difference. The
> biggest improvement in execution performance (over the default Release
> build) is gained by turning off bounds-checking. A slight additional
> improvement is obtained by using -O3 or -Ofast rather than -O2.
> However, the total available improvement is less than 15%.
>
> Obviously, such tests will give different results on different machines.
> Those described above were done on a machine with a Core2 Duo E6750
> CPU, 2.66 GHz. Here is a similar set of results for a Windows machine
> (Core i5-2500, 3.3 GHz):
>
> FFLAGS                                      Time
> ------------------------------------------------
> -O0 -fbounds-check                          28.5
> -O1 -fbounds-check                          18.2
> -O2 -fbounds-check -funroll-all-loops       16.6 *
> -O2 -fbounds-check                          16.2
> -O3 -fbounds-check                          16.2
> -Ofast                                      15.7
> -O3 -m32 -msse -funroll-all-loops           15.4
> -O3 -mtune=core2                            15.1
> -O3 -m32 -msse                              15.0
> -O3 -mtune=native                           15.0
> ------------------------------------------------
> * Used in the release builds of WSJT-X for Windows
>
> The flags we're currently using for Windows Release builds give results
> within about 10% of the best one listed.
>
> One way to look at all of this is that the most important optimizations
> are those that have already been done, by the programmer.
> These include
> making the best possible choices of data structures, algorithms, loop
> ordering, etc., etc.
>
> -- 73, Joe, K1JT
>
> _______________________________________________
> wsjt-devel mailing list
> wsjt-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/wsjt-devel
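
For reference, the percentages quoted in Joe's message can be checked directly against his two tables. The following is only a minimal sketch in Python: the helper name improvement_percent and the variable names are made up for illustration, and the only data used are the times copied from the tables (release-build flags versus the fastest entry on each machine).

```python
def improvement_percent(baseline, best):
    """Return how much shorter `best` is than `baseline`, as a percentage."""
    return 100.0 * (baseline - best) / baseline


# Times copied from the two tables in Joe's message.
# "release" is the row marked with *, "best" is the fastest row listed.
core2_release, core2_best = 20.4, 17.8   # Core2 Duo E6750 table
i5_release, i5_best = 16.6, 15.0         # Core i5-2500 (Windows) table

core2_gain = improvement_percent(core2_release, core2_best)
i5_gain = improvement_percent(i5_release, i5_best)

# Effect of -mtune=native alone in the first table: -O2 vs -O2 -mtune=native.
mtune_gain = improvement_percent(18.4, 18.4)

print(f"Core2 Duo E6750: {core2_gain:.1f}%")       # prints 12.7%
print(f"Core i5-2500:    {i5_gain:.1f}%")          # prints 9.6%
print(f"-mtune=native:   {mtune_gain:.1f}%")       # prints 0.0%
```

Both figures agree with the statements in the message ("less than 15%" and "within about 10%"), and neither is close to a 2x change, which suggests the larger difference seen with Shift+F6 may come from somewhere other than the decoder flags.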