Ryan,

Thanks for your suggestions. The 0.6% change you observed seems to be down in 
the noise, i.e. comparable to the run-to-run variation you would expect to see 
even without any code changes. I tried your patch here on macOS and the results 
showed that it doesn't provide any significant change in execution time: on 3 
consecutive runs I saw original->patched execution-time changes of -1.0%, 
+0.1%, and +2.4%, for an average increase of 0.5%.

BTW, I noticed that your test script calls the wspr decoder with the default 
timeout (10000 decoder cycles per bit). This default is not recommended: it 
produces an unacceptably high false-decode rate while providing little, if 
any, improvement in good decodes compared to more reasonable values. You will 
see much better results if you call the decoder with the following 
command-line options: “-C 500 -o 4 -d”.

For example, using your test.sh script as-is I saw 63 good decodes out of 100 
attempts and the total execution time was around 40 seconds on my Mac laptop. 
Using the command-line options “-C 500 -o 4 -d” I saw 87 good decodes and the 
execution time was about 9 seconds, less than 1/4 of the time it took with the 
default settings. It’s also worth noting that with my suggested 
settings, the decoding time is dominated by time spent in the Ordered 
Statistics Decoder (OSD). The time spent in the Fano decoder is almost 
negligible (0.63 seconds).

Steve k9an

On May 3, 2024, at 11:07 AM, Tolboom, Ryan via wsjt-devel 
<wsjt-devel@lists.sourceforge.net> wrote:

Good Afternoon,

Here is a patch that ekes out a tiny performance boost for the Fano decoder in 
lib/wsprd/fano.c. It does two things:

  1. It removes an ENCODE call when initializing the root node: since encstate 
starts at zero, lsym is guaranteed to come out zero.
  2. Since the ENCODE macro operates on 32 bits (POLY1 and POLY2 are 32 bits 
and the XOR-based parity calculations are 32 bits), encstate in the node 
struct and _tmp in the ENCODE macro are changed from unsigned long to 32-bit 
unsigned int. This makes for slightly faster bitwise operations.

With these changes I'm getting the same number of decodes and a 0.6% decrease 
in the time spent in the Fano decoder:

=== Original ===
69 decodes
   0.00    0.00    0.04    0.06    0.90   29.38    0.00   31.47

Code segment        Seconds   Frac
-----------------------------------
readwavfile           0.00    0.00
Coarse DT f0 f1       0.00    0.00
sync_and_demod(0)     0.04    0.00
sync_and_demod(1)     0.06    0.00
sync_and_demod(2)     0.90    0.03
Stack/Fano decoder   29.38    0.93
OSD        decoder    0.00    0.00
-----------------------------------
Total                31.47    1.00

=== New ===
69 decodes
   0.00    0.00    0.04    0.06    0.90   29.17    0.00   31.31

Code segment        Seconds   Frac
-----------------------------------
readwavfile           0.00    0.00
Coarse DT f0 f1       0.00    0.00
sync_and_demod(0)     0.04    0.00
sync_and_demod(1)     0.06    0.00
sync_and_demod(2)     0.90    0.03
Stack/Fano decoder   29.17    0.93
OSD        decoder    0.00    0.00
-----------------------------------
Total                31.31    1.00

Since the effect depends on which instructions are emitted for the different 
word sizes, you might see more dramatic results on other architectures.

I've also attached the scripts I used to test it out.

On a related note, does anyone still have this dataset?

http://physics.princeton.edu/pulsar/K1JT/wspr_data.tgz

It would be nice to have some non-simulated data.

73,

Ryan N2BP
<test.sh><generate.sh><fano.patch>

_______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel
