Joe -

Just now, I committed a revised sfrsd2.c that contains a composite gnnf/hf erasure probability matrix.
I took advantage of the surprisingly complementary distributions of the gnnf and hf symbols when mapped onto the p1-rank, p2/p1 sorting scheme. The result is that the single matrix provides virtually the same results on s3_1000.bin (gnnf) and my hf data set as do the separate gnnf and hf erasure matrices, respectively.

A summary of my results with this latest sfrsd2.c:

1. rsdtest, s3_1000.bin, ntrials=10000: 810 decodes (0 bad) (see notes 1 and 2)
2. wsjt-x, my simulated -24dB gnnf files, ntrials=10000: 637 decodes (0 bad) (see note 2)
3. wsjt-x, hf files, ntrials=2000: 2929 decodes (see note 3)
4. wsjt-x, hf files, ntrials=2000, ‘hf’ erasure matrix: 2942 decodes (see note 3)
5. wsjt-x, hf files, BM only: 2350 decodes

Notes:

1. The number of decodes in case 1 can be increased to more than 850 with no errors by loosening the acceptance criteria - but then the probability of bad decode is unacceptably large for hf files.
2. The number of decodes is the same with the optimized ‘gnnf’ erasure matrix and the composite matrix.
3. The number of decodes with the composite matrix is within 0.4% of the optimized matrix.

If you are interested in the details - here is the occurrence matrix generated by rsdtest for the s3_1000.bin gnnf data and using the sf symbol metrics:

     1     0     0     0     0     0     0     0
   270     0     0     0     0     0     0     0
  1750   113    11     0     0     0     0     0
  2510  1543   491   118    26     5     0     0
  1322  2199  1760   934   410   168    52     4
   650  1706  2121  2175  1904  1182   720   180
   228   839  1577  2129  2142  2286  2572  1618
    93   424   864  1468  2342  3183  3480  4169

And here is the corresponding matrix for my set of HF files:

  8063 15748 21464 23875 24544 24759 24838 21758
  2406   964   324    86    34    16     3     0
  2254   943   318    79    35    12     2     0
  2672  1283   568   197    51    15     5     0
  2193   784   220    46    22     4     1     0
  2578  1431   526   142    40    12     9     1
  2372  1447   441    92    28    15     6     0
  2334  2272  1011   355   118    39     8     4

In these matrices, the row index corresponds to p2/p1 ratio and column index corresponds to rank; the last column corresponds to the lowest-ranked symbols.
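
For anyone who wants to poke at these numbers, here is a minimal Python sketch of the row/column convention and of a first-row splice between the two matrices. It is not taken from sfrsd2.c or rsdtest: the names GNNF, HF, row_share and splice_first_row are hypothetical, and it operates on the raw occurrence counts quoted in this message, which is an assumption (the splice could equally be applied to erasure probabilities derived from them).

```python
# Occurrence counts copied from this message.
# Row index: p2/p1 ratio bin. Column index: rank bin
# (last column = lowest-ranked symbols).
GNNF = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [270, 0, 0, 0, 0, 0, 0, 0],
    [1750, 113, 11, 0, 0, 0, 0, 0],
    [2510, 1543, 491, 118, 26, 5, 0, 0],
    [1322, 2199, 1760, 934, 410, 168, 52, 4],
    [650, 1706, 2121, 2175, 1904, 1182, 720, 180],
    [228, 839, 1577, 2129, 2142, 2286, 2572, 1618],
    [93, 424, 864, 1468, 2342, 3183, 3480, 4169],
]

HF = [
    [8063, 15748, 21464, 23875, 24544, 24759, 24838, 21758],
    [2406, 964, 324, 86, 34, 16, 3, 0],
    [2254, 943, 318, 79, 35, 12, 2, 0],
    [2672, 1283, 568, 197, 51, 15, 5, 0],
    [2193, 784, 220, 46, 22, 4, 1, 0],
    [2578, 1431, 526, 142, 40, 12, 9, 1],
    [2372, 1447, 441, 92, 28, 15, 6, 0],
    [2334, 2272, 1011, 355, 118, 39, 8, 4],
]


def row_share(matrix, row):
    """Fraction of all counts in `matrix` that fall in the given row."""
    total = sum(sum(r) for r in matrix)
    return sum(matrix[row]) / total


def splice_first_row(base, donor):
    """Return a new matrix: row 0 from `donor`, remaining rows from `base`.

    Hypothetical helper; the inputs are left unmodified.
    """
    return [list(donor[0])] + [list(r) for r in base[1:]]


# Illustration only: gnnf counts with the hf first row spliced in.
composite = splice_first_row(GNNF, HF)

if __name__ == "__main__":
    print(f"gnnf first-row share: {row_share(GNNF, 0):.5f}")
    print(f"hf first-row share:   {row_share(HF, 0):.5f}")
    for row in composite:
        print(" ".join(f"{v:6d}" for v in row))
```

Running it prints the share of counts that land in the first (low p2/p1) row of each matrix, followed by the spliced matrix in the same layout as the tables in this message.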

Note that the first row of the gnnf matrix is nearly empty, whereas the majority of the hf matrix entries are concentrated in the first row (low p2/p1 ratio). So, I created the composite matrix by replacing the first row of the gnnf matrix with the first row of the hf matrix.

I don’t think that this is the last word on the erasure probabilities. The lack of overlap between the two distributions means that we haven’t sampled the full range of fading and snr conditions with our training data. The hf distribution was obtained using all symbols, whereas we should probably be using only symbols from vectors that required soft-symbol decoding. We'll need more data to fill in the gaps. Or perhaps it would be sufficient to just combine matrices generated using simulated data at -23, -22, -21 dB snr…

Suggestions welcome!

Steve k9an

------------------------------------------------------------------------------
_______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel