On 07/06/2020 02:23, Philip Gladstone wrote:
There are a (small) number of WSJT-X users who have difficulty
reporting their spots to pskreporter. Some of these are in "difficult"
areas of network connectivity (e.g. Marine Mobile) and I suspect that
the UDP transport is losing most of their packets. The general loss
rate seems to be around 1%-2% which is somewhat higher than I would
expect, but it is not unbelievable either.
It is also difficult to diagnose this sort of problem as the packets
appear to leave the PC running WSJT-X and not arrive at my server!
PSKReporter was never supposed to be 100% reliable, but there seem to
be a lot of people who think otherwise....
In an effort to improve the situation, I have now stood up a TCP
listener that might help. The protocol is identical -- the only
difference is that you send the same messages as before over a TCP
connection to report.pskreporter.info
port 4739 rather than over a UDP connection. There is no extra framing
required as the messages already contain a length code.
The listening server should be able to support enough connections. It
will close a connection if an invalid message is received.
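
To illustrate why no extra framing is needed on a TCP stream, here is a minimal sketch of how a receiver could split the byte stream back into messages. It assumes the standard IPFIX message header layout (a 16-bit version followed by a 16-bit total message length, then export time, sequence number and observation domain ID, 16 bytes in all); the function name split_messages is hypothetical and this is not the actual PSK Reporter server code.

```python
import struct

# Assumed IPFIX header: version(2) + length(2) + export time(4)
# + sequence number(4) + observation domain ID(4) = 16 bytes.
IPFIX_HEADER_LEN = 16


def split_messages(buffer: bytes):
    """Split a received byte stream into whole messages.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= IPFIX_HEADER_LEN:
        # The length field counts the whole message, header included.
        _version, length = struct.unpack_from("!HH", buffer, offset)
        if length < IPFIX_HEADER_LEN:
            # A real server would probably close the connection here.
            raise ValueError("invalid message length")
        if len(buffer) - offset < length:
            break  # incomplete message, wait for more data
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages, buffer[offset:]
```

On the sending side nothing changes except the transport: a client would open one TCP connection to report.pskreporter.info port 4739 and write each message with something like socket.sendall() instead of sending it as a datagram.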
Is this change something that could be implemented? Also, currently,
you send a bunch of packets at the same time (on the five minute
expiry). You could send them as soon as they get "full" rather than
waiting.
Thanks
Philip
Hi Philip,
I have a test version of WSJT-X that can use TCP/IP to send spots to
PSKReporter. It seems to work OK on the test server; I have not tried it
on the main server yet. I have some thoughts and questions on your
issues and suggestions above.
* WSJT-X builds UDP datagrams roughly 1400 bytes long. This could be
the root problem of some users not getting traffic through to
PSKReporter, due to packet fragmentation. IPv4 lets routers fragment
UDP datagrams when they need to, but the receiver discards the whole
datagram if any fragment is lost, which increases the likelihood of
lost datagrams somewhat. Routers may also just drop UDP datagrams
over a certain size, although I have no idea if that really happens.
The best recommendation for a UDP datagram payload size that
guarantees deliverability (note *not* guaranteed delivery, that's
never the case with UDP) is probably 508 bytes. One
of the changes I've implemented is to finally follow your
recommendation to only send template descriptors in the first three
datagrams of a session and once per hour thereafter. I have also
implemented not sending the receiver data set unless there is a
change to the information, again sent every hour even if there is no
change to the data. These changes would mean that a datagram payload
size of 508 is reasonably practical. It would mean sending more
datagrams, probably after shorter intervals, say at least one per
minute, but overall less traffic volume.
* Although I understand your suggestion to send datagrams when they
are full, that is not easy to implement without causing server load
issues, because datagrams would tend to be sent during the decoding
phase of WSJT-X. On busy HF bands with FT8 and FT4 that would tend
to generate large spikes of traffic at 15 s and 7.5 s intervals from
the top of each minute, respectively. I think using a variation on
the current mechanism, with all available spots being sent in one or
more datagrams on a fixed timer interval, would best randomize the
traffic flow, the timer origin being based on application start time
alone. Perhaps if the timer interval were 1
minute rather than 5 minutes the flow of spots would be smoothed
somewhat. This is also in line with the UDP datagram size suggestion
above.
* Perhaps a dual interval approach could be used to ensure more filled
datagrams, say WSJT-X checks every minute (or even 30 s or less) and
sends as many filled datagrams as it has spots for, then once every
5 minutes it flushes any queued spots including a final partially
filled datagram. That would smooth the flow for high volumes and
still spot every 5 minutes for the spotter monitoring a quiet band.
Other intervals are of course possible - suggestions?
* Using TCP/IP has some merits, but I am not sure there will be any
real gain. It may be better to try smaller UDP datagrams before
TCP/IP. You already have metrics for UDP to detect levels of dropped
datagrams so you could easily assess if smaller datagrams would
solve the missing data issue. If shorter datagrams do solve the root
problem, then TCP/IP only gains client knowledge of server outages
or network connectivity issues, and guaranteed in-order delivery
when a connection is working, the latter being of no real value
here.
* Also in this thread we discussed fallback strategies if TCP/IP
connections failed. Thinking this through, I don't see any benefit.
WSJT-X is certainly not going to store spots for any extended time
to forward later if the server is not available, and I doubt the UDP
service will be available if the TCP/IP one is down so reverting to
UDP has little value IMHO (although we could do it easily). The
complexity of a failed TCP/IP connection is the process needed to
re-establish the connection, something that is not required with
UDP. I think the best strategy for WSJT-X would be to drop spots on
the floor if a TCP/IP connection were used and the connection
failed. Of course one benefit would be that WSJT-X could inform the
user that spots cannot be delivered, an option that is not available
with UDP.
* Another question with using TCP/IP is how long WSJT-X should keep a
connection to the server open. We could have a connection that
lives as long as the client program session. Alternatively we could
choose to open a connection for each pass of sending records to PSK
Reporter. The latter does not allow much flexibility for transient
outages, e.g. we might use a 30 s timeout for sending TCP/IP data
but that makes little sense if we are going to close the connection
before that time expires. OTOH a long running connection might add
some unwanted server load to maintain its end of potentially several
thousand connections concurrently. A small benefit of long running
connections is that we could set the SO_KEEPALIVE TCP/IP option
which would let us know if the server has gone away even when we are
not sending spots because of a silent band. Keep-alive probes are
normally sent only after two hours of idle time, so there's not going
to be any instant feedback about a server that has gone AWOL.
* I assume there would be no need to send any repeated template
descriptors or receiver data with a TCP/IP connection, other than
perhaps for some sort of server availability handshake.
* We have discussed before whether WSJT-X should send spots where the
grid square is unknown. This would be a considerable traffic
increase, although WSJT-X might mitigate a bit by de-duplicating at
some level. I have not really thought through de-duplicating spots
much, but I suppose we could keep a list of calls spotted in the
last N minutes (N yet to be defined) and only spot a call if it is
not on the list. That raises secondary questions about whether the
spotted frequency should qualify the list entry, maybe mode too.
That then raises the question "what frequency resolution?", say for
example only band changes in the last N minutes allow re-spotting
... ??? In
summary, what are your feelings on spots with an empty grid square?
Another possibility is to only spot non-standard calls with no grid
square (non-standard in terms of the FT8/FT4/MSK144 protocol, or Type
2 compound calls for other block modes like JT9 and JT65). This
would allow those calls to get spotted on PSK Reporter without
having to send special messages to get spotted (currently they
would have to send a message like "DE <MYSPECIAL> IO91" after CQ
calls or QSOs). The problem for PSK Reporter would be that the spots
without a grid square would need a derived coordinate for plotting.
Do you have that capability already, and how robust is it with
special callsigns?
* Sending WSPR spots to PSK Reporter could be implemented in two ways.
The best option would be a hand-over of the wsprnet.org domain so
PSK Reporter gets traffic from all existing sources in the current
format, but I'm not sure that is what is being proposed.
Alternatively a new template could be provided for WSPR spots and
WSJT-X could send those to report.pskreporter.info as well as, or
instead of, the existing traffic to wsprnet.org, at the user's
discretion. I have no information on the volume of traffic that
would be forthcoming if the second option were taken. I suspect many
sources of wsprnet.org spots are not WSJT-X and they may never add
spotting to PSK Reporter for various reasons.
* I am not certain how PSK Reporter handles IPFIX templates. We
currently use a template that is different from the ones suggested
on your developer information web page:
https://pskreporter.info/pskdev.html. Is that because it is out of
date, or is it that any combination of your IPFIX attributes is
allowed? For WSPR a senderPower attribute would be required, with a
range of 0 to 60 in units of dBm. An 8-bit signed
integer would be fine and allow for any future extension of powers
lower than 0 dBm.
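
To make the dual interval idea from the list above a little more concrete, here is a rough sketch of the queueing logic combined with the 508 byte payload limit. It is only an illustration under stated assumptions, not the WSJT-X implementation: the class name SpotBatcher, the 16 byte per-datagram overhead and the record sizes are all hypothetical, and the records are treated as opaque, already encoded byte strings.

```python
from collections import deque

# Conservative UDP payload size discussed above: the 576 byte minimum
# IPv4 datagram size, less a 60 byte maximum IP header and the 8 byte
# UDP header.
MAX_PAYLOAD = 576 - 60 - 8  # 508


class SpotBatcher:
    """Queue encoded spot records and group them into size-limited batches."""

    def __init__(self, max_payload=MAX_PAYLOAD, header_size=16):
        # header_size is an assumed per-datagram overhead (hypothetical).
        self.max_payload = max_payload
        self.header_size = header_size
        self.queue = deque()

    def add(self, record: bytes) -> None:
        if self.header_size + len(record) > self.max_payload:
            raise ValueError("record too large for one datagram")
        self.queue.append(record)

    def _take_batch(self):
        """Pop as many queued records as fit in one payload."""
        batch, size = [], self.header_size
        while self.queue and size + len(self.queue[0]) <= self.max_payload:
            record = self.queue.popleft()
            batch.append(record)
            size += len(record)
        return batch

    def _have_full_batch(self):
        """True if the queue holds more than fits in a single payload."""
        size = self.header_size
        for record in self.queue:
            size += len(record)
            if size > self.max_payload:
                return True
        return False

    def poll(self):
        """Short interval (say every minute): return only filled batches."""
        batches = []
        while self._have_full_batch():
            batches.append(self._take_batch())
        return batches

    def flush(self):
        """Long interval (say every 5 minutes): return everything queued."""
        batches = []
        while self.queue:
            batches.append(self._take_batch())
        return batches
```

A caller would run poll() from the short timer and flush() from the long one, sending each returned batch as one datagram. On a quiet band poll() returns nothing and the spots go out with the next flush(); on a busy band most traffic leaves in filled datagrams spread across the short intervals.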
73
Bill
G4WJS.
_______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel