Re: [wsjt-devel] Flex bug

Black Michael via wsjt-devel Fri, 24 Apr 2020 06:43:44 -0700

rig_close being synchronous won't (shouldn't) matter.  The problem is that
hamlib recovers from the timeout, but by the time it's done WSJT-X has already
timed out and shut down rigctld.  It also looks like WSJT-X starts another
connection before closing the first one, which then gives the invalid
configuration error.  Maybe that's where a synchronous close would work
better, but it looks like the idea there was that a client just tells rigctld
"I'm done" and that's a one-way conversation.  If we make it synchronous then
the client has to stay around to take the answer or else we start hitting
timeouts in rigctld.  I'm a bit reluctant to make it synchronous as I don't
know what all the client programs are expecting.  And if we make it
synchronous and rigctld goes away then the client will be timing out trying to
close it.

All timeouts are arbitrary but have to be based on what the hardware can do.
Making the timeout = pollrate is arbitrary too.  If you made the polling 10ms
everything would time out like crazy.

We've got a lot of rigs that just aren't very fast on CAT and 1 second is too
short.

The timeouts aren't hit very often and nobody will even notice it in WSJT-X if
there's a 3-second timeout that almost never gets triggered.  What they do
notice is WSJT-X timing out too quickly.  And the more data we ask from the
rigs the more likely we are to hit the 1 second timeout.
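
To make the timeout-versus-poll-rate point concrete, here is a minimal sketch of a timeout that has a fixed floor and only follows the polling interval when the interval is longer.  The names `MIN_CAT_TIMEOUT_MS` and `cat_timeout_ms` are made up for illustration and are not hamlib or WSJT-X identifiers; the 3000 ms floor is just the figure suggested in this thread.

```c
/* Hypothetical floor for one CAT transaction, in milliseconds.
 * 3000 is the value proposed in this thread, not a hamlib constant. */
#define MIN_CAT_TIMEOUT_MS 3000

/* Pick the timeout for one CAT transaction.
 * The polling interval only matters when it exceeds the floor, so a
 * very fast poll (say 10 ms) no longer produces a 10 ms timeout. */
static int cat_timeout_ms(int poll_interval_ms)
{
    return poll_interval_ms > MIN_CAT_TIMEOUT_MS ? poll_interval_ms
                                                 : MIN_CAT_TIMEOUT_MS;
}
```

With this shape, a 10 ms or 1 second poll both get the 3 second floor, while a slower poll keeps its own interval as the timeout.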
Mike

    On Friday, April 24, 2020, 08:18:50 AM CDT, Bill Somerville 
<g4...@classdesign.com> wrote:  

  Hi Mike, 
  I really dislike arbitrary timeouts, they make software seem unresponsive. 
  Why is rig_close() not synchronous? 
  73
 Bill
 G4WJS. 
  On 24/04/2020 14:04, Black Michael via wsjt-devel wrote:

   Direct control has the same problem.  And you don't need a network-based rig
for this problem.  I duplicated it on my IC-706MKIIG over serial just by turning
the rig off, waiting for the timeout, turning it back on, and retrying.
  I think this is why we keep seeing all these random errors for getting
vfo/freq etc.  It's been borderline for a long time.
  JTDX has added getting power levels and it is popping up there a lot now due
to the added commands.
  The polling interval/timeout applies to any communication, so having a
longer timeout there unrelated to polling rate would be better.  I would vote
for at least 3 seconds.
  The longest timeout we have in hamlib right now is 10 seconds for trxmanager.
  We have 2.5 seconds in netrigctl.c and lots of 2-second timeouts.  Note that
even those can have multiple retries, and that's where the 600ms timeout in the
Flex started being triggered during profile changes, with WSJT-X timing out on
top of it.
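
  As a rough illustration of how retries stack up against a client-side timeout, the arithmetic can be sketched as below.  The function name and the retry count of 3 in the usage note are assumptions for the example, not values taken from hamlib or the Flex backend.

```c
/* Worst-case time, in milliseconds, before a command is reported as
 * failed when every attempt waits for the full timeout.
 * "retries" counts the attempts made after the first one. */
static long worst_case_wait_ms(long timeout_ms, int retries)
{
    return timeout_ms * (long)(retries + 1);
}
```

  For example, a 600 ms timeout with an assumed 3 retries can hold a command for 2400 ms, which is already past a 1 second client timeout but still inside a 3 second one.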
  Mike 

      On Friday, April 24, 2020, 05:51:18 AM CDT, Bill Somerville 
<g4...@classdesign.com> wrote:  

     Hi Mike, 
  I was thinking about the TCP/IP connection to the rig; rig_close() needs to
shut down and close that rather than still be waiting. In that case the close
should be processed promptly. I don't have a FlexRadio or any other
network-connected rig.

  Another question, what happens when rigctld is not included? I ask because
direct control must appear to behave exactly the same as indirect control via
rigctld; if that is not the case then there is a more serious issue there.
  73
 Bill
 G4WJS. 
   On 24/04/2020 04:21, Black Michael via wsjt-devel wrote:

   Via the rigctld interface that would be the "q" command getting sent by
WSJT-X via rig_close(), but the rigctld thread would be locked by the mutex
while it's still waiting for a response from the rig.
  But the "q" is fire-and-forget, so WSJT-X could still close the port, which
may solve the problem with the invalid configuration.
    static int netrigctl_close(RIG *rig)
    {
        rig_debug(RIG_DEBUG_VERBOSE, "%s called\n", __func__);

        /* clean signoff, no read back */
        write_block(&rig->state.rigport, "q\n", 2);

        return RIG_OK;
    }
   Did you try and duplicate this by turning off your rig? 
  Mike 

      On Thursday, April 23, 2020, 06:49:50 PM CDT, Bill Somerville 
<g4...@classdesign.com> wrote:  

     Hi Mike, 
  are we dealing with a long wait for the TCP/IP connection to close down? IIRC
if the client end initiates the shutdown of a TCP/IP connection there should be
minimal delay. rig_close() will have been called before rig_open() is called
again on the same RIG handle.
  73
 Bill
 G4WJS. 
   On 23/04/2020 22:20, Black Michael via wsjt-devel wrote:

   Actually I just repeated it.  It has to do with the timeout when using
rigctld, or any network-based rig. If the rig takes too long to respond you
get the timeout, the rig wakes up again, you click Retry, and you get invalid
configuration because the rig is still open.
  You may be able to duplicate this by turning off your rig, waiting for the
timeout, turning it back on, and then clicking Retry.
  de Mike W9MDB 

      On Thursday, April 23, 2020, 04:10:51 PM CDT, Black Michael via 
wsjt-devel <wsjt-devel@lists.sourceforge.net> wrote:  

       Don't have any more details than the error message of invalid 
configuration. 
  I've seen it on my system too once in a while but never repeatable of course. 
  I think it may have to do with WSJT-X opening a 2nd connection without 
closing the 1st connection as I saw some of that in recent debug logs. 
  I do notice that do_start does not ensure that the rig is closed:

    int HamlibTransceiver::do_start ()
    {
      TRACE_CAT ("HamlibTransceiver",
                 QString::fromLatin1(rig_->caps->mfg_name).trimmed ()
                 << QString::fromLatin1(rig_->caps->model_name).trimmed ());

      error_check (rig_open (rig_.data ()), tr ("opening connection to rig"));
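
  One way to picture the missing check is a start routine that closes a handle that is still open before opening it again.  This is only a sketch in plain C: `fake_rig`, `fake_rig_open`, `fake_rig_close` and `start_rig` are stand-ins invented for the example and are not hamlib's RIG type or WSJT-X code, and the -1 return is a placeholder for whatever error a second open actually produces.

```c
#include <stdbool.h>

/* Stand-in for a rig handle; NOT hamlib's RIG type. */
struct fake_rig {
    bool is_open;
    int  open_calls;   /* successful opens  */
    int  close_calls;  /* closes performed  */
};

static int fake_rig_close(struct fake_rig *rig)
{
    rig->is_open = false;
    rig->close_calls++;
    return 0;
}

/* Refuses to open a handle that is already open, standing in for the
 * "invalid configuration" failure described in this thread. */
static int fake_rig_open(struct fake_rig *rig)
{
    if (rig->is_open)
        return -1;
    rig->is_open = true;
    rig->open_calls++;
    return 0;
}

/* Start routine that makes sure the handle is closed before opening. */
static int start_rig(struct fake_rig *rig)
{
    if (rig->is_open)
        fake_rig_close(rig);
    return fake_rig_open(rig);
}
```

  With the guard in place a Retry after a timeout goes through close and open in that order, instead of hitting an open on a handle that was never closed.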

  Mike 

       On Thursday, April 23, 2020, 02:40:48 PM CDT, Bill Somerville 
<g4...@classdesign.com> wrote:  

   On 23/04/2020 20:15, Black Michael via wsjt-devel wrote:
 > And sometimes doing retry will give a configuration error too which 
 > requires a restart of WSJT-X...also undesirable.
 >
 > Mike

 Hi Mike,

 that's the first report of that, can you supply more details please? 

 73
 Bill
 G4WJS.           

 _______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel
