rig_close being synchronous won't (shouldn't) matter. The problem is that hamlib recovers from the timeout, but by the time it's done WSJT-X has already timed out and shut down rigctld.

It also looks like WSJT-X starts another connection before closing the first one, which then gives the invalid configuration error. Maybe that's where a synchronous close would work better, but it looks like the idea there was that a client just tells rigctld "I'm done" and that's a one-way conversation. If we make it synchronous then the client has to stay around to take the answer, or else we start hitting timeouts in rigctld. I'm a bit reluctant to make it synchronous as I don't know what all the client programs are expecting. And if we make it synchronous and rigctld goes away, then the client will be timing out trying to close it.

All timeouts are arbitrary, but they have to be based on what the hardware can do. Making the timeout = pollrate is arbitrary too. If you made the polling 10 ms, everything would time out like crazy. We've got a lot of rigs that just aren't very fast on CAT, and 1 second is too short.

The timeouts aren't hit very often, and nobody will even notice it in WSJT-X if there's a 3-second timeout that almost never gets triggered. What they do notice is WSJT-X timing out too quickly. And the more data we ask from the rigs, the more likely we are to hit the 1 second timeout.

Mike
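
To put rough numbers on the timing argument, here is a small sketch in C. It is not hamlib or WSJT-X code: both function names are made up for illustration, and the retry count of 3 used in the note below is an assumption. Only the 600 ms Flex timeout and the 1 second and 3 second client timeouts come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Worst-case time a backend can block on one command: the first attempt
 * plus every retry, each waiting out a full per-attempt timeout.
 * Hypothetical helper, not part of hamlib.
 */
long worst_case_wait_ms(long timeout_ms, int retries)
{
    assert(timeout_ms >= 0 && retries >= 0);
    return timeout_ms * (long)(retries + 1);
}

/*
 * True when the client waits long enough for the backend to finish
 * all of its own attempts before the client gives up.
 */
bool client_outlasts_backend(long client_timeout_ms,
                             long backend_timeout_ms, int retries)
{
    return client_timeout_ms > worst_case_wait_ms(backend_timeout_ms, retries);
}
```

With a 600 ms per-attempt timeout and an assumed 3 retries, the backend can block for 2400 ms, so a 1000 ms client timeout gives up first while a 3000 ms one does not.
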
On Friday, April 24, 2020, 08:18:50 AM CDT, Bill Somerville <g4...@classdesign.com> wrote:

Hi Mike,

I really dislike arbitrary timeouts, they make software seem unresponsive. Why is rig_close() not synchronous?

73 Bill G4WJS.

On 24/04/2020 14:04, Black Michael via wsjt-devel wrote:

Direct control has the same problem. And you don't need a network-based rig for this problem. I duplicated it on my MK706MKIIG over serial just by turning the rig off, waiting for the timeout, turning it back on, and retrying.

I think this is why we keep seeing all these random errors for getting vfo/freq etc. It's been borderline for a long time. JTDX has added getting power levels and it is popping up there a lot now due to the added commands.

The polling interval/timeout applies to any communication, so having a longer timeout there, unrelated to the polling rate, would be better. I would vote for at least 3 seconds.

The longest timeout we have in hamlib right now is 10 seconds for trxmanager. We have 2.5 seconds in netrigctl.c, and lots of 2 second timeouts. Note that even those can have multiple retries, and that's where the 600 ms timeout in the Flex started being triggered during profile changes, with WSJT-X timing out on top of it.

Mike

On Friday, April 24, 2020, 05:51:18 AM CDT, Bill Somerville <g4...@classdesign.com> wrote:

Hi Mike,

I was thinking about the TCP/IP connection to the rig; rig_close() needs to shut down and close that rather than still be waiting. In that case the close should be processed promptly. I don't have a FlexRadio or any other network-connected rig.

Another question: what happens when rigctld is not included? I ask because direct control must appear to behave exactly the same as indirect control via rigctld; if that is not the case then there is a more serious issue there.

73 Bill G4WJS.
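
Bill's point about rig_close() shutting the connection down promptly rather than waiting can be sketched with plain POSIX sockets. This is only an illustration, not hamlib code: `signoff_and_close` is a hypothetical helper that writes a final message, does not wait for any reply, and then shuts down and closes the descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not hamlib code): send a final message without
 * reading any reply, then shut the connection down and release the
 * descriptor. Returns 0 on success, -1 if any step failed.
 * The descriptor is closed in every case.
 */
int signoff_and_close(int fd, const char *msg, size_t len)
{
    int rc = 0;

    assert(msg != NULL);

    /* one-way sign-off: write it and do not wait for an answer */
    if (write(fd, msg, len) != (ssize_t)len) {
        rc = -1;
    }

    /* tell the peer that no more data will flow in either direction */
    if (shutdown(fd, SHUT_RDWR) != 0) {
        rc = -1;
    }

    if (close(fd) != 0) {
        rc = -1;
    }

    return rc;
}
```

The caller returns as soon as the local calls complete, so a peer that is busy or stuck does not hold up the close. The trade-off is the one discussed in this thread: the caller never learns whether the peer actually processed the sign-off.
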

On 24/04/2020 04:21, Black Michael via wsjt-devel wrote:

Via the rigctld interface, that would be the "q" command sent by WSJT-X via rig_close(), but the rigctld thread would be locked by the mutex while it's still waiting for a response from the rig. But the "q" is fire-and-forget, so WSJT-X could still close the port, which may solve the problem with the invalid configuration.

    static int netrigctl_close(RIG *rig)
    {
        rig_debug(RIG_DEBUG_VERBOSE, "%s called\n", __func__);

        /* clean signoff, no read back */
        write_block(&rig->state.rigport, "q\n", 2);

        return RIG_OK;
    }

Did you try and duplicate this by turning off your rig?

Mike

On Thursday, April 23, 2020, 06:49:50 PM CDT, Bill Somerville <g4...@classdesign.com> wrote:

Hi Mike,

are we dealing with a long wait for the TCP/IP connection to close down? IIRC, if the client end initiates the shutdown of a TCP/IP connection there should be minimal delay. rig_close() will have been called before rig_open() is called again on the same RIG handle.

73 Bill G4WJS.

On 23/04/2020 22:20, Black Michael via wsjt-devel wrote:

Actually, I just repeated it. It has to do with the timeout when using rigctld or any network-based rig. If the rig takes too long to respond, you get the timeout; the rig wakes up again, you click Retry, and you get invalid configuration because the rig is still open.

You may be able to duplicate this by turning off your rig, waiting for the timeout, turning it back on, and then clicking Retry.

de Mike W9MDB

On Thursday, April 23, 2020, 04:10:51 PM CDT, Black Michael via wsjt-devel <wsjt-devel@lists.sourceforge.net> wrote:

I don't have any more details than the error message of invalid configuration. I've seen it on my system too once in a while, but never repeatably, of course. I think it may have to do with WSJT-X opening a 2nd connection without closing the 1st connection, as I saw some of that in recent debug logs.
I do notice that do_start does not ensure that the rig is closed:

    int HamlibTransceiver::do_start ()
    {
      TRACE_CAT ("HamlibTransceiver",
                 QString::fromLatin1(rig_->caps->mfg_name).trimmed ()
                 << QString::fromLatin1(rig_->caps->model_name).trimmed ());

      error_check (rig_open (rig_.data ()), tr ("opening connection to rig"));

Mike

On Thursday, April 23, 2020, 02:40:48 PM CDT, Bill Somerville <g4...@classdesign.com> wrote:

On 23/04/2020 20:15, Black Michael via wsjt-devel wrote:
> And sometimes doing retry will give a configuration error too which
> requires a restart of WSJT-X...also undesirable.
>
> Mike

Hi Mike,

that's the first report of that, can you supply more details please?

73 Bill G4WJS.

_______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel
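
The observation that do_start opens the rig without first making sure it is closed suggests a simple guard. Below is a rough sketch of that idea in C. It does not use hamlib: `struct rig_conn`, `conn_open`, `conn_close` and `conn_start` are hypothetical stand-ins that only track state, and the -1 return merely models the failure seen on Retry in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a rig handle; it only tracks open state. */
struct rig_conn {
    bool is_open;
    int open_calls;
    int close_calls;
};

/* Stand-in for rig_open(): refuses to open an already-open handle. */
int conn_open(struct rig_conn *c)
{
    assert(c != NULL);
    if (c->is_open) {
        return -1; /* models the failure seen on Retry */
    }
    c->is_open = true;
    c->open_calls++;
    return 0;
}

/* Stand-in for rig_close(): harmless when the handle is already closed. */
void conn_close(struct rig_conn *c)
{
    assert(c != NULL);
    if (c->is_open) {
        c->is_open = false;
        c->close_calls++;
    }
}

/* Guarded start: close any leftover connection before opening. */
int conn_start(struct rig_conn *c)
{
    conn_close(c);
    return conn_open(c);
}
```

In this model a second `conn_open` on a handle that was never closed fails, while `conn_start` succeeds because it closes the stale connection first. Whether the real do_start can safely do the same depends on how rig_close() behaves while a command is still in flight, which is the open question in this thread.
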