Ellard Roush writes:
> James Carlson wrote:
> > That point in time is as soon as your application can start.  It need
> > not have any dependencies at all.
> > 
> Here is the other point that needs to be clarified.
> This is not an application.
> Applications do not start until much later.
> We have to get the cluster formed and cluster services established
> before applications run.

We probably have different definitions of that term.  For networking,
an "application" is something that uses the services provided by a
transport or (for raw sockets) network layer protocol.

I'm not talking about user applications; just things that use
networking services in some way.

Your program (whatever it is) should not need dependencies on
networking in order to be successful.  As I suggested before, it's
sometimes helpful to listen to routing sockets (you can get hints
there about when it might be a good time to shorten a retry timer, and
thus make your program respond more quickly), but it's not really a
dependency issue.
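
A minimal sketch of the "shorten a retry timer" idea, in Python. The routing-socket reader itself is not shown: `network_changed` is a hypothetical `threading.Event` that some other thread would set when it sees a routing message, and `attempt` is a hypothetical callable that returns True on success. The program works without the hint; the hint only cuts the wait short.

```python
import threading


def retry_until_success(attempt, network_changed, retry_interval=30.0, max_tries=10):
    """Call attempt() until it returns True, waiting between tries.

    network_changed is only a hint (assumed to be set by a routing-socket
    listener). If it fires, the wait ends early; if it never fires, the
    normal retry interval still applies.
    """
    for i in range(max_tries):
        if attempt():
            return True
        if i + 1 < max_tries:
            # Returns early if the hint arrives, otherwise after retry_interval.
            network_changed.wait(retry_interval)
            network_changed.clear()
    return False
```

Nothing here depends on the network being "up" before the program starts, which is the point: the retry loop handles that case on its own.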

> The internal interfaces that we had to use are not well documented.
> Your explanation helps understand what is probably going on.

It's hinted at in the documentation, but not as well-documented as it
should be.  man -s 3socket connect says:

     underlying transport provider. Generally, stream sockets can
     successfully connect() only once. Datagram sockets  can  use
     [...]
     ECONNREFUSED     The  attempt  to  connect  was   forcefully
                      rejected.   The   calling   program  should
                      close(2) the socket descriptor,  and  issue
                      another  socket(3SOCKET)  call  to obtain a
                      new descriptor  before  attempting  another
                      connect() call.

That "generally" is also true for most unsuccessful connect() calls
and the advice under ECONNREFUSED is actually true for pretty much all
failures.  The exceptions are the non-failure "failures" -- EALREADY,
EINPROGRESS, and EWOULDBLOCK.  I think that issue is what the text is
trying to dance around.
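
To illustrate the distinction, here is a rough Python sketch of a non-blocking connect. The helper names (`start_connect`, `finish_connect`, `PENDING_ERRNOS`) are made up for this example. The three "non-failure" codes leave the socket usable and mean "keep waiting"; anything else is treated as a real failure and the descriptor is closed.

```python
import errno
import os
import select
import socket

# The "non-failure failures": the connect is still in progress.
PENDING_ERRNOS = {errno.EINPROGRESS, errno.EALREADY, errno.EWOULDBLOCK}


def start_connect(sock, address):
    """Begin a non-blocking connect; return "connected" or "pending"."""
    sock.setblocking(False)
    err = sock.connect_ex(address)
    if err == 0:
        return "connected"
    if err in PENDING_ERRNOS:
        return "pending"
    # Real failure: do not reuse this descriptor.
    sock.close()
    raise OSError(err, os.strerror(err))


def finish_connect(sock, timeout):
    """Wait for a pending connect to finish; close the socket on failure."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete in time")
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

After either function raises, the caller is expected to create a fresh socket before trying again.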

You're partly connected (at least bound) after the real failures, and
getting back to a clean state is easiest just by close() and trying
again.
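
A minimal sketch of that close-and-retry pattern in Python, using blocking sockets with a timeout. The function name and its parameters are illustrative only. Each attempt gets a brand-new socket, and a failed socket is closed, never handed back to connect().

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, delay=0.5, timeout=2.0):
    """Connect to (host, port), using a fresh socket for every attempt."""
    last_error = None
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            sock.settimeout(None)  # back to plain blocking mode
            return sock
        except OSError as exc:
            # The socket may be left partly set up (at least bound),
            # so discard it and start over with a new descriptor.
            last_error = exc
            sock.close()
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise last_error
```

Usage would look like `sock = connect_with_retry("127.0.0.1", 8080)`; the address and port here are placeholders.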

The usual references (Stevens and others) have more detailed
discussions.  The underlying problem is that for much of the BSD
world, the code *is* the documentation, so whatever sockets did, well,
that's what they do.

(For what it's worth, this isn't even one of the darker corners.  Raw
socket behavior, for example, varies in mysterious ways across OS
platforms and even across releases of a given OS.)

James Carlson, Solaris Networking              <[EMAIL PROTECTED]>
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
zones-discuss mailing list
