Ellard Roush writes:
> If we make the code sleep long enough for Solaris routing to
> complete initialization, then after a failed attempt
> to connect, then retries work whenever the route becomes
> available. The problem is that Solaris routing goes into
> an error state when we attempt to connect before it is ready.
OK, it sounds like we're talking at cross-purposes here.
I haven't seen such a problem myself (it sounds like an application
bug to me -- at a wild guess, possibly not handling dynamic interfaces
correctly; see below). File a bug on solaris/kernel/tcp-ip.
The TCP/IP stack itself is responsible for taking user data and
matching it against kernel "routes" (actually, they're forwarding
entries). The user space routing daemons (the things controlled by
SMF) neither know nor _care_ what the kernel is doing with user data
packets, so dependencies on them won't help anything.
Even if some sort of "error state" is possible in the kernel (again, I
haven't seen such a thing, at least not described in those terms), I
don't see how routing daemons are involved here or how anything iSCSI
can do would affect them.
> We are not asking for indication as to when a route is present.
> We want to know when we can attempt to establish a connection
> without Solaris routing going into an error state that
> causes all subsequent attempts to connect to fail.
That point in time is as soon as your application can start. It need
not have any dependencies at all.
If you prefer, you may depend on this service so that at least lo0 is
plumbed up when you start:
Most networking applications don't even need that, though.
> We have found another recovery method for this problem.
> We do not just retry the connection.
> We destroy all network data structures (socket)
> This clears the bad state. retries then eventually succeed.
It sounds to me like you're not dealing with dynamic interfaces correctly.
If you don't explicitly bind a preferred address to use (most
applications do not), then the kernel will choose an address for you.
With UDP, this happens on a packet-by-packet basis. With TCP, though,
it happens once as the connect() request is started.
When the kernel does this, it picks the best-matching kernel
forwarding entry (at that moment in time) for the supplied destination
IP address (UDP sendto() or TCP connect()), and then selects a source
address based on the output interface that this entry points to.
Other interfaces may come and go over time, other routes may be
learned or forgotten, but we _never_ go back and rewire that TCP
source address. It perhaps doesn't sound like the best possible
answer, but that's how BSD sockets have worked for many decades, and
it's expected behavior.
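To make the source address selection concrete, here is a minimal sketch in Python (the sockets calls map directly onto the C ones). It assumes a throwaway listener on 127.0.0.1 as the destination; the helper name connect_and_report() is made up for illustration. The socket is never bound explicitly, so the kernel picks the local address during connect(), and getsockname() reports what it picked. That address stays with the socket for the life of the connection.

```python
import socket


def connect_and_report(dest_host, dest_port):
    """Connect a TCP socket without calling bind() first.

    Returns the connected socket and the (address, port) pair the
    kernel selected as the local end during connect().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect((dest_host, dest_port))
    except OSError:
        sock.close()
        raise
    return sock, sock.getsockname()


def demo():
    # Hypothetical destination: a local listener on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    sock, local = connect_and_report("127.0.0.1", port)
    print("kernel-selected source:", local)
    # Asking again gives the same answer; the choice is not revisited.
    print("still the same:", sock.getsockname() == local)

    sock.close()
    listener.close()


if __name__ == "__main__":
    demo()
```

An application that cares which source address it uses would call bind() with that address before connect() instead of leaving the choice to the kernel.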
If connect() fails or if you need to give up for some reason, there's
no way to unbind the socket. The proper procedure is to close the
socket and build a new one.
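The close-and-rebuild procedure can be sketched as a retry loop that never reuses a socket after a failed connect(). This is only an illustration in Python, not what any particular application does; the function name connect_with_fresh_sockets() and the attempts, delay, and timeout parameters are assumptions chosen for the example.

```python
import socket
import time


def connect_with_fresh_sockets(host, port, attempts=5, delay=1.0, timeout=5.0):
    """Try to connect, using a brand-new socket for every attempt.

    A socket whose connect() failed is closed and discarded, so each
    retry lets the kernel choose the route and source address again.
    """
    last_error = None
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()  # never retry on the failed socket
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise ConnectionError(
        "gave up after %d attempts" % attempts
    ) from last_error
```

A caller would use it as sock = connect_with_fresh_sockets(host, port) and handle ConnectionError if the destination never becomes reachable. Because every attempt starts from a new socket, an interface or route that appears between attempts is picked up without any dependency on the routing services.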
I think you're barking up the wrong tree by attempting to establish
some sort of dependency on routing.
James Carlson, Solaris Networking <[EMAIL PROTECTED]>
Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
zones-discuss mailing list