On 10/26/2013 12:09 PM, Pieter Hintjens wrote:
> On Sat, Oct 26, 2013 at 10:53 AM, Arnaud Loonstra <[email protected]> wrote:
>
>> 1 - Only one node can run on a host (or one per interface)
>> 1 - The issue is caused by using a broadcast socket. It's possible to
>> bind multiple programs to the same address and port but packets are then
>> load balanced between the sockets.
>
> Is this right? My understanding (and test results) were that when
> multiple listeners bound to the same socket, they all received the
> incoming messages. The Zyre load tester uses dozens of nodes on the
> same host:port. They're definitely all getting each others' packets.
>
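For reference, a minimal sketch of the setup described in the quoted text
above, where several listeners on one host share a UDP port. It assumes
Linux semantics for SO_REUSEADDR (other platforms may behave differently),
and the function name open_beacon_listener is hypothetical, not part of
Zyre or zbeacon.

```python
import socket


def open_beacon_listener(port: int) -> socket.socket:
    """Open a UDP socket that can share `port` with other listeners.

    Hypothetical helper for illustration only; assumes Linux behaviour.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # SO_REUSEADDR lets several sockets bind the same UDP address and port,
    # provided every one of them sets the option before binding.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # SO_BROADCAST must be set before the socket may send to a broadcast address.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    # Bind to all interfaces so broadcast datagrams for this port are accepted.
    sock.bind(("", port))
    return sock


if __name__ == "__main__":
    first = open_beacon_listener(0)           # let the OS pick a free port
    port = first.getsockname()[1]
    second = open_beacon_listener(port)       # second listener on the same port
    print("both listeners bound to port", port)
    first.close()
    second.close()
```

Note that SO_REUSEPORT is a different option and is deliberately not set
here.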
I did some tests again and I think you are right. I must have mistakenly
used this new socket option (SO_REUSEPORT):
https://lwn.net/Articles/542629/
which is actually an option for load balancing. Sorry 'bout that.

>> 3 - Nodes are trying to connect to dead peers (peers that are gone)
>
> This is normal in some senses. ZRE was designed for WiFi and under
> load, clients can come and go randomly. To get a resilient network you
> need to be optimistic about connecting, and slower to discard peers as
> "dead".
>
> Multicast would be fine; the Android issue is afaik already solved in
> Zyre through the tactic of double handshaking on the TCP connection.
> I.e. if A can see B, and B cannot see A, then A will connect over TCP
> to B and then B will reconnect back to A over TCP (not having seen it
> before).
>
> We can make multicast a configurable option on beacons. I'd not change
> the existing default since it'll break applications.
>

I can understand that.

> I'm not so sure about using multicast to cross segments; it might be
> better to do this explicitly with forwarding, i.e. when a node has two
> network interfaces, it bridges traffic between them.
>

Well, multicast is designed for it. So if multicast is an option next to
broadcast, you would get it for free. :) Just set the TTL to 1 by default
if you want to be safe. Of course, having a zeromq router handle it
could be an option too.

>> 3 - What I observed was nodes not knowing when a peer node was gone
>> trying to connect to the peer node. This is obviously to no success. It
>> also results in unnecessary traffic and handling. I would propose to
>> embody a node state identifier inside the beacon. This way when a node
>> exits it can send its exit state before terminating and other nodes have
>> the possibility to pick this up. This would also be a welcome feature to
>> inform nodes of each other state. For example when a node is overloaded
>> it could inform about this using its state.
>> A broker can leave him alone
>> for a while. I think this could be useful in general.
>
> UDP broadcasts / multicasts are not reliable and are the first things
> to be lost when the network is stressed (which is when you get client
> disconnections).
>
> It would not be wise to try to use these for state propagation.
>
> Please consider zbeacon within the context of a full protocol such as
> ZRE, which builds a TCP cluster on top of the zbeacon discovery. It
> would be a shame to start mixing abstractions.
>

Agreed, state propagation is of course not essential. But wouldn't you
agree that broadcasting on exit would be better than determining a node's
state by trying to connect to it? Sending an exit message through the TCP
handshake could take too long when there are lots of nodes. Broadcasting
on exit would help prevent that stress. I've seen this a lot with OSPF,
where detecting dead peers takes too long and so traffic is dropped.

>> - For passing multiple network segments multicast is often not possible.
>> Crossing internet using multicast is nowhere available, to my knowledge.
>
> Indeed. I do not like using multicast for its ability to leak across
> networks; that's so hard to get right and can lead to such a mess.
> Much better IMO to see 1-segment UDP as one mechanism for discovery,
> tied together with internetwork discovery over TCP (which would work
> across the Internet).
>
> One step at a time. Could you think about not adding state to beacons
> and instead looking at how ZRE does its interconnect. This will also
> help you understand how to do other forms of discovery (e.g. an
> application could simply tell a node, via the API, "here is a node at
> hostname:port").
>

Will do :) I'm replying to your other mail in here as well:

> Do you have a simple example case we could work through? Preferably
> something real, not theoretical.

Our use case is for an orchestration system.
Currently we are seeing a lot of creative coding applications
(PureData/MaxMSP/Blender/Isadora/VVVV/etc). Most inter-application
communication is done using OSC protocols (UDP). This works great in a
lot of cases; however, since it is often hardcoded into the applications,
it is not very flexible. We are currently looking into a newer approach
in which we foresee a protocol that can exchange all the metadata of
these systems (nodes) so they are easier to orchestrate. We had a
prototype that does almost the same as ZRE. One thing we really liked in
the prototype was that when nodes changed state (i.e. exit, which they do
often) all other nodes would know instantly. So a video stream being sent
to a node was stopped immediately. This was done using multicast.

For this protocol we are looking into keeping everything as simple and
efficient as possible. So we prefer sticking to how things already work:
multicast to send data to multiple nodes, for example. We know that when
you need speed, you'll want it done in hardware or at as low a level as
possible.

ZRE fits our use case perfectly for the discovery and metadata exchange.
Applications still use OSC and other protocols to send the real data.
That's a legacy we just have for now.

I'm now building a simple system in which nodes boot, discover each
other, exchange capabilities, and are controlled dynamically.

Rg,

Arnaud

-- 
w: http://www.sphaero.org
t: http://twitter.com/sphaero
g: http://github.com/sphaero
i: freenode: sphaero_z25
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
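
A minimal sketch of the multicast option with the TTL set to 1, as
suggested earlier in this mail, so that routers do not forward beacons
beyond the local segment. The group address, port, and function names are
hypothetical values chosen for illustration; they are not taken from
zbeacon or from the ZRE specification.

```python
import socket

# Hypothetical values for illustration only.
BEACON_GROUP = "239.255.0.1"   # an administratively scoped multicast address
BEACON_PORT = 5670             # assumed port, not confirmed by this thread


def open_multicast_sender(ttl: int = 1) -> socket.socket:
    """Create a UDP socket whose multicast datagrams carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # With a TTL of 1, multicast datagrams stay on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_beacon(sock: socket.socket, payload: bytes) -> None:
    """Send one beacon datagram to the (hypothetical) multicast group."""
    sock.sendto(payload, (BEACON_GROUP, BEACON_PORT))


if __name__ == "__main__":
    sender = open_multicast_sender()   # TTL defaults to 1
    ttl = sender.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL)
    print("multicast TTL:", ttl)
    sender.close()
```

Raising the ttl argument would let beacons cross routers that forward
multicast, which is the leakage risk raised in the quoted text.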
