It was a superb concert, and now back to business. :-)

On 9.4.2016 12:33, Pieter Hintjens wrote:
Too many questions at once :)
*teasing on*
Said by a person who writes thick books in his own free time. :-)
*teasing off*
- Async request-reply already works, using service requests, and
mailboxes for replies.
That's nice to hear. I noticed the various interfaces, but I concentrated on the service part only.
I'll check that.
- For reliability, there is already a "tracker" field in messages. My
intention was/is that recipients can send CONFIRMs back for specific
messages. These flow asynchronously, and allow senders to retry as
needed. See mlm_proto.xml.
I noticed that, but obviously I was not reading the right comments. ;-) It seems to be the right stuff. :-)

What were the conceptual reasons for making the tracker a string? What is wrong with some integer value (e.g. uint32/64)? That way we would skip the string allocate/deallocate pair at each stage of the transfer. It is an internal structure anyway.

Any conceptual objection to extending the CONFIRM message with a (possibly zero-length) blob?
<message name = "CONFIRM">
    Client confirms reception of a message, or server forwards this
    confirmation to original sender. If status code is 300 or higher, this
    indicates that the message could not be delivered.
    <field name = "tracker" type = "string">Message tracker</field>
    <field name = "status" type = "status" />
    <field name = "payload" type = "chunk" />
</message>
- If you need additional reliability layers I'd suggest writing this
in an application on top of Malamute rather than in the
protocol/server/client itself. Since you can embed mlm_server as a
thread, and talk to the broker over inproc:, you can embed routing
applications in the same process.
That is possible, but what about a simple zsock_attach with several connect endpoints? Just a simple change from connect to attach. ;-) Each new thread is a potential can of worms; I prefer not to use them unless absolutely necessary. ;-)

Is there a way to influence the selection of the next server/router on a DEALER socket? As far as I know, it is round-robin only. Perhaps some callback function on zsock_t which returns the id of the next socket to use?
- As always, try to make small, incremental changes that can be proven
independently. Be wary, even paranoid, of your own ability to make
large-scale architectures. If you want to make large experiments, you
can certainly do this, yet it will always be hard to bring such
changes back into master.
Of course. :-)

-Pieter

On Sat, Apr 9, 2016 at 1:57 AM, Matjaž Ostroveršnik
<[email protected]> wrote:
Hi all,

In the last few days I have been studying the mlm server architecture. Nice and clean
design. Congrats to the architects.

My objective is to prepare an adaptable net of nodes with both client and worker
functionality (i.e. the service pattern in mlm semantics) that can communicate among
themselves. Communication is always client (consumer of the results of the service)
to worker (provider of the service), but nodes have both client and worker
functionality.

There are one or more brokers between clients and workers. Communication
goes client-broker-worker-broker-client. Brokers should be as simple as
possible.

A. Broker:

A.1 There are two or more brokers in the net.
A.2 Brokers hold no vital information. If one broker goes down (regular or
    irregular shutdown) the net must readapt. No payload is lost.
A.3 The broker is a go-between for clients and workers for security reasons
    (clients must not have direct access to the workers and vice versa).
    A.3.1 Brokers allow connections from registered clients/workers only.
A.4 Brokers distribute work between workers offering the same service (load
    balancing), e.g. round-robin.
A.5 Brokers provide alternative paths between clients and workers
    (resistance to failure): C1->B1->W1->B1->C1; if there are several
    brokers, the same pair C1/W1 can communicate via different brokers,
    assuming that all workers and all clients are connected to all brokers.
A.6 Brokers are the source of information about other brokers in the net:
    - a client/worker connects to one predefined broker, which periodically
      distributes the list of other brokers to its workers/clients;
    - brokers periodically hold a "quorum" in which they exchange
      information about themselves (i.e. the list of all active brokers);
    - initially each node is configured with at least one broker (others
      are provided dynamically);
    - initially each broker is configured with at least one peer broker
      (is there a mechanism to broadcast on a WAN?).

B. Clients:

B.1 Connected to one broker initially; gradually they "see" the whole set
    of brokers.
B.2 If one broker fails, the client uses a different path (by
    retransmitting the message).
B.3 Clients distribute work between brokers.

C. Workers:

C.1 Each worker provides one service type.
C.2 There can be several workers offering the same service.

D. Messages:

D.1 The payload is opaque (size ca. 4 KB).
D.2 Meta-data: sender, receiver, unique message id, possible-duplicate
    flag, original message id (for replies), retry count.
D.3 If messages expire and retries remain, the client retries them.

E. Message exchange:

E.1 An exchange is always initiated by the client part of a node and goes
    to the service worker, which replies.
E.2 If a message is not returned within a predefined period of time (the
    message exchange timeout), it is retried (a predefined number of
    retries).
E.3 The client can select the broker to send the message through, but the
    worker always returns the message via the arriving path (broker).
E.4 The message exchange C-B-W-B-C is fast (sub-second).

In my opinion, the current mlm server is a suitable platform for this task. Am I too
optimistic?
Existing functionality:
   - A.1-2
   - A.4 (?) believe so
   - C.*
   - E.1 (how to do reply?)
   - E.3-4
Already added functionality:
   - A.3.1 (curve)
   - D.1  (blobs)
TODO functionality:
   - A.5
   - A.6 (the toughest stuff)
   - B.1-3
   - D.2-3
   - E.2

Questions:

Async request-reply pattern: is it already supported? The code talks about
replies, but I am unsure. Hints?
D.2-3, E.2: Is the current message (mlm_msg_t) suitable for expansion (e.g.
unique-id, original-unique-id, retry-cnt)? Or should I do it in some other
way?
A.5: Clients/workers should connect to different brokers (e.g. an array of
mlm_client actors).
A.6: Currently I do not have a solution for A.6. Is zgossip the right
solution?
B.1-3: strictly related to the A.6 solution.

What is your opinion on this?
Would you do something in a different way?
How would you tackle A.6?

Thanks in advance

Matjaž


_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
