On Jul 13, 2010, at 9:06 AM, Matt Weinstein wrote: > (( Sorry, I was typing on an iPad. In a moving car. Heading for a > train. :-) )) > > Originally, I figured it would be easy to just use an XREP directly, > and interpret and dispatch, but that's no fun. > > Another way to do it might be to cheat, and use two XREPs back to > back, i.e. SERVICE THREADS emit a REQUEST when they start up. This > identifies them to your specially crafted DEVICE, which places them in > a FIFO queue. > > When a real CLIENT REQUEST comes in from the other side, you pull the > next available SERVICE THREAD uuid off the FIFO, and feed the CLIENT > REQUEST to the SERVICE THREAD (as a response!). Now you know how is
who is > handling the request, this information can be treasured away. > > When the response comes back from the SERVICE THREAD, the response is > forwarded to the CLIENT, and the SERVICE THREAD's UUID is put back in > the FIFO. > > If there is a TIMEOUT (use the 3rd socket approach, per prior email > threads) another SERVICE THREAD is pulled off the FIFO, and the CLIENT > REQUEST is re-sent to the new SERVICE THREAD. State is recorded to > make sure the request is only responded to once. > > You can only poll on the inbound socket (CLIENT REQUEST side) if you > actually have SERVICE THREADS in the FIFO. Fortunately, you are > single threaded, so you will hear about that from the other socket > shortly :-) > > If you need some type of buffering, you can add another buffering > DEVICE (ZMQ_QUEUE) cross connected with your special device and two > ZMQ_STREAMs. > > As an added plus, I believe the request UIDs can be assigned at this > layer, so you don't even need to track them at the CLIENT side. > Actually, the request id is probably == the CLIENT uuid. As another bonus, you now have REQUEST service times available, etc., which can be used to spice up the load balancing, thread pool sizing, etc. > I believe this maintains all of the socket state correctly, but this > should be walked through carefully. > > That's the sketch anyway. > > If this works, I'll probably use it too :-) > > Comments welcome ... > > Best, > > Matt > > > On Jul 13, 2010, at 7:28 AM, Brian Granger wrote: > >> On Tue, Jul 13, 2010 at 3:39 AM, Matt Weinstein >> <[email protected]> wrote: >>> The XREQ / XREP protocol carries the GUID of the requester in the >>> header >>> part. >>> >>> client --> [ REQ ] --- [ XREP ] === (GUID of client available >>> here) == [ >>> XREQ ] [ REP ] your workers here >>> >>> I'm using it to track timeouts. >>> >>> Does that help? >> >> Not really. We need the ability to track the location of each >> message >> in the worker pool. The GUIDs track which client submitted the >> message. >> >> Brian >> >>> On Jul 13, 2010, at 3:41 AM, Brian Granger wrote: >>> >>>> Hi, >>>> >>>> One of the core parts of a system I am building with zeromq is a >>>> task >>>> distribution system. The system has 1 master process and N >>>> workers. >>>> I would like to use XREP/XREQ sockets for this, but am running into >>>> two problems: >>>> >>>> 1. The load balancing algorithm (round robin) of the XREQ socket >>>> is >>>> not efficient for certain types of loads. I think the recent >>>> discussions on other load balancing algorithms could help in this >>>> regard. >>>> >>>> 2. Fault tolerance. If a worker goes down, I need to know about >>>> it >>>> and be able to requeue any tasks that went down with the worker. >>>> To >>>> monitor the health of workers, we have implemented a heartbeat >>>> mechanism that works great. BUT, with the current interface, I >>>> don't >>>> have any way of discovering which tasks were on the worker that >>>> went >>>> down. This is because the routing information (which client gets a >>>> message) is not exposed in the API. >>>> >>>> Because of this second limitation, we are having to look at crazy >>>> things like using a PAIR socket for each worker. This is quite a >>>> pain >>>> because I can't use the 0MQ load balancing and I have to manage >>>> all of >>>> those PAIR sockets (1 for each worker). With lots of workers this >>>> doesn't scale very well. >>>> >>>> Any thought on how to make XREQ/XREP more fault tolerant? >>>> >>>> Cheers, >>>> >>>> Brian >>>> >>>> -- >>>> Brian E. Granger, Ph.D. >>>> Assistant Professor of Physics >>>> Cal Poly State University, San Luis Obispo >>>> [email protected] >>>> [email protected] >>>> _______________________________________________ >>>> zeromq-dev mailing list >>>> [email protected] >>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev >>> >>> >> >> >> >> -- >> Brian E. Granger, Ph.D. >> Assistant Professor of Physics >> Cal Poly State University, San Luis Obispo >> [email protected] >> [email protected] > > _______________________________________________ > zeromq-dev mailing list > [email protected] > http://lists.zeromq.org/mailman/listinfo/zeromq-dev _______________________________________________ zeromq-dev mailing list [email protected] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
