I kind of think it would be the message passing, because obviously, if a thread tries to acquire a clean container from the shared set and the shared queue is empty, then it has to wait. I would much rather it receive a signal to wake up, but maybe boost::lockfree has an answer to this too...
On Tue, Jan 14, 2014 at 3:19 PM, Kenneth Adam Miller <[email protected]> wrote:

> Actually, which do you think would result in a better design decision? Would using message passing result in a more scalable architecture, where I could just change a variable to increase throughput on better processors?
>
> On Tue, Jan 14, 2014 at 3:09 PM, Kenneth Adam Miller <[email protected]> wrote:
>
>> Well, I'm just a type safety dork, so I tend to think that even losing type information over one pointer, even if I know where that pointer is going to end up and what type it represents on the other side, is a bad thing. Plus it's a performance thing that's just unnecessary, but I don't think it's a big deal. These aren't objects; they are indeed raw buffers, as you assumed.
>>
>> Also, awesome about the Boost find! I appreciate you so much; you are a beast. But I'm actually still in a sprint, so there's no version or commit with which these hypothetical discussions directly coincide; you're helping me get it right the first time.
>>
>> On Tue, Jan 14, 2014 at 2:48 PM, Lindley French <[email protected]> wrote:
>>
>>> A visit to the Boost libraries reveals there's a brand-new Boost.Lockfree library that must have arrived with one of the last few versions. You should seriously consider simply replacing your std::lists with boost::lockfree::queues using your existing logic, and see if that gives you the performance you're looking for before you make any massive changes.
>>>
>>> On Tue, Jan 14, 2014 at 3:40 PM, Lindley French <[email protected]> wrote:
>>>
>>>> I'm going to caution you about passing pointers through inproc. It may be possible to do safely, but I haven't yet figured out how to manage ownership semantics in an environment where messages (pointers) can be silently dropped.
>>>> I didn't imagine serialization would be a problem since you referred to "buffers"; I thought these would be raw byte buffers. If you actually mean lists of objects, then yes, you'll need to serialize to use inproc. There are a number of options for serialization in C++: Boost.Serialization, Google Protobufs, and a few others. You can also do it manually if your objects are simple.
>>>>
>>>> Qt Signals & Slots is another solution for inter-thread communication, similar to inproc, which has the expected C++ object semantics and therefore doesn't require serialization. The downside is that it's really only useful for one-to-one, one-to-many, or many-to-one semantics. This covers a lot, but I don't think it has a way to cover one-to-any, which is really what you want (and what the zmq push socket is designed for).
>>>>
>>>> On Tue, Jan 14, 2014 at 2:42 PM, Kenneth Adam Miller <[email protected]> wrote:
>>>>
>>>>> Yeah, it's in C/C++.
>>>>>
>>>>> On Tue, Jan 14, 2014 at 1:39 PM, Charles Remes <[email protected]> wrote:
>>>>>
>>>>>> If you are doing this from C and can access the raw memory, an inproc socket can pass pointers around. If you are using a managed language or one where accessing raw memory is difficult, you'll want to figure out how to "fake" passing a pointer (or an object reference). In your case it seems like serializing/deserializing would be a big performance hit. That said, if that is the direction you must go, then pick something fast like msgpack as your serializer.
>>>>>>
>>>>>> On Jan 14, 2014, at 1:29 PM, Kenneth Adam Miller <[email protected]> wrote:
>>>>>>
>>>>>> @AJ No, but I understand exactly why you suggested that. It's because I haven't explained that thread 1 is doing critical work and it needs to offload tasks to other threads as quickly as possible.
>>>>>> @Lindley, thanks so much for helping me see the truth! I was getting awfully confused considering all the different baloney that could go on if I were stuck with semaphores, and I couldn't really re-envision it. Is there any kind of convenience function or core utility for de-serializing the data you receive over inproc messages?
>>>>>>
>>>>>> On Tue, Jan 14, 2014 at 12:49 PM, AJ Lewis <[email protected]> wrote:
>>>>>>
>>>>>>> In the zeromq example, couldn't you just skip thread 1 entirely? Then the PULL socket from thread 2 takes uncompressed input from the source, compresses it, and shoves it out the PUSH socket to thread 3 for output.
>>>>>>>
>>>>>>> In this case, the PULL socket is the uncompressed pool and the PUSH socket is the compressed pool. Just make sure your uncompressed pool doesn't fill up faster than thread 2 can compress it, or you'll need to implement some logic to prevent it from using up all the memory.
>>>>>>>
>>>>>>> AJ
>>>>>>>
>>>>>>> On Tue, Jan 14, 2014 at 01:16:32PM -0500, Lindley French wrote:
>>>>>>> > In this case your "buffers" are really just messages, aren't they? A thread grabs one (receives a message), processes it, and writes the result into another buffer (sends a message).
>>>>>>> >
>>>>>>> > The hard part is that ZeroMQ sockets don't like to be touched by multiple threads, which complicates the many-to-many pattern you have going here. I'm no expert, but I would suggest...
>>>>>>> >
>>>>>>> > Each "pool", A and B, becomes a single thread with two ZMQ inproc sockets, one push and one pull. These are both bound to well-known endpoints. All the thread does is continually shove messages from the pull socket to the push socket.
>>>>>>> > Each thread in "Thread set 1" has a push inproc socket connected to pool A's pull socket.
>>>>>>> >
>>>>>>> > Each thread in "Thread set 2" has a pull inproc socket connected to pool A's push socket and a push inproc socket connected to pool B's pull socket. For each message it receives, it just processes it and spits it out the other socket.
>>>>>>> >
>>>>>>> > The thread in "Thread set 3" has a pull inproc socket connected to pool B's push socket. It just continually receives messages and outputs them.
>>>>>>> >
>>>>>>> > This may seem complicated because concepts that were distinct before (buffer pools and worker threads) are now the same thing: they're both just threads with sockets. The critical difference is that the "buffer pools" bind to well-known endpoints, so you can only have a few of them, while the worker threads connect to those well-known endpoints, so you can have as many as you like.
>>>>>>> >
>>>>>>> > Will this perform as well as your current code? I don't know. Profile it and find out.
>>>>>>> >
>>>>>>> > On Tue, Jan 14, 2014 at 12:23 PM, Kenneth Adam Miller <[email protected]> wrote:
>>>>>>> >
>>>>>>> > > So, I have two pools of shared buffers: pool A, which is a set of buffers of uncompressed data, and pool B, for compressed data. I have three sets of threads.
>>>>>>> > >
>>>>>>> > > Thread set 1 pulls from pool A and fills the buffers it receives from pool A with uncompressed data.
>>>>>>> > >
>>>>>>> > > Thread set 2 is given a buffer from pool A that has recently been filled. It pulls a buffer from pool B, compresses from A into B, and then returns the buffer it was given, cleared, back to pool A.
>>>>>>> > > Thread set 3 is a single thread that is continually handed compressed data from thread set 2, which it outputs. When the data has been output, it returns the buffer to pool B, cleared.
>>>>>>> > >
>>>>>>> > > Can anybody describe a scheme to me that will allow thread sets 1 & 2 to scale?
>>>>>>> > >
>>>>>>> > > Also, suppose for pools A and B I'm using shared queues that are just C++ STL lists. When I pop from the front, I use a lock for removal to make sure that removal is deterministic. When I enqueue, I use a separate lock to ensure that the internals of the STL list are respected (I don't want two threads receiving iterators to the same beginning node; that would probably corrupt the container or cause data loss, or both). Is this the appropriate way to go about it? Thread sets 1 & 2 will likely have more than one thread, but there's no guarantee that thread sets 1 & 2 will have an equal number of threads.
>>>>>>> > >
>>>>>>> > > I was reading the ZeroMQ manual, and I read the part about multi-threading and message passing, and I was wondering what approaches should be taken with message passing when data is inherently shared between threads.
>>>>>>> > > _______________________________________________
>>>>>>> > > zeromq-dev mailing list
>>>>>>> > > [email protected]
>>>>>>> > > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>>>>>
>>>>>>> --
>>>>>>> AJ Lewis
>>>>>>> Software Engineer
>>>>>>> Quantum Corporation
>>>>>>>
>>>>>>> Work: 651 688-4346
>>>>>>> email: [email protected]
