Claudio,

Here is one example: solving the equation 8000 times in serial mode takes 5976036 microseconds. Solving 8000 equations in parallel mode takes 1421224 microseconds. I distribute the work equally among the threads, and the speedup is good, ~4.
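For reference, the speedup figures can be recomputed directly from the timings quoted in this message (the 8000 case above and the 800, 80 and 8 cases reported in the rest of the message). This is only a small sketch: the function names `speedup`, `per_solve_us` and `print_speedups` are made up for illustration and are not part of the application being discussed.

```cpp
#include <cassert>
#include <cstdio>

// Speedup of the parallel run relative to the serial run of the same workload.
// A value above 1.0 means the parallel run was faster.
double speedup(double serial_us, double parallel_us) {
    assert(parallel_us > 0.0);
    return serial_us / parallel_us;
}

// Average serial cost of a single solve, in microseconds.
double per_solve_us(double serial_us, int solves) {
    assert(solves > 0);
    return serial_us / solves;
}

// Prints one row per measurement quoted in the message.
void print_speedups() {
    struct Run { int solves; double serial_us; double parallel_us; };
    const Run runs[] = {
        {8000, 5976036.0, 1421224.0},
        { 800,  622463.0,  202536.0},
        {  80,   64614.0,  134429.0},
        {   8,    6345.0,  328286.0},
    };
    for (const Run& r : runs) {
        std::printf("%5d solves: %6.1f us/solve serial, speedup %.2f\n",
                    r.solves,
                    per_solve_us(r.serial_us, r.solves),
                    speedup(r.serial_us, r.parallel_us));
    }
}
```

Calling `print_speedups()` from `main` lists the four cases side by side. The serial cost per solve stays in the same range (roughly 750 to 810 microseconds) in every case, so the change in speedup comes from the parallel timings, not from the solver itself.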
If I solve the same equation 800 times, the numbers are 622463 microseconds and 202536 microseconds, so the speedup decreases to ~3. For 80 times: 64614 microseconds and 134429 microseconds, so the serial version is already faster. Going down to 8: 6345 and 328286...

I used zmq_stopwatch_start/stop. Context setup and socket setup are not included; I create them before launching the parallel part, so they do not eat time.

The very same code runs in serial mode and in a thread. It is a direct equation solver from LAPACK (dgbtrf). In parallel mode the code just creates a C++ instance and gets the data from a pool, so there is not even a memory allocation. I send the pointer to a thread which is already in a pool and waiting for a message. When the message comes, the thread runs the solver based on the pointer.

To me this is reminiscent of the inproc_lat tests, where we get high latency with a low roundtrip_count, but I am not sure; I have no idea what the reason can be. The time for sending/receiving a message is about 1 microsecond or less, I think, if I send a large number of messages; I tested that.

Danny

On Sun, Jan 20, 2013 at 7:25 PM, Claudio Carbone <[email protected]> wrote:

> Hi Dan.
>
> What is the time it takes to solve a single equation?
>
> It all depends on how you coded your app: there is a fixed calculation
> that pertains only to the multi-threaded version; fixed means that it
> doesn't scale.
> So when the iterations are so many, your fixed amount of time doesn't
> affect the total length; when your iterations are few, this time surpasses
> and overtakes the actual calculation time.
>
> Have you measured how long it takes to get all the zmq context and sockets
> set up? How much it takes to get ready to churn numbers?
>
> Maybe it's that part that comes down as the most relevant when you process
> few equations.
>
> Claudio
>
> dan smith <[email protected]> wrote:
>>
>> Dear All,
>>
>> I have got a multi-threaded application. It uses a pool of threads. Each
>> thread in the pool communicates with the main thread via ZMQ_PAIR sockets,
>> one pair for each thread. The length of the messages is 8 bytes (a pointer
>> to a C++ object). On a quad-core machine I use 8 threads. Each thread in
>> the pool solves a linear equation system (a small one; still, the solution
>> time is much larger than the time needed for the communication between the
>> main thread and a worker).
>>
>> The speedup is perfect... if a large number of equations are solved.
>> However, the requirement is to solve just 8 equations in parallel, one in
>> each thread at the same time. If I decrease the number of equations below
>> ~100, the serial solution becomes much faster than the multi-core solution.
>>
>> Why is that, and how can this problem be solved? How can a multi-core
>> application be made efficient for small problems too?
>>
>> Thank you very much in advance,
>>
>> Danny
>>
>> ------------------------------
>>
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
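Claudio's point about a fixed cost that does not scale can be written down as a tiny model. The sketch below is an illustration under stated assumptions, not a description of the actual application: it assumes every parallel run pays one constant overhead `fixed_us` and that the solves are handed out to the workers in rounds. The names `parallel_time_us`, `serial_time_us` and `break_even_solves` are hypothetical, and any value passed for `fixed_us` is a guess, not a measurement.

```cpp
#include <cassert>
#include <cmath>

// Modelled serial time: every solve costs solve_us microseconds.
double serial_time_us(int solves, double solve_us) {
    return solves * solve_us;
}

// Modelled parallel time. ASSUMPTION: one constant overhead fixed_us per run
// (thread wake-up, dispatch, collecting results) that does not shrink with
// the number of workers.
double parallel_time_us(int solves, double solve_us, double fixed_us, int workers) {
    assert(workers > 0);
    // Work is handed out in rounds; a partly filled round still costs
    // one full solve.
    const double rounds = std::ceil(static_cast<double>(solves) / workers);
    return fixed_us + rounds * solve_us;
}

// Smallest number of solves (searched up to limit) for which the modelled
// parallel run beats the modelled serial run; -1 if there is none.
int break_even_solves(double solve_us, double fixed_us, int workers, int limit) {
    for (int n = 1; n <= limit; ++n) {
        if (parallel_time_us(n, solve_us, fixed_us, workers) <
            serial_time_us(n, solve_us)) {
            return n;
        }
    }
    return -1;
}
```

With a solve time of about 750 microseconds (close to the serial per-solve time implied by the timings in this thread) and a purely hypothetical fixed cost of 100000 microseconds on 4 workers, `break_even_solves(750.0, 100000.0, 4, 10000)` returns 179. Below that count the fixed cost dominates and the serial run wins in the model, which is the behaviour Claudio describes. The model is crude: it cannot explain why the parallel time reported for 8 solves is larger than the one reported for 80.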
