2009/11/4 kan <[email protected]>

> Thanks~
>
> I have heard that OpenMP is based on shared-memory parallel computing and
> that it needs random access to the data (I am not sure about this; if I am
> wrong, please forgive me and tell me the right thing, thanks). In my code,
> the contacts are stored in a list, which is not randomly accessible. If it
> is true that OpenMP needs random access, then how does YADE do that? Is it
> OK to run YADE on a distributed-memory architecture, such as a cluster? I
> saw somebody run it on a cluster, but I do not know that cluster's
> architecture. In a distributed-memory architecture the memory is not
> directly accessible, which increases the programming difficulty.
>
> Currently one iteration takes 3~4 seconds for about 100,000 particles and
> 33 seconds for 988,000 particles. I use threads for the parallel
> computing, but the speedup is only 2.2 times (15 seconds for 988,000
> particles). This speed is terribly bad, because dt is on the order of
> 10^(-7) seconds for the simulation of rock particles.

Oh, this speedup is based on 5 threads on a 2x4-core computer.

> Thanks
>
> Yongfeng
>
> 2009/11/4 Václav Šmilauer <[email protected]>
>
>> > What is the parallel structure used in YADE? I remember that one mail
>> > (from the email list) said that YADE does not use the domain
>> > decomposition method; then what is the parallel method?
>>
>> Yade parallelizes loops over bodies and interactions using OpenMP; see
>> notably
>>
>> http://bazaar.launchpad.net/%7Eyade-dev/yade/trunk/annotate/head%3A/pkg/common/Engine/MetaEngine/InteractionDispatchers.cpp#L40
>>
>> http://bazaar.launchpad.net/%7Eyade-dev/yade/trunk/annotate/head%3A/pkg/dem/Engine/StandAloneEngine/NewtonsDampedLaw.cpp#L57
>>
>> Speedup depends very much on the computer architecture, RAM speed, cache
>> size etc. (OpenMP is shared-memory parallelization). I have a speedup of
>> over 3x on 4 cores (i7 with DDR3 RAM) and recently I had 5.78 on a
>> 2x4-core Xeon X5570.
>>
>> I don't know of anyone running on something larger than 8 cores; it
>> might scale further, especially for large simulations, where the OpenMP
>> overhead and the non-parallel portions of the computation (the collider,
>> for instance) don't play a large role.
>>
>> (If you have 2 engines that don't touch the same data and are
>> independent, there is ParallelEngine for that, but I don't know of any
>> case where it really pays off; maybe the coupling problems could benefit
>> from that.)
>>
>> Cheers, Vaclav
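
On the question of contacts stored in a list: an OpenMP `parallel for` needs a loop it can split by index (or by random-access iterator), and `std::list` iterators are not random-access. One common workaround, sketched below, is to take a snapshot of pointers into a `std::vector` and run the parallel loop over that vector. This is only an illustration under stated assumptions and is not taken from the Yade sources: `Contact`, `computeForces`, `totalForce` and the linear-spring force law are hypothetical names and choices made for the example.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical contact record (illustrative only, not Yade's data structure).
struct Contact {
    int a, b;        // indices of the two particles in contact
    double overlap;  // penetration depth
    double force;    // normal force magnitude, filled in by computeForces
};

// Compute a linear-spring force for every contact held in a std::list.
// The list itself cannot be indexed, so a vector of pointers is built first
// (a cheap serial pass) and the OpenMP loop indexes that vector instead.
void computeForces(std::list<Contact>& contacts, double stiffness) {
    assert(stiffness >= 0.0 && "spring stiffness must be non-negative");

    std::vector<Contact*> view;
    view.reserve(contacts.size());
    for (Contact& c : contacts) view.push_back(&c);

    const long n = static_cast<long>(view.size());
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        // Each iteration writes only to its own contact, so no locking is needed.
        view[i]->force = stiffness * view[i]->overlap;
    }
}

// Sum the contact forces; the reduction clause gives each thread a private
// partial sum and combines them at the end, avoiding a data race on `sum`.
double totalForce(const std::list<Contact>& contacts) {
    std::vector<const Contact*> view;
    view.reserve(contacts.size());
    for (const Contact& c : contacts) view.push_back(&c);

    double sum = 0.0;
    const long n = static_cast<long>(view.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += view[i]->force;
    }
    return sum;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), the two loops are split across threads; without it the pragmas are ignored and the code runs serially with the same result. The pointer snapshot stays valid only while no contacts are inserted into or erased from the list, so it would have to be rebuilt whenever the contact list changes.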
_______________________________________________
Mailing list: https://launchpad.net/~yade-users
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yade-users
More help   : https://help.launchpad.net/ListHelp

