Question #215540 on Yade changed:
https://answers.launchpad.net/yade/+question/215540
Chareyre posted a new comment:
Thanks.
You could actually plot everything on the same graph more easily if the y-axis were
cycles*Nparticles / time.
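A minimal sketch of that normalization, assuming each benchmark run is recorded as (cycles, Nparticles, wall time in seconds). The function name and the sample numbers are hypothetical, they are not taken from the tests discussed here:

```python
def normalized_speed(cycles, n_particles, wall_time_s):
    """Return particle-updates per second: cycles * Nparticles / time."""
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    return cycles * n_particles / wall_time_s

# Hypothetical measurements: (cycles, Nparticles, wall time in seconds).
runs = [(12000, 5000, 60.0), (3000, 20000, 75.0)]
speeds = [normalized_speed(*r) for r in runs]
for (cycles, n, t), s in zip(runs, speeds):
    print(f"N={n:>6}  cycles={cycles:>6}  speed={s:.0f} particle-steps/s")
```

With this quantity on the y-axis, runs with different particle counts and different numbers of cycles land on a comparable scale.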
A conclusion from these results seems to be that parallelism gives a 3x
speedup, obtained with 3-4 cores, and there is no
point using more than 4 cores.
This is not really what I concluded from my recent tests, but again: different
simulation => different conclusions.
A few things to keep in mind:
- The collider (contact detection) is the main non-parallel task.
- The collider takes a larger share of the total time for larger numbers of
particles, and for more dynamic simulations.
- BUT it takes less time if verletDist is increased, at the price of more
virtual interactions.
In my recent tests, the collider was taking about 1% of the total time
(*), so it did not matter whether the collider was parallel or not. If the
collider takes more than that, it can explain why you get the best
speed with 3-4 cores when I get it with 8 cores.
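The effect of a non-parallel collider can be illustrated with Amdahl's law. This is only a rough sketch under strong assumptions: the collider's share is taken as the serial fraction measured on a single core, and everything else is assumed to scale perfectly with the number of cores, which a real simulation will not do.

```python
def amdahl_speedup(serial_fraction, cores):
    """Ideal speedup when `serial_fraction` of the run cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Collider shares quoted in this thread, used here as assumed serial fractions.
for share in (0.018, 0.55):
    for cores in (1, 2, 4, 8):
        print(f"collider={share:.1%}  cores={cores}  "
              f"ideal speedup={amdahl_speedup(share, cores):.2f}")
```

Under these assumptions a collider share of a few percent barely limits the speedup at 8 cores, while a share above 50% caps it below 2x no matter how many cores are added.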
In "--performance", the collider's cost goes from 1.8% (5k bodies) to
55% (200k bodies). This is partly because the stats there include the
cost of initializing the collider (the cost of the first iteration in any
simulation). Including this cost is not really correct: since the number
of steps varies as a function of Nparticles, the 1st iteration takes
proportionally more time with more particles, but this is only
because the total number of iterations is smaller. The result therefore
can't be extrapolated in the form of an average time per step.
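The dilution argument can be made concrete with a toy model: a one-off initialization cost plus a constant cost per step. All numbers below are hypothetical and only illustrate the trend, they are not measurements:

```python
def init_share(t_init, t_step, n_steps):
    """Fraction of total run time spent in the one-off first iteration."""
    if n_steps < 1:
        raise ValueError("n_steps must be >= 1")
    return t_init / (t_init + n_steps * t_step)

# Same (hypothetical) costs in seconds, but fewer steps in the second run,
# as happens when the step count is reduced for larger particle counts.
t_init, t_step = 1.0, 0.01
for n_steps in (900, 100):
    print(f"steps={n_steps:>4}  init share={init_share(t_init, t_step, n_steps):.0%}")
```

The initialization share grows only because the run is shorter, so an average time per step derived from such a run overstates the steady-state cost.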
In the end, there is a clear answer to your question: no, --performance
is not good at testing parallelism and/or hardware.
(*) This information is available in the "--performance" output, 2nd
line in the table below. If you are currently running tests, it would be
good to record such data as it gives a better understanding of how/why
speed is affected by the different factors.
Name                     Count         Time   Rel. time
-------------------------------------------------------
ForceResetter            12000     369078us       0.39%
InsertionSortCollider      337    1713474us       1.80%
InteractionLoop          12000   74036435us      77.56%
NewtonIntegrator         12000   19331902us      20.25%
TOTAL                            95450891us     100.00%
25091
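When recording such data across several runs, the relative times can be recomputed from the raw microsecond values. The sketch below does this for the table above; the dictionary layout is just one assumed way to store the numbers:

```python
# Times in microseconds, copied from the table above.
times_us = {
    "ForceResetter": 369078,
    "InsertionSortCollider": 1713474,
    "InteractionLoop": 74036435,
    "NewtonIntegrator": 19331902,
}
total_us = 95450891  # TOTAL as reported in the table

# Relative time of each engine, in percent of the reported total.
shares = {name: 100.0 * t / total_us for name, t in times_us.items()}
for name, pct in shares.items():
    print(f"{name:<24}{pct:6.2f}%")
```

Tracking the InsertionSortCollider share this way for each particle count and core count shows directly how much of the run is spent in the non-parallel part.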