On Thu, Nov 24, 2011 at 10:59 AM, Benji York <[email protected]> wrote:
>> Our test distribution per layer is not very even - I highly doubt that
>> we'd be able to meet a reduction to 15% of the current time splitting
>> per layer.
>
> Let's look at the test distribution: The last buildbot run took 360
> minutes. There were 4 layers that took longer than 11 minutes to run:
> 55, 56, 65, and 99 minutes. All the other layers add up to about
> 60 minutes.

So the shortest run -j could give is 99 minutes, or about 27% of the
current runtime. I don't see how you can bisect a layer, unless you mean
'create a fake layer extending it and manually allocate 50% of the tests
to it'. That seems like a non-starter to me - way too much maintenance
overhead.

> If we bisect the four largest layers (to make it so the test runner's
> blind layer scheduling can't bite us too hard) and assume that running 4
> layers simultaneously imposes no more than a 50% overhead, then we would
> be right at 40% of the current running time.
>
> Reasoning sidebar: 99 is the length in minutes of the longest layer; it
> was bisected, but even then its other half is still the longest
> remaining layer so for pessimism's sake we assume they get run one after
> another. All the other layers would be finished by then, so that gives
> us 99*1.50/360 = .41.
>
> Even if we assume no parallelization overhead, per-test scheduling (as
> opposed to per-layer as above) and four-way parallelization, we'll still
> be at 25% of the original time, so I'm interested in ideas as to how we
> might achieve a reduction to 15% of the original time.

If local parallelisation will work, testr run --parallel will load
balance all the tests optimally based on previous performance - a single
run from e.g. ec2 can tell us which tests are slow and let it decide
from there.

>> The other issue of shared global state that will bite us,
>> will also be a significant issue with -j, unless a remoting facility
>> is brought in (and at that point it seems to be reinventing
>> subunit.... :P).
>
> This is the real catch. If the tests haven't been written to be
> parallelizable (which LP's certainly have not), then global state
> collisions accumulated over years of assuming non-parallel tests could
> be hard to fix. On the other hand, if fixing them turns out to be easy,
> then using the test runner's built-in parallelization (-j) would be the
> most bang for the buck.

bin/test --parallel already exists and does better splitting than -j, so
I disagree that -j would be the best approach, *if* the collisions etc.
are easy to fix :).

-Rob
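
For anyone who wants to check the numbers in this thread, here is a minimal sketch of the arithmetic. It uses only the figures quoted above (a 360 minute run and four large layers of 55, 56, 65 and 99 minutes). The names are made up for illustration, and greedy_partition is a plain longest-first heuristic meant to show what "balance by previous timings" can look like. It is not a claim about how testr or bin/test --parallel actually split tests.

```python
import heapq

# Figures quoted in the thread, in minutes.
TOTAL_RUN = 360
BIG_LAYERS = [55, 56, 65, 99]


def fraction_of_run(minutes, total=TOTAL_RUN):
    """Express a wall-clock time as a fraction of the current full run."""
    return minutes / total


# Per-layer scheduling cannot finish before the longest layer does.
per_layer_floor = fraction_of_run(max(BIG_LAYERS))

# Pessimistic bisect estimate: both halves of the 99 minute layer end up
# running back to back, with a 50% overhead for running 4 layers at once.
bisected_estimate = fraction_of_run(max(BIG_LAYERS) * 1.5)

# Ideal four-way split with per-test scheduling and no overhead.
ideal_four_way = fraction_of_run(TOTAL_RUN / 4)


def greedy_partition(durations, workers):
    """Assign each duration, longest first, to the least loaded worker.

    Returns the total load per worker. This is a heuristic sketch and
    does not guarantee an optimal split.
    """
    loads = [0.0] * workers
    heap = [(0.0, i) for i in range(workers)]
    heapq.heapify(heap)
    for duration in sorted(durations, reverse=True):
        load, index = heapq.heappop(heap)
        loads[index] = load + duration
        heapq.heappush(heap, (loads[index], index))
    return loads


if __name__ == "__main__":
    print("per-layer floor:   %.3f" % per_layer_floor)
    print("bisected estimate: %.3f" % bisected_estimate)
    print("ideal four-way:    %.3f" % ideal_four_way)
    print("two-worker loads:  %s" % greedy_partition(BIG_LAYERS, 2))
```

The per-worker loads only mean something if the recorded durations are representative, which is why a single timed run (e.g. on ec2) would be needed first.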

