Hey Robert.  Francis mentioned that you had updated the parallel testing LEP, so
I took a moment to look at it today.

I cc'd the yellow squad to keep us all in the loop.  Hi everybody!  The LEP is 
https://dev.launchpad.net/LEP/ParallelTesting if you want to take a look.

Could you clarify these points, ideally on the LEP?

- You write that we must "[o]rganise and upgrade our CI test running to take 
advantage of this new environment."  You also clarify that "[c]hanging the 
landing technology is out of scope."  To make sure I understand, then, you want 
us to keep buildbot and everything else as-is as much as possible, but guide the 
LOSAs toward getting us machines/VMs that can quickly and robustly run these 
tests.  Is this right?  If so, I think no additional LEP clarification is 
needed, but otherwise, please give us more information there.

- You write in comments that "The prototype LXC + testr based parallelisation 
seems to have the best effort-reward tradeoff today."  [Yellow folks, I found 
https://dev.launchpad.net/ParallelTests to describe the prototype.]  Have you 
done enough research here that you are able to recommend or even prescribe this 
approach?  If so, that would probably save time; and though prescribing an 
implementation violates my understanding of LEP goals, I think that restriction 
ought to be relaxed for documents written by the TA.

- If we use LXC, do you expect this effort to dig into the fragility that you 
note in your prototype notes, and try to improve it?  If not, do you have 
requirements or thoughts on how to help developers work with the 
issues--perhaps scripts that developers are encouraged to use for the workflow, 
that handle problems like the ones you identify ("you may need to manually 
shutdown postgresql before stopping lxc, to get it to shutdown cleanly")?
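Something along these lines is what I have in mind, purely as a sketch (the 
container name is made up, and it defaults to printing the commands rather than 
running them):

```shell
#!/bin/sh
# Sketch of a teardown helper for the issue quoted above; the CONTAINER
# name is invented, and DRY_RUN defaults to on so the sketch is safe to
# run as an illustration.
CONTAINER="${CONTAINER:-lptest}"
DRY_RUN="${DRY_RUN:-1}"
PLANNED=""

run() {
    if [ "$DRY_RUN" = 1 ]; then
        PLANNED="$PLANNED$* ; "      # remember what would have been run
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop postgresql inside the container first, so lxc can shut down cleanly...
run lxc-attach -n "$CONTAINER" -- /etc/init.d/postgresql stop
# ...and only then stop the container itself.
run lxc-stop -n "$CONTAINER"
```

Run as-is it just prints the two steps in order; setting DRY_RUN=0 would 
execute them for real.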

- If we use LXC, you describe a number of steps to set up a working 
environment.  Do you envision a rocketfuel-XXX style script to help produce 
this environment?  If so, do you have any requirements for it?  If not, do you 
have something else in mind, and can we extract requirements from that?
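For instance, I could imagine something shaped like this (again purely a 
sketch; the script name, template, and steps are my guesses, not taken from the 
LEP or the prototype notes, and it only prints what it would do by default):

```shell
#!/bin/sh
# Illustrative "rocketfuel-setup-lxc"-style helper; every name here is
# an assumption for discussion purposes.
CONTAINER="${CONTAINER:-lptest}"
DRY_RUN="${DRY_RUN:-1}"
PLANNED=""

run() {
    if [ "$DRY_RUN" = 1 ]; then
        PLANNED="$PLANNED$* ; "      # record the command instead of running it
        echo "would run: $*"
    else
        "$@"
    fi
}

run lxc-create -n "$CONTAINER" -t ubuntu   # create the base container
run lxc-start -n "$CONTAINER" -d           # boot it in the background
# ...followed by installing the test dependencies inside the container,
# per whatever steps the prototype notes prescribe.
```

If you already have requirements for such a script (supported distro versions, 
idempotence, teardown), it would help to capture them on the LEP.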

If you don't intend to recommend/prescribe LXC + testr, these next two 
questions are pertinent.

- You write that the solution "[m]ust parallelise more effectively than 
bin/test -j (which does per-layer splits)."  Is that really a "must"? If we met 
your success metric ("down to less than 50% of the current time, preferrable 
15%-20%"), would it really matter which method got there?  If it does matter, 
can you identify what the underlying "must" is for rejecting the -j approach, 
so that, for instance, other solutions can be cleanly rejected?

- Francis had said earlier, when talking with me about the project, that 
running the tests on multiple machines might be an acceptable way to achieve 
the goal.  
You disallow that explicitly, starting with the LEP title ("Single machine 
parallel testing of single branches"), even though a multiple-machine approach 
would match the letter of the law (the biggest stretch I see is that 
"[p]ermit[ting] developers to reliably run parallelised as well" would mean 
that developers would need to run ec2 to meet that requirement).  As with the 
previous question, is there a deeper "must" hidden in here somewhere?  Perhaps 
it is cost related?

That's all I've got so far. :-)

Thanks

Gary
-- 
Mailing list: https://launchpad.net/~yellow
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yellow
More help   : https://help.launchpad.net/ListHelp
