On 05/16/2012 03:59 PM, Robert Collins wrote:
> This is great news.
>
> So roughly 10m + 19m + 265m/workers. Neato.
>
> Testr will ignore the layers if you configure the new option we added;
> it will still show a layer that fails to startup. This may help with
> the idle time aspect.
>
> What do you think of us getting say 16cores, hyperthreaded, *one*
> machine. That would make the action from idle snappier, at the cost of
> either contention when there are serial landings, or complexity in
> buildbot to say 'only run one test at a time, devel || db-devel'.
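Read literally, the rough estimate quoted above is about 29 minutes of fixed time plus 265 minutes of test time divided across the workers. A minimal sketch of that arithmetic follows. The function name and the worker counts tried are illustrative assumptions, and the model ignores contention and scheduling overhead, so it should not be expected to match measured run times exactly.

```python
def estimated_run_minutes(workers: int) -> float:
    """Rough estimate from the thread: 10m + 19m + 265m/workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    fixed_minutes = 10 + 19           # the two fixed terms in the estimate
    parallel_minutes = 265 / workers  # test time split evenly across workers
    return fixed_minutes + parallel_minutes


if __name__ == "__main__":
    # Worker counts are illustrative assumptions, not measured setups.
    for workers in (1, 8, 16, 32):
        minutes = estimated_run_minutes(workers)
        print(f"{workers:>2} workers: ~{minutes:.0f} minutes")
```

Under this model, each doubling of the worker count halves only the 265-minute term, so the fixed 29 minutes increasingly dominate the total.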
The serial approach is trivial. I've gotten it working and tested it,
and it's fine. It turns out that BuildSlaves have a "max_builds"
keyword argument available at instantiation. It defaults to None
(unlimited) and we can easily set it to 1. It would be less than five
minutes of work in the data center. We'd probably want to expose this
in the juju charm as well, for our own tests, but that would be very
easy. I've hacked on it just a bit already.

In contrast, the concurrent/contention approach is going to be at
least somewhat expensive to deal with. I tried a trivial experiment
with an immediate failure, and realized that we would have to deal
with two separate underlying LXC containers for the ephemerals,
because we build and update in the real container before switching to
the ephemerals for the tests. That's certainly fixable, but will
require time and work. Beyond that, I expect more challenges. If we go
for the fully parallel case, I would bet money that we will have
additional problems because of CPU contention.

If you'd like further exploration of the concurrent approach, please
ask, but I would personally be much happier to stick with the serial
approach. Francis has directed us to proceed with the serial approach
for now.

From a developer's perspective, the serial approach has interesting
tradeoffs in comparison to a two-machine approach. On the one hand,
having to wait for a landing on devel would be somewhat more frequent
with the serial approach, because a landing to devel will always be
immediately followed by a landing to db-devel, which will block. If
you had two machines, there would be no blockage. On the other hand,
developers might get their changes pushed to db-devel slightly faster,
because the 16-core machine should be about 10 minutes faster per run
than an 8-core machine would. For developers, I'd argue the balance is
in favor of the two-machine approach.
However, if other planning factors mean that a single machine wins,
the developer story is still way better than now.

> Relatedly, ec2land - do you think the HVM instance would be cost
> effective for ec2land/ec2 test? It sounds like it has great results @
> 36 minutes, but perhaps just an 8 core is sufficiently good @ 51
> minutes?

It's US $2.40 for the big machine versus US $1.80 for the eight core
(versus 5 hours * US $.64/hour = US $3.20 today, I think). Both of
those are cheaper than what we pay today, and I don't think an extra
$.60 will break the bank for a run that is 15 minutes faster.

OTOH, I suspect that getting the tests to run reliably on the 32 core
machine won't be cost effective, in terms of developer time. I know
that there are at least a couple of new bugs lurking there.

> Ideally we'd have a super fast environment totally spun up and devs
> could reuse it individually, reducing setup cost and really tuning
> things; I think that is something for the next pass - definitely out
> of scope for this project. (It probably needs to be in canonistack, it
> probably needs N-machine scaling at that point, and other non-trivial
> additional works.

Diminishing returns will hit at some point, of course. Sounds
interesting though.

Gary

--
Mailing list: https://launchpad.net/~yellow
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yellow
More help   : https://help.launchpad.net/ListHelp
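The cost comparison in the reply above reduces to a few lines of arithmetic. The sketch below uses only the dollar figures and run times given in this message; the variable names are illustrative, and the pairing of "the big machine" with the 36-minute run is an assumption drawn from the quoted question.

```python
# Per-run costs quoted in the message, in US dollars.
big_machine_cost = 2.40   # "the big machine"
eight_core_cost = 1.80    # "the eight core"
current_cost = 5 * 0.64   # 5 hours at $0.64/hour today

# Run times quoted in the message, in minutes.
big_machine_minutes = 36  # assumed to be the HVM instance
eight_core_minutes = 51

extra_cost = big_machine_cost - eight_core_cost
minutes_saved = eight_core_minutes - big_machine_minutes

print(f"current cost per run: ${current_cost:.2f}")
print(f"big machine costs ${extra_cost:.2f} more "
      f"and saves {minutes_saved} minutes per run")
```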

