<quote name="Jon Robson" date="2014-03-07" time="09:30:09 -0800">
> Let's also take this into a new thread. There are a lot of different
> conversations now going on....


My opinion is that fixing this with policy is going to be hard.

Either everyone who commits needs to be mindful of what day/time it is
and whether or not another human has cut the new branch yet (the cut
time isn't set in stone; it varies by a couple of hours, depending on a
lot of factors), OR we cut the branch at some arbitrary offset (e.g.,
24 hours ago), OR some human looks at the merges and picks a point.

None of those are ideal/scalable.

What we should do, however, is have a true "deployment pipeline".
Briefly defined: A deployment pipeline is a sequence of events that
increase your confidence in the quality of any particular build/commit
point.

A typical example is:
commit -> unit tests -> integration tests -> manual tests -> release


Each step has the ability to fail a build, which means "You shall not
pass!" to that commit point. The earlier you get a "You shall not pass!"
the better, because developers spend less time waiting to find out
whether what they committed is OK.
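
To make the fail-early idea concrete, here is a minimal sketch in
Python. It is an illustration only, not how Jenkins or our current
setup works: run_pipeline, the three check functions and the commit
names are all hypothetical stand-ins for real test runners.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a (name, check) pair; check returns True when the commit passes.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, stages: List[Stage]) -> Optional[str]:
    """Run the stages in order and stop at the first failure.

    Returns the name of the failing stage, or None if every stage passed.
    """
    for name, check in stages:
        if not check(commit):
            # "You shall not pass!": later stages never see this commit.
            return name
    return None


# Hypothetical checks; real ones would invoke the actual test suites.
def unit_tests(commit: str) -> bool:
    return "broken-unit" not in commit


def integration_tests(commit: str) -> bool:
    return "broken-integration" not in commit


def manual_tests(commit: str) -> bool:
    return True


# Same order as the example above: cheapest and fastest checks first.
STAGES: List[Stage] = [
    ("unit tests", unit_tests),
    ("integration tests", integration_tests),
    ("manual tests", manual_tests),
]

if __name__ == "__main__":
    for commit in ("abc123", "broken-unit-1", "broken-integration-1"):
        failed_at = run_pipeline(commit, STAGES)
        status = "released" if failed_at is None else "stopped at " + failed_at
        print(commit, "->", status)
```

The point of the ordering is that a commit which fails the unit tests
never reaches the slower stages, so the developer hears about it as
early as possible.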


What this means for us:
The Mobile team is actually a good example. They are doing The Right
Thing and have a lot of tests written, including browser tests. They run
into problems when, e.g., they write a new feature and an associated
test and commit both.

Beta Cluster gets that code (feature and test) within 5 minutes.

But, test.wikipedia and en.wikipedia get that feature much later, days
later.

However, the test code is run by Jenkins across all environments (beta
cluster, test.wikipedia, en.wikipedia, etc.) all the time. So the mobile
team gets a ton of false positives when their new test runs against,
e.g., production, where the feature isn't enabled yet (on purpose).

The QA team is working on this problem now (loosely termed the
"versioned test problem").


How a pipeline would help:

Really, a pipeline isn't a thing like your indoor plumbing but more of a
mindset/way of designing your test infrastructure. But it means that
you keep things self-contained (contrary to the mobile example above)
and things progress through the pipeline in a predictable way/pace.

It also means that each code commit spends the exact same amount of time
in the various stages as other code commits. Right now some code sits on
Beta Cluster for 7 days before hitting production, whereas other code
spends 0-5 minutes. That's not good.

Wanna help us on this problem? We're hiring:
https://hire.jobvite.com/Jobvite/jobvite.aspx?b=nHZ0zmw6 
(2 job openings)


Greg


-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg                A18D 1138 8E47 FAC8 1C7D |

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
