https://bugzilla.wikimedia.org/show_bug.cgi?id=66684

--- Comment #2 from Bryan Davis <[email protected]> ---
(In reply to Greg Grossmeier from comment #0)
> I am trying to figure out a way to catch these types of mistakes before we
> have outages in production.

The errors aren't really present in beta, are they? The problem is that we have
a gap in code review/procedure that allows changes that require database schema
updates, massive cache invalidation, or similarly disruptive steps (which I
think I've heard called "scap traps" before) to be merged without producing
some sort of durable list of the actions needed to deploy the code in
production.

I've had similar problems everywhere I've worked where the size of the
development plus operations team was greater than one (and sometimes even when
I was working solo). The most easily automated solution I've seen in practice
was used at $DAYJOB-1. We used a tool developed in-house that could compare a
canonical schema which we kept in version control with the schema of any live
database. This tool would emit DDL changes to sync the database with the
canonical DDL. For local development and our integration environment these DDL
changes would be applied automatically by a script. In our staging and
production environments, the DDL alter script would be generated as part of the
build for the environment but then manually reviewed and applied by a DBA. The
major problem with this approach is scaling it as the deploy cycle accelerates
from once per week to once per day/hour/minute.
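
As a rough illustration of that kind of comparison (not the in-house tool
itself, whose details aren't given here), a minimal Python sketch might look
like the following. It assumes both schemas have already been loaded into plain
dicts; the function name `diff_schema` and the table and column names are
hypothetical. Additive changes come out as DDL, while anything destructive or
ambiguous is only flagged as a comment for a human to review.

```python
from typing import Dict, List

# Assumed representation: table name -> {column name: column type}
Schema = Dict[str, Dict[str, str]]


def diff_schema(canonical: Schema, live: Schema) -> List[str]:
    """Return statements that would bring `live` closer to `canonical`."""
    statements: List[str] = []
    for table, columns in sorted(canonical.items()):
        if table not in live:
            # Whole table is missing from the live database.
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for name, ctype in columns.items():
            if name not in live[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype};")
            elif live[table][name] != ctype:
                # Type changes can lose data, so only flag them.
                statements.append(
                    f"-- REVIEW: {table}.{name} is {live[table][name]}, "
                    f"canonical is {ctype}"
                )
        for name in live[table]:
            if name not in columns:
                # Never emit DROP automatically; leave it to a reviewer.
                statements.append(
                    f"-- REVIEW: {table}.{name} is not in the canonical schema"
                )
    for table in sorted(live):
        if table not in canonical:
            statements.append(f"-- REVIEW: table {table} is not in the canonical schema")
    return statements


# Made-up example schemas, for illustration only.
canonical = {
    "widget": {"widget_id": "INTEGER", "widget_name": "TEXT", "widget_lang": "TEXT"},
    "gadget": {"gadget_id": "INTEGER"},
}
live = {
    "widget": {"widget_id": "INTEGER", "widget_name": "TEXT"},
}

for statement in diff_schema(canonical, live):
    print(statement)
```

In a local or integration environment the emitted statements could be applied
by a script; in staging and production the same output would be the alter
script handed to a DBA for review.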

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
