https://bugzilla.wikimedia.org/show_bug.cgi?id=66684
--- Comment #2 from Bryan Davis <[email protected]> ---
(In reply to Greg Grossmeier from comment #0)
> I am trying to figure out a way to catch these types of mistakes before we
> have outages in production.

The errors aren't really present in beta, are they? The problem is that we have a gap in code review/procedure that allows changes requiring database schema updates, massive cache invalidation, or similarly disruptive actions (which I think I've heard called "scap traps" before) to be merged without producing some sort of durable list of the actions required to deploy the code in production.

I've had similar problems everywhere I've worked where the size of the development plus operations team was greater than one (and sometimes even when I was working solo).

The most easily automated solution I've seen in practice was used at $DAYJOB-1. We used a tool developed in-house that could compare a canonical schema, which we kept in version control, with the schema of any live database. This tool would emit the DDL changes needed to sync the database with the canonical DDL. For local development and our integration environment, these DDL changes would be applied automatically by a script. In our staging and production environments, the DDL alter script would be generated as part of the build for the environment but then manually reviewed and applied by a DBA.

The major problem with this approach is scaling it as the deploy cycle accelerates from once per week to once per day/hour/minute.
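
To make the compare-and-emit idea above concrete, here is a minimal sketch. It is not the in-house tool, which isn't available here. It assumes SQLite (via Python's standard sqlite3 module) as a stand-in for the real database, a hypothetical CANONICAL dict as the version-controlled schema, and additive changes only (new tables and new columns). The helper names live_schema, diff_ddl and sync are made up for illustration.

```python
import sqlite3

# Hypothetical canonical schema, as it might be kept in version control.
CANONICAL = {
    "user": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "page": {"id": "INTEGER", "title": "TEXT"},
}


def live_schema(conn):
    """Read {table: {column: declared_type}} from a live SQLite database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        rows = conn.execute(f'PRAGMA table_info("{table}")')
        schema[table] = {row[1]: row[2] for row in rows}
    return schema


def diff_ddl(canonical, live):
    """Emit DDL that brings `live` up to `canonical` (additive changes only)."""
    statements = []
    for table, columns in sorted(canonical.items()):
        if table not in live:
            body = ", ".join(f'"{c}" {t}' for c, t in columns.items())
            statements.append(f'CREATE TABLE "{table}" ({body})')
            continue
        for column, col_type in columns.items():
            if column not in live[table]:
                statements.append(
                    f'ALTER TABLE "{table}" ADD COLUMN "{column}" {col_type}')
    return statements


def sync(conn, canonical, apply=False):
    """Return the pending DDL; run it only when apply=True (dev/integration)."""
    statements = diff_ddl(canonical, live_schema(conn))
    if apply:
        for statement in statements:
            conn.execute(statement)
        conn.commit()
    return statements


# A "live" database that has fallen behind the canonical schema.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" ("id" INTEGER, "name" TEXT)')

# Staging/production style: generate the script for a DBA to review.
pending = sync(conn, CANONICAL, apply=False)
for statement in pending:
    print(statement + ";")
```

Calling sync(conn, CANONICAL, apply=True) would correspond to the local development and integration case, where the changes are applied automatically. With apply=False the output is only a script for someone to review, which is the staging and production case. Dropped columns, type changes and data migrations are deliberately left out of this sketch.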
