I'm in no way an expert in this area, but from what I have seen over the past few years I think I can identify two recurring patterns:

1. Minor programming mistakes in unrelated code. This often happens when we add stricter types to existing code, or make it throw exceptions when it's called in a way it should never have been called, e.g. when a method that expects a string is called with null. Tests can rarely catch such "unthinkable" edge cases beforehand. They bubble up in production, where codebases work together in ways that have never been part of any automated or manual test setup. Luckily this kind of error is often easy to fix or safe to ignore.

2. Database hiccups. Errors that appear to be "random" and are really hard, if not impossible, to reproduce. Sometimes it turns out the reason is a really, really old database row that was created with very different constraints in mind. More recent code might have a different idea of how a particular database table works nowadays and fails when faced with incompatible data. Or we find that the database schema on certain replication machines is not what it should be. For example, foreign keys to tables that shouldn't have existed for 18 years, but somehow still do. ;-)
https://phabricator.wikimedia.org/T299387

Let's say I'm interested, but have no research at hand. :-)

Best
Thiemo
_______________________________________________
Wikitech-l mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
