Hello all,

<quote name="Ori Livneh" date="2013-04-23" time="15:23:49 -0700">
> On Tuesday, April 23, 2013 at 1:06 PM, Greg Grossmeier wrote:
> >
> > On the [[How to deploy code]] wikitech page, there is a section on
> > Testing your live code:
> > https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Test_your_code_live
> >
> > That's a pretty basic overview of it and it could be greatly improved
> > with information like:
> > * How to monitor specific parts of the cluster that are relevant to what
> >   you deployed
> > * What general monitoring should be looked at after you deploy
>
> MediaWiki exceptions / fatals are plotted in Ganglia now, though somewhat
> awkwardly under node vanadium.eqiad.wmnet (where they're getting tallied)
> rather than the node on which the error originated. I think the way it's done
> now deserves another thought (maybe this ought to go in graphite, instead?),
> but at the same time it is sufficiently intelligible to be of _some_ use, I
> think.
>
> The most useful view is the last two hours' worth of exceptions and misc.
> fatals (evergreen link):
>
> http://ganglia.wikimedia.org/latest/graph.php?r=2hr&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal%7Cexception&gtype=stack&glegend=show&aggregate=1&embed=1
>
> (The m is 'milli', so the current peaks correspond to one exception / fatal
> every 6-10 seconds.)
>
> I'll add it to the post-deployment instructions if people find it useful.
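
For anyone reading that graph, the 'm' arithmetic in Ori's note can be sketched as below. This is only an illustration: `seconds_between_errors` and `SI_PREFIXES` are hypothetical names, not part of Ganglia, and the sample labels 100m and 167m are values back-calculated from the "every 6-10 seconds" remark, not readings taken from the graph.

```python
# Hypothetical helper (not a Ganglia API): turn a rate label such as "150m"
# (errors per second, SI-prefixed) into the average seconds between errors.
SI_PREFIXES = {"m": 1e-3, "": 1.0, "k": 1e3}


def seconds_between_errors(label: str) -> float:
    # Split a trailing SI prefix letter, if any, from the numeric part.
    suffix = label[-1] if label[-1].isalpha() else ""
    number = label[: len(label) - len(suffix)]
    rate = float(number) * SI_PREFIXES[suffix]  # errors per second
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate


# Assumed sample labels, chosen to match the 6-10 second range in the note.
for label in ("100m", "167m"):
    interval = seconds_between_errors(label)
    print(f"{label} errors/sec -> one error every {interval:.1f} s")
```

So a peak labelled around 100m to 167m on the y-axis would mean roughly one exception or fatal every 10 to 6 seconds.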
Ori added that, and I believe S Page added some more info to that section:

https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Test_and_monitor_your_live_code

How does it look? Does anyone here have corrections or additions that aren't
represented there yet?

Thanks,

Greg

-- 
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| identi.ca: @greg                A18D 1138 8E47 FAC8 1C7D |
