Today, for those of us watching from the outside, the downtime became
impossible to monitor.  One problem is that the TTLs on the DNS server
records and on the maintenance records themselves are set much too short!

Maybe you never thought you'd be inaccessible for an hour?

Any chance master records could be moved to an anycast DNS provider?

The SOA is pretty reasonable:

wikimedia.org.          86400   IN      SOA     ns0.wikimedia.org. hostmaster.wikimedia.org. 2011052410 43200 7200 1209600 600
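
For anyone who doesn't read SOA records every day: the five trailing
numbers are serial, refresh, retry, expire and minimum (the negative
caching TTL), in that order.  Here is a rough Python sketch that labels
them and prints the timers in readable units.  The helper name parse_soa
is made up for illustration, and it assumes the record has been joined
onto a single dig-style line.

```python
from datetime import timedelta

# The SOA from above, joined onto one line.
SOA_LINE = ("wikimedia.org. 86400 IN SOA ns0.wikimedia.org. "
            "hostmaster.wikimedia.org. 2011052410 43200 7200 1209600 600")


def parse_soa(line):
    """Split a dig-style SOA line into named fields (hypothetical helper)."""
    parts = line.split()
    if len(parts) != 11 or parts[3] != "SOA":
        raise ValueError("expected a single-line SOA record")
    serial, refresh, retry, expire, minimum = (int(p) for p in parts[6:])
    return {
        "name": parts[0],
        "ttl": int(parts[1]),
        "mname": parts[4],      # primary name server
        "rname": parts[5],      # contact address, with '.' in place of '@'
        "serial": serial,
        "refresh": refresh,     # how often secondaries check for changes
        "retry": retry,         # secondary retry interval after a failure
        "expire": expire,       # when secondaries give up on stale data
        "minimum": minimum,     # negative-caching TTL
    }


soa = parse_soa(SOA_LINE)
for field in ("ttl", "refresh", "retry", "expire", "minimum"):
    print(f"{field:8} {soa[field]:>8}s  {timedelta(seconds=soa[field])}")
```

Note that none of these timers extends the life of an A or CNAME record
in a resolver cache; that is governed by each record's own TTL.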

But the actual records all have a 1 hour TTL, so they expired from
caches during the downtime.  The ganglia records, for example, look OK:

ns0.wikimedia.org.      3600    IN      A       208.80.152.130
ns1.wikimedia.org.      3600    IN      A       208.80.152.142
ns2.wikimedia.org.      3600    IN      A       91.198.174.4

secure.wikimedia.org.   3600    IN      A       208.80.152.134

ganglia.wikimedia.org.  3600    IN      CNAME   spence.wikimedia.org.
spence.wikimedia.org.   3600    IN      A       208.80.152.161

But they had completely timed out in both my local cache and the Google
servers while I was looking, and I wasn't able to contact an NS anywhere
to refresh them.  Here's the last time I saw them:

ganglia.wikimedia.org.  473     IN      CNAME   spence.wikimedia.org.
spence.wikimedia.org.   473     IN      A       208.80.152.161

;; Query time: 2 msec
;; SERVER: 10.0.1.1#53(10.0.1.1)
;; WHEN: Tue May 24 10:06:13 2011

ganglia.wikimedia.org.  287     IN      CNAME   spence.wikimedia.org.
spence.wikimedia.org.   287     IN      A       208.80.152.161

;; Query time: 68 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue May 24 10:09:19 2011
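
To make the arithmetic explicit, here is a small Python sketch that
takes the WHEN timestamp and the remaining TTL from each of the two
answers above and works out when that cache would drop the records.
The function name cache_expiry is made up for illustration, and it
assumes both timestamps come from the same clock.

```python
from datetime import datetime, timedelta

# Timestamp layout used on dig's ";; WHEN:" line in the output above.
DIG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def cache_expiry(when, remaining_ttl):
    """Return the moment a cached record with this remaining TTL expires."""
    seen = datetime.strptime(when, DIG_TIME_FORMAT)
    return seen + timedelta(seconds=remaining_ttl)


# Values copied from the two dig answers above.
local = cache_expiry("Tue May 24 10:06:13 2011", 473)   # 10.0.1.1
google = cache_expiry("Tue May 24 10:09:19 2011", 287)  # 8.8.8.8

print("local cache expires: ", local)
print("8.8.8.8 cache expires:", google)

# If the records were cached with the full 3600 second TTL, this is
# roughly when they were last fetched from an authoritative server.
print("last fetched around: ", local - timedelta(seconds=3600))
```

Both answers work out to the same expiry, 10:14:06, so after that
moment neither cache had anything left to hand out.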

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
