https://bugzilla.wikimedia.org/show_bug.cgi?id=37536
Merlijn van Deen <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #1 from Merlijn van Deen <[email protected]> 2012-06-13 21:53:59 UTC ---
Simple test script, based on
http://nl.wikipedia.org/wiki/Lijst_van_alle_Radio_2_Top_2000's
(1,189,202 bytes). These were run from willow.toolserver.org.

/* ----------------------
import wikipedia
import datetime
p_get = wikipedia.Page('nl', "Lijst_van_alle_Radio_2_Top_2000's")
p_put = wikipedia.Page('nl', 'Gebruiker:Valhallasw/lange pagina')
text = p_get.get()
print len(text)
text = datetime.datetime.now().isoformat() + "\n\n" + p_get.get()
p_put.put(text)
---------------------- */

Under IPv6 (default), the output is the following:

/* --------------------
(...snip...)
>>> print len(text)
1189202
>>> text = datetime.datetime.now().isoformat() + "\n\n" + p_get.get()
>>> p_put.put(text)
Sleeping for 3.8 seconds, 2012-06-13 21:50:10
Updating page [[Gebruiker:Valhallasw/lange pagina]] via API
<urlopen error timed out>
WARNING: Could not open 'http://nl.wikipedia.org/w/api.php'. Maybe the server or your connection is down. Retrying in 1 minutes...
-------------------- */

Under IPv4 (with the patch shown below), the output is the following:

/* --------------------
(...snip...)
>>> print len(text)
1189202
>>> text = datetime.datetime.now().isoformat() + "\n\n" + p_get.get()
>>> p_put.put(text)
Sleeping for 4.0 seconds, 2012-06-13 21:48:27
Updating page [[Gebruiker:Valhallasw/lange pagina]] via API
(302, 'OK', {u'pageid': 2846006, u'title': u'Gebruiker:Valhallasw/lange pagina', u'newtimestamp': u'2012-06-13T21:49:21Z', u'result': u'Success', u'oldrevid': 31455180, u'newrevid': 31455194})
-------------------- */

The hack to test this is the following:

Index: families/wikipedia_family.py
===================================================================
--- families/wikipedia_family.py    (revision 10117)
+++ families/wikipedia_family.py    (working copy)
@@ -44,7 +44,7 @@
         if family.config.SSL_connection:
             self.langs = dict([(lang, None) for lang in self.languages_by_size])
         else:
-            self.langs = dict([(lang, '%s.wikipedia.org' % lang) for lang in self.languages_by_size])
+            self.langs = dict([(lang, '91.198.174.225') for lang in self.languages_by_size])
 
         # Override defaults
         self.namespaces[1]['ja'] = [u'ノート', u'トーク']
Index: wikipedia.py
===================================================================
--- wikipedia.py    (revision 10117)
+++ wikipedia.py    (working copy)
@@ -5437,6 +5437,7 @@
             'User-agent': useragent,
             'Content-Length': str(len(data)),
             'Content-type':contentType,
+            'Host': 'nl.wikipedia.org',
         }
         if cookies:
             headers['Cookie'] = cookies
Index: pywikibot/comms/http.py
===================================================================
--- pywikibot/comms/http.py    (revision 10117)
+++ pywikibot/comms/http.py    (working copy)
@@ -54,6 +54,7 @@
 
     headers = {
         'User-agent': useragent,
+        'Host': 'nl.wikipedia.org',
         #'Accept-Language': config.mylang,
         #'Accept-Charset': config.textfile_encoding,
         #'Keep-Alive': '115',

Note, however, that this could also be a bug in the Python HTTP stack...
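For anyone who wants to repeat the IPv4 comparison without patching the family files, the same trick (connect to an IPv4 address, but keep the original Host header) can probably be sketched in isolation. This is only a rough sketch in Python 3 using the standard library, not pywikipedia code: the helper names first_ipv4 and pin_to_address are made up for illustration, and it assumes plain http (with https the certificate check would not match a bare IP address).

```python
import socket
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def first_ipv4(host, port=80):
    """Return the first IPv4 address the resolver reports for host.

    Hypothetical helper: restricting getaddrinfo to AF_INET skips AAAA
    records, so the result is never an IPv6 address.
    """
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return infos[0][4][0]


def pin_to_address(url, address):
    """Build a Request that connects to address but sends the original Host.

    Hypothetical helper mirroring the hack in the patch: the host part of
    the URL is swapped for the IP, and the real name goes into 'Host'.
    Assumes the URL carries no user:password part.
    """
    parts = urlsplit(url)
    netloc = address if parts.port is None else "%s:%d" % (address, parts.port)
    pinned_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    request = urllib.request.Request(pinned_url)
    # urllib only fills in Host itself when the caller has not set one.
    request.add_header("Host", parts.netloc)
    return request


# No network traffic here: this only builds the request object,
# using the IPv4 address from the patch above.
request = pin_to_address("http://nl.wikipedia.org/w/api.php", "91.198.174.225")
print(request.full_url, request.get_header("Host"))
```

To actually send it, pass the request to urllib.request.urlopen() with an explicit timeout; first_ipv4('nl.wikipedia.org') can replace the hardcoded address. If the large edit goes through this way but times out with the unmodified URL on an IPv6-enabled host, that would point at the IPv6 path rather than at the bot framework.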
