On Thu, Nov 5, 2009 at 4:01 PM, Platonides <[email protected]> wrote:
> That's no reason not to warn them ahead of time. Crossposting the notice
> to wikibots-l and pywikipedia-l.
>
> A change like wgHtml5=true seems a good time to change the legacy code
> to use the api.
>
> Please report anything doable with screen-scrapping but not with
> mediawiki api.

I don't expect any bots to break, even if they screen-scrape. The actual changes to HTML output are minimal -- a different doctype, a few useless elements and attributes dropped. Unless your script relies on the fact that all <script> elements have type="text/javascript" or something similarly crazy, you should be fine even if you screen-scrape.

Switching $wgWellFormedXml to false would certainly break screen-scrapers that rely on XML parsing libraries. However, I don't propose we do this right now -- I'm only suggesting we output HTML5, not cease outputting well-formed XML. Eventually, perhaps, we should consider dropping XML output, especially once HTML5 parsing libraries are more readily available, but not in the foreseeable future.

The only breakage I can foresee would be caused by the switch from "almost standards" to "standards" mode. But the difference is pretty minor (far more minor than quirks vs. almost standards), and I don't expect anything to break seriously, or in a way that can't be quickly fixed. No trunk users (TranslateWiki, etc.) have reported any outstanding problems that I know of.

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
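
To illustrate the point about screen-scrapers, here is a rough Python sketch, not taken from any real bot. The sample page, the /example.js path, and both function names are made up for illustration. It assumes the page keeps being well-formed XML with an HTML5 doctype, and shows that an XML parser still accepts it, while a scraper that insists on type="text/javascript" stops finding scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: HTML5 doctype, still well-formed XML,
# and a <script> element without a type attribute.
PAGE = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<title>Example</title>
<script src="/example.js"></script>
</head>
<body><p id="content">Hello</p></body>
</html>"""

XHTML = "{http://www.w3.org/1999/xhtml}"


def script_sources(markup):
    """Collect src from every <script>, whatever its attributes."""
    root = ET.fromstring(markup)
    return [el.get("src") for el in root.iter(XHTML + "script") if el.get("src")]


def fragile_script_sources(markup):
    """Same, but only counts scripts that carry type="text/javascript"."""
    root = ET.fromstring(markup)
    return [
        el.get("src")
        for el in root.iter(XHTML + "script")
        if el.get("type") == "text/javascript"
    ]


if __name__ == "__main__":
    print(script_sources(PAGE))
    print(fragile_script_sources(PAGE))
```

If the output ever stopped being well-formed (say, an unclosed <br>), ET.fromstring would raise xml.etree.ElementTree.ParseError, which is the kind of breakage that turning off $wgWellFormedXml would cause for scrapers built on XML libraries.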
