https://bugzilla.wikimedia.org/show_bug.cgi?id=27478
--- Comment #13 from Aryeh Gregor <[email protected]> 2011-02-24 18:12:08 UTC ---

(In reply to comment #11)
> You explained exactly the same error with scraping in 2009:
> [[Wikipedia:Village pump (technical)/Archive 67#Twinkle stalling]]
>
> Also bug 27672 was filed yesterday.

This suggests maybe some named entities have crept through, or some other type of well-formedness error. It would be nice if people said which exact pages failed, but it would probably be possible to figure it out. I'm guessing it's the result of messages being passed as raw HTML and sysops adding named entities to them, but it could be something else too.

The easy way out would be to restore the old hack where we serve HTML5 with an HTML 4.01 Strict doctype, which is valid HTML5 but rather confusing. This is how 1.16 works by default. That way a DTD is specified, which means that non-browser UAs will parse named entities successfully. We can consider switching back to the HTML5 doctype later.

(In reply to comment #12)
> Several problems on enwiki were caused by the difference in Sanitize::escapeId
> between HTML4 and HTML5 modes.

Hmm. This should be disable-able by setting $wgExperimentalHtmlIds to false, leaving $wgHtml5 true (which might leave well-formedness issues). A proper fix will require some more thought, though. The changes to escapeId() are really meant for headings, but we can't realistically distinguish wikilinks meant to point at headings from wikilinks meant to point at other things.

In practice, it looks like Cite is the major problem here (with the id's), and it can probably be fixed. My first inclination is to just generate arbitrary id's for named refs instead of trying to key off the names.
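To illustrate the named-entity point above, here is a minimal sketch of how a strict XML parser with no DTD available treats named versus numeric entities. It assumes the scraping tools behave like a generic XML parser; the helper name parses_as_xml and the sample fragments are hypothetical and are not taken from the pages that actually failed. It does not model the HTML 4.01 Strict doctype workaround, since Python's parser does not fetch external DTDs.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser with no DTD accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical page fragments, for illustration only.
samples = {
    "named": "<p>Twinkle&nbsp;stalling</p>",        # HTML named entity, unknown to plain XML
    "numeric": "<p>Twinkle&#160;stalling</p>",      # numeric character reference
    "predefined": "<p>Twinkle &amp; friends</p>",   # one of XML's five built-in entities
}

for label, markup in samples.items():
    print(label, parses_as_xml(markup))
```

Only the fragment with the HTML named entity is rejected, which matches the guess that a stray named entity in a raw-HTML message is enough to break a non-browser UA.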
