"Krinkle" posted a comment on MediaWiki.r110900. URL: http://www.mediawiki.org/wiki/Special:Code/MediaWiki/110900#c30574
Commit summary for MediaWiki.r110900: Revert r110321: introduces an XSS vulnerability because FormatJson::encode() does not prevent the termination of CDATA sections when JavaScript is embedded in HTML. Krinkle's comment: Aggregating from #mediawiki: http://toolserver.org/~mwbot/logs/%23mediawiki/20120208.txt <pre> [00:05:00] <Krinkle> TimStarling: Eh ? that looks worrying? Is this about Services_JSON or php's native json_encode ? [00:05:15] <TimStarling> both [00:06:02] <TimStarling> the issue is documented in Xml::escapeJsString [00:07:37] <Krinkle> TimStarling: So FormatJson::encode should only be used for complete javascript responses (rather than partial output in a <script> tag). [00:07:57] <Krinkle> TimStarling: In that case, do we need FormatJson::encode at all ? Or is it better performance wise when CDATA is not a concern (e.g. in a load.php response) [00:08:33] <TimStarling> both should be secure [00:08:49] <TimStarling> we need multiple lines of defense [00:10:35] <Krinkle> TimStarling: I mean if our (seemingly much simpler) json encoder in Xml.php is better and safer than Services_JSON, we might as well remove all body of FormatJson::encode and re-route it to Xml methods (instead of the other way around). [00:12:40] <TimStarling> PHP 5.3 has a JSON_HEX_TAG option to json_encode() which addresses this issue, as I noted on that bug report [00:13:10] <TimStarling> but yes, it doesn't make sense to have two pure PHP implementations, one secure and one subtly flawed [00:13:24] <Krinkle> Yeah [00:13:40] <TimStarling> I mean, you're one of our best frontend developers, if you don't know about this issue, there's a pretty high chance that it will come up again [00:14:13] <TimStarling> I should review all direct calls to json_encode and FormatJson::encode(), maybe there are vulnerabilities already [00:14:22] <Krinkle> TimStarling: Well, I obviously knew about < and > escapement being a problem when outputted in <script> [00:14:37] <Krinkle> TimStarling: I wrongly assumed it would be taken care of by the encoder we use all over the place [00:15:27] <Krinkle> TimStarling: I know there are no direct alls to json_encode in our codebase (other than FormatJson::encode itself), I ack'ed that before [00:15:55] <Krinkle> There most likely are calls to FormatJson::encode [00:17:01] <Krinkle> TimStarling: So we should raise the requirements before handing off to json_encode to include support for JSON_HEX_TAG, and pass it. And then either fix the php implementation of Services_JSON or let it call Xml.php from FormatJson::encode [00:17:17] <TimStarling> yes [00:17:21] <Krinkle> aight [00:17:21] <TimStarling> most of the code in Services_JSON deals with UTF-8 to UTF-16 conversion [00:18:10] <TimStarling> which we don't need because we use a UTF-8 output format, so our strings can have literal unicode characters [00:18:29] <Krinkle> Aside from using Services_JSON or the Xml-class methods, I'd prefer not having this Xml.php either way, but that's another issue. [00:18:29] <TimStarling> if you were embedding unicode text in a latin-1 HTML document, it would matter [00:18:47] <TimStarling> it can move, I don't mind [00:19:13] <TimStarling> maybe things like Xml::encodeJsCall() should move too [00:21:11] <Krinkle> TimStarling: I'm not too familiar with the encoding backgrounds. I noticed Xml uses \\x3c where JSON_HEX_TAG uses \u003C, that's just the same, right ? [00:21:25] <TimStarling> yeah, it's the same [00:21:27] <Krinkle> k [00:22:51] <Krinkle> TimStarling: Oh, Services_JSON of course also contains decode() which Xml.php does not [00:26:36] <TimStarling> the UTF-8 handling in Services_JSON::decode() is wrong and scary [00:27:41] <TimStarling> it allows overshort characters to break the syntax </pre> _______________________________________________ MediaWiki-CodeReview mailing list mediawiki-coderev...@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-codereview