On Wed, 11 Apr 2012 23:33:53 -0700, Roan Kattouw <[email protected]> wrote:

On Apr 11, 2012 11:01 PM, "Antoine Musso" <[email protected]> wrote:
const EXT_URL_REGEX =
'/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&amp;?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';

ZOMG. Anyway, what I'd do if I had a MediaWiki clone handy is look through
Sanitizer.php to see if there's anything in there that handles URLs.

Roan

There's always gitweb. For the de-jure standard repo viewer, its URLs and navigation are awful (though that's not git's fault, as you can see from some of the other repo viewers that do it better), but it works.
All we've got in Sanitizer is href="" handling in attribute sanitization.
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/Sanitizer.php;h=a2459c43b5920ee414746aa5053d451d52f04861;hb=HEAD#l750
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/Sanitizer.php;h=a2459c43b5920ee414746aa5053d451d52f04861;hb=HEAD#l793

This case should really be handled by checking against wfUrlProtocols. Anything that doesn't match then gets sent through Title::newFromText, and anything that causes Title to return null/false should be ignored.
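
Roughly, that flow could look like the sketch below. It is written in Python only so that it runs standalone; in MediaWiki this would be PHP calling wfUrlProtocols() and Title::newFromText(). The protocol list, the illegal-character set, and the names parse_title and classify_link are all assumptions made up for illustration, not MediaWiki's actual values or API.

```python
import re

# Assumption: a small hand-picked subset of protocols. In MediaWiki the real
# list is configurable and is what wfUrlProtocols() builds its pattern from.
ALLOWED_PROTOCOLS = ["http://", "https://", "ftp://", "mailto:", "//"]

# Assumption: a rough set of characters rejected in page titles. This is a
# stand-in for the validation Title::newFromText performs, not a port of it.
ILLEGAL_TITLE_CHARS = re.compile(r"[<>\[\]{}|]")


def parse_title(text):
    """Hypothetical stand-in for Title::newFromText: normalized title or None."""
    title = text.strip().replace("_", " ")
    if not title or ILLEGAL_TITLE_CHARS.search(title):
        return None
    return title


def classify_link(target):
    """Return ("external", url), ("internal", title), or None to ignore."""
    target = target.strip()
    lowered = target.lower()
    # Step 1: anything starting with a whitelisted protocol is an external URL.
    if any(lowered.startswith(p) for p in ALLOWED_PROTOCOLS):
        return ("external", target)
    # Step 2: everything else is treated as a page title.
    title = parse_title(target)
    # Step 3: if title parsing fails, the input is ignored.
    if title is None:
        return None
    return ("internal", title)


if __name__ == "__main__":
    for sample in ["https://example.org/a?b=c", "Main_Page", "Bad|Title"]:
        print(sample, "->", classify_link(sample))
```

The point of doing it this way is that no hand-written URL regex is needed: something like "javascript:alert(1)" does not match a whitelisted protocol, so it just falls through and is treated as an ordinary page title instead of a URL.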

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
