2011-05-03 02:38, Chad wrote:
> On Mon, May 2, 2011 at 8:28 PM, Tim Starling <[email protected]> wrote:
>> I know that there is a camp of data reusers who like to write their
>> own parsers. I think there are more people who have written a wikitext
>> parser from scratch than have contributed even a small change to the
>> MediaWiki core parser. They have a lot of influence, because they go
>> to conferences and ask for things face-to-face.
>>
>> Now that we have HipHop support, we have the ability to turn
>> MediaWiki's core parser into a fast, reusable library. The performance
>> reasons for limiting the amount of abstraction in the core parser will
>> disappear. How many wikitext parsers does the world really need?
>>
>
> People want to write their own parsers because they don't want to use PHP.
> They want to parse in C, Java, Ruby, Python, Perl, Assembly and every
> other language other than the one that it wasn't written in. There's this,
> IMHO, misplaced belief that "standardizing" the parser or markup would put
> us in a world of unicorns and rainbows where people can write their own
> parsers on a whim, just because they can. Other than "making it easier to
> integrate with my project," I don't see a need for them either (and tbh,
> the endless discussions grow tedious).

My motivation for attacking the task of creating a wikitext parser is,
aside from it being an interesting problem, a genuine concern for the
fact that such a large body of data is encoded in such a vaguely
specified format.

> I don't see any problem with keeping the parser in PHP, and as you point out
> with HipHop support on the not-too-distant horizon the complaints about
> performance with Zend will largely evaporate.

But most of the parser's work consists of running regexp pattern
matching over the article text, doesn't it? Regexp pattern matching is
implemented by native functions. Does the Zend engine have a slow
regexp implementation? I would have guessed that the main reason that
the parser is slow is the algorithm, not its implementation.

Best Regards,

Andreas Jonsson

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
