On Mon, Jul 11, 2011 at 4:46 PM, Erik Rose <[email protected]> wrote:

> Say, while everybody's trying to figure out a formal grammar, have you had
> a look at Ward Cunningham's exploratory parsing kit? He gave me a demo at
> OSBridge, and it's a really handy tool. Basically, it's a web app with an
> asynchronous C backend. You paste a tentative PEG grammar into a textarea,
> and it runs through whatever corpus you want, showing you representative
> instances of how it does or does not match. He was running it against the
> full English Wikipedia on his laptop, and it took only half an hour or
> something—with results coming in as they were generated, of course.
>
> Using that, they made a PEG-and-then-some implementation of MW syntax that
> parses darn near all of Wikipedia:
> https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg. (I call it
> "PEG-and-then-some" because it does have a lot of callbacks which might
> interlock with and affect the rule matching—not sure.)
>

It is indeed dang impressive -- I expect to be stealing at least some of those grammar rules. :)

We are however producing a different sort of intermediate structure rather than going straight to HTML output, so things won't be an exact match (especially where we do template stuff).

-- brion
_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
