On Mon, Jul 11, 2011 at 4:46 PM, Erik Rose <[email protected]> wrote:

> Say, while everybody's trying to figure out a formal grammar, have you had
> a look at Ward Cunningham's exploratory parsing kit? He gave me a demo at
> OSBridge, and it's a really handy tool. Basically, it's a web app with an
> asynchronous C backend. You paste a tentative PEG grammar into a textarea,
> and it runs through whatever corpus you want, showing you representative
> instances of how it does or does not match. He was running it against the
> full English Wikipedia on his laptop, and it took only half an hour or
> something—with results coming in as they were generated, of course.
>
> Using that, they made a PEG-and-then-some implementation of MW syntax that
> parses darn near all of Wikipedia:
> https://github.com/AboutUs/kiwi/blob/master/src/syntax.leg. (I call it
> "PEG-and-then-some" because it does have a lot of callbacks which might
> interlock with and affect the rule matching—not sure.)
>

It is indeed dang impressive -- I expect to be stealing at least some of
those grammar rules. :)

We are, however, producing an intermediate structure rather than going
straight to HTML output, so things won't be an exact match (especially
where we handle templates).

-- brion
_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
