I'm probably misremembering that... I was probably the one disappointed in it being a translation to HTML. Still, I understand why you did it that way.

It's kind of amazing how we all have these projects we call parsers, and then they all do completely different things. :)

On 7/11/11 11:01 PM, Karl Matthias wrote:
> I'm surprised, Neil, that you think Ward was disappointed with this, as he
> was always supportive of our efforts and indeed introduced us to Peg and
> spent some time helping us get into writing grammars and understanding the
> pitfalls. I'm sorry it doesn't solve the problem you guys have off the
> shelf, but hopefully it helps open some doors, or at least serves as a
> model of how a grammar can be written.
>
> If I can be of help, please just give me a shout.
>
> Cheers,
> Karl
>
> On Tue, Jul 12, 2011 at 4:35 AM, Neil Kandalgaonkar <[email protected]> wrote:
>
> > Trevor & I talked with him extensively about this. BTW, around here,
> > he's just Ward. :)
> >
> > He too was disappointed that his team wrote rules to directly transform
> > wikitext into HTML.
> >
> > The parse-everything-in-Wikipedia thing isn't quite what it sounds like.
> > If I recall correctly it works like this:
> >
> > As part of his job at About.us, he was really looking for patterns of
> > Wikitext that he could use to snag business information. One target was
> > the Infobox on Wikipedia. So, the tool was a way of cataloging the
> > various ways that people structure an Infobox template.
> >
> > Because he wrote this in C, he added rules to the grammar to discard
> > information in favor of keeping a data structure of constant size.
> > That's mostly what the <<< >>> in the grammar mean. Anyway, this
> > then serves as a sampling of the majority of the structures one is
> > interested in. The more rules you write, the more "unknown" stuff falls
> > into the fixed size of structures that are unparsed. IIRC he agreed it
> > might not be so useful if you were writing a grammar for PHP or JS (I
> > assume the same is true for Python).
> >
> > On 7/11/11 5:24 PM, Erik Rose wrote:
> > > On Jul 11, 2011, at 5:17 PM, Brion Vibber wrote:
> > >
> > > > We are however producing a different sort of intermediate
> > > > structure rather than going straight to HTML output, so things won't
> > > > be an exact match (especially where we do template stuff).
> > >
> > > Nor are we going straight to HTML, which is one reason we didn't
> > > steal this stuff. :-)
> > >
> > > _______________________________________________
> > > Wikitext-l mailing list
> > > [email protected]
> > > https://lists.wikimedia.org/mailman/listinfo/wikitext-l
> >
> > --
> > Neil Kandalgaonkar |) <[email protected]>

--
Neil Kandalgaonkar |) <[email protected]>
