Re: [Wikitext-l] Cunningham's exploratory parsing

Neil Kandalgaonkar Mon, 11 Jul 2011 23:40:26 -0700

I'm probably mis-remembering that... I probably was the one disappointed 
in it being a translation to HTML. Still I understand why you did it 
that way.


It's kind of amazing how we all have these projects we call parsers, and 
then they all do completely different things. :)


On 7/11/11 11:01 PM, Karl Matthias wrote:

>  I'm surprised, Neil,
> that you think Ward was disappointed with this as he was always
> supportive of our efforts and indeed introduced us to Peg and spent some
> time helping us get into writing grammars and understanding the
> pitfalls.  I'm sorry it doesn't solve the problem you guys have off the
> shelf, but hopefully it helps open some doors, or at least serves as a
> model of how a grammar can be written.
>
> If I can be of help, please just give me a shout.
>
> Cheers,
> Karl
>
>
> On Tue, Jul 12, 2011 at 4:35 AM, Neil Kandalgaonkar <[email protected]
> <mailto:[email protected]>> wrote:
>
>     Trevor & I talked with him extensively about this. BTW, around here,
>     he's just Ward. :)
>
>     He too was disappointed that his team wrote rules to directly transform
>     wikitext into HTML.
>
>     The parse-everything-in-Wikipedia thing isn't quite what it sounds like.
>     If I recall correctly it works like this:
>
>     As part of his job at About.us, he was really looking for patterns of
>     Wikitext that he could use to snag business information. One target was
>     the Infobox on Wikipedia. So, the tool was a way of cataloging the
>     various ways that people structure an Infobox template.
>
>     Because he wrote this in C, he added rules to the grammar to discard
>     information in favor of keeping a data structure of constant size.
>     That's mostly what the <<< >>> in the grammar mean. Anyway, this
>     then serves as a sampling of the majority of the structures one is
>     interested in. The more rules you write, the more "unknown" stuff falls
>     into the fixed size of structures that are unparsed. IIRC he agreed it
>     might not be so useful if you were writing a grammar for PHP or JS (I
>     assume the same is true for Python).
>
>
>
>     On 7/11/11 5:24 PM, Erik Rose wrote:
>     >  On Jul 11, 2011, at 5:17 PM, Brion Vibber wrote:
>     > > We are however producing a different sort of intermediate
>     > > structure rather than going straight to HTML output, so things
>     > > won't be an exact match (especially where we do template stuff).
>     >
>     >  Nor are we going straight to HTML, which is one reason we didn't
>     >  steal this stuff. :-)
>     >  _______________________________________________
>     >  Wikitext-l mailing list
>     >  [email protected] <mailto:[email protected]>
>     >  https://lists.wikimedia.org/mailman/listinfo/wikitext-l
>
>     --
>     Neil Kandalgaonkar  |) <[email protected]
>     <mailto:[email protected]>>
>

-- 
Neil Kandalgaonkar  |) <[email protected]>

