[cross-posted]

----- Original Message -----
> From: "Mark A. Hershberger" <[email protected]>
> I suppose these are all linked to the parser work that Brion & co are
> currently working on, but the arrival of the new parser 6 months to a
> year or more away (http://www.mediawiki.org/wiki/Future/Parser_plan ),
> I'd like to get these sort of parser issues sorted out now.

My particular hobby horse, the last time that {wikitext-l was really
active, I was involved with it heavily} (those are nearly identical, but
not quite) was this question, which that wiki page does not seem to
address, but the Etherpad might. If not, I still think it's a question
that's fundamental to the implementation of a replacement parser, so I'm
going to ask it again so everyone's thinking about it as work progresses
down that path:

How good is good enough?

How many pages is a replacement parser allowed to break, and still be
certified?

That is: what is the *real* spec for mediawikitext? If we say "the formal
grammar", then we are *guaranteed* to break some articles. That's the
"Right Answer", from up here at 40,000 feet, where I watch from (having
the luxury of not being responsible in any way for any of this :-), but
it will involve breaking some eggs.

I bring this back up because, the last time we had this conversation, the
answer was "nope; the new parser will have to be bug-for-bug compatible
with the current one". Or something pretty close to that.

I just think this is a question -- and answer -- that people should be
slowly internalizing as we proceed down this path.

1) Formal Spec
2) Multiple Implementations
3) Test Suite

I don't think it's completely unreasonable that we might have a way to
grind articles against the current parser, and each new parser, and diff
the output. Something like that's the only way I can see that we *will*
be able to tell how close new parsers come, and on which constructs they
break (not that this means that I think The Wikipedia Corpus constitutes
a valid Test Suite :-).

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                [email protected]
Designer                     The Things I Think                RFC 2100
Ashworth & Associates     http://baylink.pitas.com  2000 Land Rover DII
St Petersburg FL USA      http://photo.imageinc.us      +1 727 647 1274

_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
