----- Original Message -----
> From: "Mihály Héder" <[email protected]>
> By following this list I hope I gathered how they plan to tackle this
> really hard problem:
> - a functional decomposition of what the current parser does into a
>   separate tokenizer, an AST (aka WOM, or now just DOM) builder, and a
>   serializer. AST building might be further decomposed into the builder
>   part and error handling according to the HTML specs.
> - in architecture terms, all of this will be a separate component, unlike
>   the old PHP parser, which is really hard to extract from the rest of
>   the code.
> In this setup there is hope that the tokenizing task can be specified
> with a set of rules, thus effectively creating a wikitext tokenizing
> standard (already a great leap forward!). Then the really custom stuff
> (because wikitext still lacks a formal grammar) can be encapsulated in
> AST building.

As I noted in a reply I wrote on this thread a few minutes ago (but it was
kinda buried): there are between 4 and 7 projects, at varying stages of
seriousness, already in the works, some of which have posted to this list
one or more times. At least a couple of them had as a serious goal producing
a formalized, architecturally cleaner parser that could be dropped into
MediaWiki. The framing of your reply suggests that you needed to know that
and didn't.

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                       [email protected]
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA      http://photo.imageinc.us             +1 727 647 1274

_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
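[Editor's note: the tokenizer / tree-builder / serializer decomposition quoted above can be made concrete with a small sketch. This is purely illustrative, covering only a toy wikitext subset ('''bold''' and ''italic''); none of the names or rules below come from the actual MediaWiki parser or any of the projects mentioned in the thread.]

```python
import re

def tokenize(src):
    """Stage 1: turn wikitext into a flat token stream via declarative rules.
    A rule table like this is what a 'wikitext tokenizing standard' could
    specify; the three rules here are a hypothetical, minimal example."""
    rules = [
        ("BOLD", r"'''"),     # bold marker
        ("ITALIC", r"''"),    # italic marker
        ("TEXT", r"[^']+|'"), # everything else, including stray apostrophes
    ]
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in rules))
    return [(m.lastgroup, m.group()) for m in pattern.finditer(src)]

def build_tree(tokens):
    """Stage 2: fold the token stream into a simple DOM-like tree.
    This is where the custom, non-grammar-friendly logic would live."""
    root = {"tag": "root", "children": []}
    stack = [root]
    for kind, text in tokens:
        if kind in ("BOLD", "ITALIC"):
            tag = "b" if kind == "BOLD" else "i"
            if stack[-1]["tag"] == tag:
                stack.pop()               # marker closes the open node
            else:
                node = {"tag": tag, "children": []}
                stack[-1]["children"].append(node)
                stack.append(node)        # marker opens a new node
        else:
            stack[-1]["children"].append(text)
    return root

def serialize(node):
    """Stage 3: walk the tree and emit HTML."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(c) for c in node["children"])
    if node["tag"] == "root":
        return inner
    return f"<{node['tag']}>{inner}</{node['tag']}>"
```

Because each stage only consumes the previous stage's output, the three pieces can live in a standalone component and be tested (or replaced) independently, e.g. `serialize(build_tree(tokenize("a '''bold''' word")))` yields `a <b>bold</b> word`.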
