"Jay Ashworth" <[email protected]> wrote in message news:[email protected]... > ----- Original Message ----- > > The thing you want expanded, George, is "Last Five Percent"; I refer > there to (I think it was) David Gerard's comment earlier that the > first 95% of wikisyntax fits reasonably well into current parser > building frameworks, and the last 5% causes well adjusted programmers > to consider heroin... or something like that. :-) > > The argument advanced was always "there's too much usage of that ugly > stuff to consider Just Not Supporting It" and I always asked whether > anyone with larger computers than me had ever extracted actual statistics, > and no one ever answered.
This is a key point. Every other parser discussion has foundered *before* the stage of saying "here is a working parser which does *something* interesting, now we can see how it behaves". Everyone before has got to that last 5% and said "I can't make this work; I can do *this* which is kinda similar, but when you combine it with *this* and *that* and *the other* we're now in a totally different set of edge cases". And stopped there.

Obviously it's impossible to quantify all the edge cases of the current parser *because of* the lack of a schema, but until we actually get a new parser churning through real wikitext, we have no way of saying whether those edge cases make up 5%, 0.5% or 50% of the corpus that's out there.

--HM
