Chad wrote:
>
> To elaborate on the final point. Sometimes the parser is changed and
> it breaks output on purpose. Case in point was when Tim rewrote the
> preprocessor. Some parts of syntax were intentionally changed. You'd
> have to establish a new baseline for this new behavior at that point.
>
> This also comes down to the fact that we don't have a formal grammar
> for wikisyntax (basically it's whatever the Parser says it is at any given
> time). This makes testing the parser hard--we can only give it input and
> expected output, there's no standard to check against.
>
> Finally, I don't think we need to dump all of enwiki. It can't require that
> much content to describe the various combinations of wiki syntax...
In principle, I rather like the idea of using the entire English Wikipedia (or why limit it to that? we have plenty of other projects too) as a parser test, or at least of having the ability to do that if we want.

You see, the flip side of not having a formal grammar for wikimarkup is that we also don't have a spec sheet for it: the best description of how people actually expect the parser to behave, and what features they expect it to support, is what they're actually using it for on their wikis. And en.wikipedia is the biggest and ugliest of the bunch.

There's no way we can ever write a test suite comprehensive enough to cover every single feature, bug, quirk and coincidence that actual wiki pages and templates may have come to rely on. That's simply because for every MediaWiki coder there are dozens or hundreds of template writers and thousands of other editors. In a way, all those editors form the biggest, most thorough fuzz tester there can be.

The only problem is that it's also a rather inefficient one, even for a fuzz tester: most wiki pages exercise only a fairly small and boring set of parser features. But at least, if one were to, say, run a random sample of a few thousand Wikipedia pages through the parser and observe no unexpected changes in the output, one could start to make some statistical predictions about how many of the remaining pages one could at worst expect to break.

The real problem, as noted elsewhere in the thread, is of course filtering the unexpected changes from any expected ones. A partial solution could be having the test implementation extract the changes -- we conveniently have a word-level diff implementation available already -- and combine any duplicates. Another, complementary approach would be to allow the person running the tests to postprocess the two outputs before they're compared, so as to try and eliminate any expected differences.

Of course, this would require some significant extra effort on the part of that person, beyond just typing "php runSomeTests.php" and hitting enter, but then again, thoroughly analyzing the effects of a major parser change is a nontrivial exercise anyway, no matter what. And for things that _shouldn't_ cause any changes to the parser output, it really could be just as easy, in principle at least, as running parserTests currently is.

--
Ilmari Karonen
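
P.S. To make the filtering idea a bit more concrete, here's a rough, untested sketch of what such a sampled comparison might look like. None of the names below are existing MediaWiki functions -- $parseOld, $parseNew, the page sample and the normalization callback are all placeholders for illustration -- and the word-level diff is deliberately crude; a real run would plug in the diff code we already have.

<?php
// Rough sketch only: render a sample of pages with the old and the new
// parser, let the tester normalise both outputs to hide *expected*
// differences, and collect the distinct word-level changes that remain.

// Deliberately crude word-level "diff": report which words appear in one
// rendering but not the other.  A real run would use the proper diff code.
function diffWords( $old, $new ) {
	$a = preg_split( '/\s+/', trim( $old ) );
	$b = preg_split( '/\s+/', trim( $new ) );
	return array(
		'only_in_old' => array_values( array_diff( $a, $b ) ),
		'only_in_new' => array_values( array_diff( $b, $a ) ),
	);
}

// $titles: sample of page titles; $parseOld/$parseNew: callbacks returning
// the rendered output for a title; $normalize: optional callback that strips
// differences the tester already expects (comments, timestamps, ...).
function runSample( $titles, $parseOld, $parseNew, $normalize = null ) {
	$changes = array();   // serialised diff => list of pages showing it
	foreach ( $titles as $title ) {
		$old = call_user_func( $parseOld, $title );
		$new = call_user_func( $parseNew, $title );
		if ( $normalize !== null ) {
			$old = call_user_func( $normalize, $old );
			$new = call_user_func( $normalize, $new );
		}
		if ( $old === $new ) {
			continue;     // no unexpected change on this page
		}
		// Combine duplicates: pages producing the same diff are grouped.
		$key = serialize( diffWords( $old, $new ) );
		$changes[$key][] = $title;
	}
	return $changes;
}

And if a run over a few thousand randomly chosen pages comes back with an empty result, the usual rule-of-three estimate would put a 95% upper bound of about 3/n on the breakage rate among the remaining pages -- roughly one page in a thousand for a 3,000-page sample -- assuming the sample really is random.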
