--- On Wed, 8/12/09, Roan Kattouw <roan.katt...@gmail.com> wrote:

> I read this paragraph first, then read the paragraph above and
> couldn't help saying "WHAT?!?". Using a huge set of pages is a poor
> replacement for decent tests.

I am not proposing that the CPRT be a substitute for "decent tests." We still
need a good set of tests for the whole MW product (not just the parser). Nor
would I recommend making a change to the parser and then immediately running
the CPRT. Any developer who isn't masochistic would first run the existing
parserTests and ensure they pass. Then you would probably want to run the
modified DumpHTML against a small random selection of pages in the WP DB. Only
if the change passes those tests would you then run the CPRT for final
assurance.
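
The ordering described above (cheap checks first, the expensive full run last) can be sketched roughly as follows. This is only an illustration under stated assumptions: `run_stages` and the stage callables are hypothetical stand-ins, not existing MediaWiki tooling, and each stage is assumed to report success as a boolean.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a (name, check) pair; check() returns True when the stage passes.
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> Optional[str]:
    """Run stages in order and stop at the first failure.

    Returns the name of the failing stage, or None if every stage passed.
    Later (more expensive) stages are never started once one has failed.
    """
    for name, check in stages:
        if not check():
            return name
    return None


if __name__ == "__main__":
    # Hypothetical placeholders: real stages would invoke parserTests,
    # the modified DumpHTML on a random sample, and finally the CPRT.
    stages = [
        ("parserTests", lambda: True),
        ("DumpHTML on random sample", lambda: True),
        ("CPRT (full run)", lambda: True),
    ]
    failed = run_stages(stages)
    print("all stages passed" if failed is None else f"stopped at: {failed}")
```

The point of the early return is that the multi-gigabyte run is only started once the quick checks have already passed.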

The CPRT I am proposing is about as good a test of the parser as I can think
of. If a change to the parser passes it using the Wikipedia database (currently
5 GB), then I would say that, for all practical purposes, the change does not
regress the parser.
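
As a rough sketch of what "passes" could mean here: render every page with the unmodified parser, record a digest of the output, then render again with the modified parser and list the pages whose output changed. Everything below is an assumption made for illustration; `build_baseline`, `find_regressions`, and the `render` callables are hypothetical names, not part of the CPRT proposal or of MediaWiki, and storing digests rather than full HTML is just one way to keep the baseline small.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Tuple

Page = Tuple[str, str]  # (title, wikitext)
Renderer = Callable[[str], str]  # wikitext -> HTML (hypothetical parser hook)


def digest(html: str) -> str:
    """Return a SHA-1 hex digest of the rendered output."""
    return hashlib.sha1(html.encode("utf-8")).hexdigest()


def build_baseline(pages: Iterable[Page], render: Renderer) -> Dict[str, str]:
    """Render each page with the unmodified parser and record its digest."""
    return {title: digest(render(text)) for title, text in pages}


def find_regressions(
    pages: Iterable[Page], render: Renderer, baseline: Dict[str, str]
) -> List[str]:
    """Return titles whose output under `render` differs from the baseline.

    Pages missing from the baseline are reported as differing.
    """
    return [
        title
        for title, text in pages
        if baseline.get(title) != digest(render(text))
    ]
```

An empty result from `find_regressions` would correspond to the change passing; a non-empty one gives the list of pages to inspect by hand.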

> Also, how would you handle intentional changes to the parser output,
> especially when they're non-trivial?

I don't understand this point. Would you elaborate?

Dan

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l