Thanks for the info! I have been unable to figure out the right place to interact with the parser, though. As far as I can see, there are no hooks between calling the parser and calling linksupdate, and the hooks which are internal to the parser have no knowledge of what they are parsing: the main wikitext or some random interface message. That's fine for extensions which use tracking categories to trace when something is broken, but I am trying to find out when something is missing, and that logic would be triggered on every interface message, section preview and so on.
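To illustrate the problem, here is a minimal sketch. It assumes stand-in classes (StubParser, StubParserOutput) in place of the real MediaWiki ones, and the hasLicenseInfo() check, the CSS class it looks for and the category name are all made up for the example. The handler is only shaped like a ParserAfterTidy listener; it is not the actual extension code.

```php
<?php
// Stand-ins for illustration only; these are NOT the real MediaWiki classes.
class StubParserOutput {
    /** @var array category name => sort key */
    public $categories = [];

    public function addCategory( $name, $sortKey = '' ) {
        $this->categories[$name] = $sortKey;
    }
}

class StubParser {
    private $output;

    public function __construct() {
        $this->output = new StubParserOutput();
    }

    public function getOutput() {
        return $this->output;
    }
}

// Hypothetical test: does the rendered HTML carry a license marker?
function hasLicenseInfo( $html ) {
    return strpos( $html, 'class="licensetpl"' ) !== false;
}

// Shaped like a ParserAfterTidy handler: it receives the parser and the
// text, and nothing here says whether the text is the main page content.
function onParserAfterTidy( $parser, &$text ) {
    if ( !hasLicenseInfo( $text ) ) {
        // "Missing" check: fires for anything without the marker.
        $parser->getOutput()->addCategory( 'Files_without_license_info' );
    }
    return true;
}

// A file page that really lacks license information...
$pageParser = new StubParser();
$pageHtml = '<p>A file page with no license template</p>';
onParserAfterTidy( $pageParser, $pageHtml );

// ...and an unrelated interface message take the same code path.
$messageParser = new StubParser();
$messageHtml = '<p>Some interface message</p>';
onParserAfterTidy( $messageParser, $messageHtml );
```

In this sketch both outputs end up with the category, because the handler sees only the text. A "broken" check does not have this problem, since it only fires when the offending markup is actually present in whatever is being parsed.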
Short of adding a new hook (to the end of Content::getParserOutput maybe), I don't see how this could be done.

On Mon, Sep 15, 2014 at 3:03 AM, Brian Wolff <[email protected]> wrote:

> On 9/14/14, Gergo Tisza <[email protected]> wrote:
> > Hi,
> >
> > I would like to flag a large number of wiki pages based on whether their
> > HTML passes a certain test, so that failing pages can be easily listed and
> > counted. The flags should adapt when pages are created or modified. (The
> > specific use case is collecting file pages which do not have
> > machine-readable author and license information embedded.)
> >
> > I have been thinking of adding such pages to a maintenance category from a
> > parser hook (the test logic is already part of the imageinfo/extmetadata
> > API and would be easy to reuse), is that a good way to do this? If so,
> > what's the best way to achieve it? Is it OK to just add categories as
> > needed via $parser->getOutput()->addCategory() or can that mess up internal
> > state such as the categorylinks table?
> >
> > Alternatively, the Cite extension just parses and appends a message to the
> > end of the text on ParserBeforeTidy when it encounters an error, and the
> > message contains wikitext to include a category. That seems like a clever
> > way of maintaining flexibility so it is easy to change the category name or
> > add extra text for a call to action without any need for a code change. Is
> > that approach safe/cheap?
> >
> > thanks
> > Gergő
> > _______________________________________________
> > Wikitech-l mailing list
> > [email protected]
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
> There are two ways that this is usually done: either the page_props table
> or tracking categories. Provided the hook you use runs before
> linksupdate (which is any hook in the parser), you should be fine
> adding such things.
>
> To add a page property, you would do something like
>
>     $parser->getOutput()->setProperty( 'prop name', 'optionally some extra arbitrary data' );
>
> Pages can be found via Special:PagesWithProp or a direct db query.
>
> To add a tracking category:
>
>     $parser->addTrackingCategory( 'tracking cat name' );
>
> You also have to define a message for the tracking category name and a
> description message, and add it to $wgTrackingCategories. See the code
> docs for Parser::addTrackingCategory and $wgTrackingCategories.
>
> Generally page props are used for obscure things that a user is
> unlikely to care about, or for cases where you need special cache
> invalidation behaviour on change (there's special support for that
> with $wgPagePropLinkInvalidations), whereas tracking categories are more
> for properties the user is interested in. It's also possible to make the
> tracking category be off by default until users turn it "on" by
> editing a MediaWiki namespace page, by making the category name default
> to '-'.
>
> In the use case you describe I think a tracking category is more suited.
>
> --bawolff
